実直苦闘 シンセリティとストラグル

On Model Theory: The Thing So Called Model

Alright, this series is a set of notes on Model Theory, which is based on the book A Course in Model Theory. Basically, there will be an article for each chapter, but this is just my plan, since I don’t know how dense the later chapters will be.

And here is the first article, about some basic things; most of them are the knowledge that one should have learnt in First Order Logic. Actually, I have another article, not published though, which concentrates on FOL, but since there is something more in the context of Model Theory, I’m going to write here once more.

Language and Structure

Let’s talk about a metaphor. Is the sentence “あ、天気がいいかな” true? No matter whether the answer is yes or no, such a judgment must go through 2 steps. Firstly, one needs to know what this sentence means, which is to say, one needs to understand this language and find the signified of each word. Moreover, one needs to learn the grammar to understand what it states, rather than what it talks about. Secondly, one needs to check the real situation: is the weather good? After that, one can make a judgment.

That is what we mainly think about when we deal with mathematical logic. We try to understand what a formula means, both the signified of symbols and the rules of deduction. And then we give a language a designated set (one should realize here that model theory is based on set theory), which we call a structure, and check the truth of a formula.

So, what is a language, or more precisely, a first order language?

First of all, we need a bunch of logic symbols, which don’t vary as the language changes, because the things we care about are mainly in mathematics. They are some variables (use any symbols you like; sometimes viv_i, sometimes x,yx,y), some operations like ¬,,,,\neg,\wedge,\vee,\rightarrow,\leftrightarrow, and two quantifier symbols ,\exists,\forall. One who has learnt propositional logic may know that they’re not all necessary since they can represent each other by some combinations. Actually, we can also just “read” from it. For example, \exists can be replaced by ¬¬\neg\forall\neg. But for the sake of convenience, we just introduce all of them here and do not analyze further until necessary.

After that, some symbols can be introduced for use in different categories. They are some constants, some functions, and some relations. Constants here are used to represent the significant elements we want in a theory, such as the identity in group theory. Functions here are used to represent the operations in a mathematical system, like ++ or ×\times. A relation is the most important thing here, which is indispensable, used to construct statements, and is also called a predicate. Sometimes, a special relation is explicitly written as ==, which always indicates equality.

Once a language is given, we can talk about structures on it. Basically, a structure is an assignment based on a set AA (we call it the universe of the structure), which interprets each constant as an element of AA, each nn-ary function as a function from AnA^n to AA, and each nn-ary relation as an nn-ary relation on AA. Usually, we use the fraktur style of a letter to represent the assignment of a set; for example, A\mathfrak{A} is the structure whose universe is AA. And we write aAa^{\mathfrak{A}} when aa is a constant symbol, to denote its interpretation.

Wait, what have we missed? Ah, the grammar; we still need grammar to construct sentences. We call each variable and each constant a term, and when we fill a function with terms, the result is also a term. And with terms, we call a relation filled by terms an (atomic) formula, and the combination of formulas with logic symbols is also a formula. With some variable xx and a formula φ\varphi, xφ\exists x\,\varphi and xφ\forall x\,\varphi are both formulas. In particular, if all the variables in a formula are constrained by quantifiers, then we call it a sentence.

This may seem to be a mess, so let’s consider an example. L=(+,)L=(+,\le) is a language with one relation, one function and no constants. A=(N,+,)\mathfrak{A}=(\mathbb{N},+,\le) is a structure of it. x+yx+y is a term. xyx\le y and x(xy)\forall x\,(x\le y) are both formulas. xy(xx+y)\forall x\,\forall y\,(x\le x+y) is a sentence.

After these are defined, we can talk about “truth”. But, given an arbitrary formula φ\varphi, we still can’t say if it’s true, because there are some variables not interpreted. So we need an assignment on variables. Suppose φ\varphi is a formula with variables x1,,xnx_1,\cdots,x_n, then we write φ(x1,,xn)\varphi(x_1,\cdots,x_n). For a1,,anAa_1,\cdots,a_n\in A, we write φ(a1,,an)\varphi(a_1,\cdots,a_n) to denote replacing the variable xix_i with aia_i if xix_i is not under the scope of some quantifier. For example, φ(x,y)=x(x=y)\varphi(x,y)=\exists x\,(x=y), then φ(1,2)=x(x=2)\varphi(1,2)=\exists x\,(x=2). Then, we can define the notation Aφ(a1,,an)\mathfrak{A}\models\varphi(a_1,\cdots,a_n) for some φ(x1,,xn)\varphi(x_1,\cdots,x_n) when

  1. (a1,,an)RA(a_1,\cdots,a_n)\in R^{\mathfrak{A}} if φ=R\varphi = R is an atomic formula,
  2. Aφ1\mathfrak{A}\models\varphi_1 and Aφ2\mathfrak{A}\models\varphi_2 if φ=φ1φ2\varphi = \varphi_1\wedge\varphi_2,
  3. Aφ1\mathfrak{A}\models\varphi_1 or Aφ2\mathfrak{A}\models\varphi_2 if φ=φ1φ2\varphi = \varphi_1\vee\varphi_2,
  4. A⊭φ1\mathfrak{A}\not\models\varphi_1 if φ=¬φ1\varphi = \neg\varphi_1,
  5. A⊭φ1\mathfrak{A}\not\models\varphi_1 or Aφ2\mathfrak{A}\models\varphi_2 if φ=φ1φ2\varphi = \varphi_1\rightarrow\varphi_2,
  6. both Aφ1\mathfrak{A}\models\varphi_1 and Aφ2\mathfrak{A}\models\varphi_2, or both A⊭φ1\mathfrak{A}\not\models\varphi_1 and A⊭φ2\mathfrak{A}\not\models\varphi_2, if φ=φ1φ2\varphi = \varphi_1\leftrightarrow\varphi_2,
  7. there exists some aAa\in A such that Aφ1(a1,,ai1,a,ai+1,,an)\mathfrak{A}\models\varphi_1(a_1,\cdots,a_{i-1},a,a_{i+1},\cdots,a_n) if φ=xiφ1\varphi = \exists x_i\,\varphi_1,
  8. for any aAa\in A, Aφ1(a1,,ai1,a,ai+1,,an)\mathfrak{A}\models\varphi_1(a_1,\cdots,a_{i-1},a,a_{i+1},\cdots,a_n) if φ=xiφ1\varphi = \forall x_i\,\varphi_1. With 7 and 8, we shall see that the assignment of variables doesn’t affect the truth of a formula if such a variable is in the scope of some quantifier. We call such variables bounded variables; otherwise we say they are free. Therefore, a formula without any free variable has a unique truth value. We call such a formula a sentence. And if a sentence is true under a structure, we say the sentence is satisfied by the structure, and we call the structure a model of the sentence.

Language and Language, Structure and Structure

One may find some relationship between languages and languages, structures and structures. For example, a language of the ring seems to contain the symbols of the group. Given a language of the group, a group is a set and its subgroup seems to be another structure for the language.

Given a language L\mathcal{L}, we say a subset of L\mathcal{L} is a sublanguage, which also forms a language. For example, (0,)(0,\le) is a language, and ()(\le) is a sublanguage of it. Clearly, a structure of some language is naturally a structure of its sublanguage, which can be obtained easily by forgetting the interpretation of those symbols that are not used.

We can expand a language manually with a structure. Consider a language L\mathcal{L} and its structure A\mathfrak{A}; if BB is a subset of AA, then we can add the elements of BB into L\mathcal{L} as new constants and interpret them as exactly what they are in AA. Such a new language is denoted by L(B)\mathcal{L}(B), and a new structure is naturally given by AB=(A,b)bB\mathfrak{A}_B=(\mathfrak{A},b)_{b\in B}.

Between two L\mathcal{L}-structures A\mathfrak{A} and B\mathfrak{B}, if there is a map h:ABh:A\rightarrow B such that

  1. h(cA)=cBh(c^{\mathfrak{A}})=c^{\mathfrak{B}}, for cc a L\mathcal{L}-constant,
  2. h(fA(a1,,an))=fB(h(a1),,h(an))h\big(f^{\mathfrak{A}}(a_1,\cdots,a_n)\big)=f^{\mathfrak{B}}\big(h(a_1),\cdots,h(a_n)\big), for ff a L\mathcal{L}-function,
  3. R(a1,,an)    R(h(a1),,h(an))R(a_1,\cdots,a_n)\;\Rightarrow\; R\big(h(a_1),\cdots,h(a_n)\big), for RR a L\mathcal{L}-relation, then we call hh a homomorphism from A\mathfrak{A} to B\mathfrak{B}. If hh is an injection and R(a1,,an)    R(h(a1),,h(an))R(a_1,\cdots,a_n)\;\Leftrightarrow\; R\big(h(a_1),\cdots,h(a_n)\big), then we say it is an embedding. Once hh is a bijective embedding, we call it an isomorphism.

Then, for two structures A\mathfrak{A} and B\mathfrak{B} with ABA\subseteq B, we say A\mathfrak{A} is a substructure of B\mathfrak{B} if the inclusion map is an embedding, denoted as AB\mathfrak{A}\subset\mathfrak{B}.

As we have seen in group theory, we can define a substructure generated from a subset. By simple analysis, one may realize that the intersection of a family of substructures is a substructure. Then for a subset SS of a structure A\mathfrak{A}, there is a smallest substructure that contains SS, which is denoted by SA\langle S\rangle^{\mathfrak{A}} and is called the substructure generated by SS. If SS is finite, then A\mathfrak{A} is called finitely generated.

Moreover, SA={tA(s1,,sn); t(x1,,xn) is a term and s1,,snS}\langle S\rangle^{\mathfrak{A}}=\big\{t^{\mathfrak{A}}(s_1,\cdots,s_n);\ t(x_1,\cdots,x_n)\text{ is a term and }s_1,\cdots,s_n \in S\big\}

The World of Theories

However, we do not really care about language when learning mathematics, right? Everything we deal with when doing maths is the sentences, which we generally call axioms, theorems and propositions.

In mathematical logic, a theory is a set of sentences, such as the axioms of groups, rings and fields.

Generally, we are going to introduce the deduction of sentences, which is quite important in the first order language. However, we won’t describe it here, since it is not the topic of Model Theory, and we only give some conclusions.

If a theory has a model of it, which means there is a structure such that every sentence in it is satisfied by the structure, it is called consistent. Consistency means there are no inner contradictions within the set of sentences. This definition is also usable on theories; for two sets of sentences Σ\Sigma and TT, we say Σ\Sigma is consistent with TT if ΣT\Sigma\cup T is consistent.

For two sentences φ\varphi and ψ\psi, if any model of φ\varphi is a model of ψ\psi, i.e., for any A\mathfrak{A}, Aφ\mathfrak{A}\models\varphi if and only if Aψ\mathfrak{A}\models\psi, then we write φψ\varphi\vdash\psi. This definition can be generalized to two theories TT and SS easily: TST\vdash S if all models of TT are models of SS. And we say TT and SS are equivalent if they have the same models, which means TST\vdash S and STS\vdash T, denoted as TST\equiv S.

Given a language LL and an LL-theory TT, we say TT is complete if for any LL-sentence φ\varphi, either TφT\vdash\varphi or T¬φT\vdash\neg\varphi. It’s quite clear that the relation TφT\vdash\varphi is independent of the language, but the completeness does depend.

Before we give a characterization of completeness, a characterization of formulas and some kinds of sets are going to be introduced first.

We call a formula basic if it is an atomic formula or it is the negation form of some atomic formula. And the formula constructed from basic formulas with the logic symbols ,,,\exists,\forall,\wedge,\vee is said to be in negation normal form.

Actually, every formula is equivalent to a formula in negation normal form. One only needs to realize that φψ\varphi\rightarrow\psi is equivalent to ¬φψ\neg\varphi\vee\psi, φψ\varphi\leftrightarrow\psi is equivalent to (¬φ¬ψ)(φψ)(\neg\varphi\wedge\neg\psi)\vee(\varphi\wedge\psi), and De Morgan’s law allows “putting” a ¬\neg into a given formula.

Some Sets

Given a formula with some free variables φ(x1,,xn)\varphi(x_1,\cdots,x_n), there is a set φ[A]={(a1,,an); Aφ(a1,,an)}.\varphi[\mathfrak{A}]=\left\{(a_1,\cdots,a_n);\ \mathfrak{A}\models\varphi(a_1,\cdots,a_n)\right\}. It is called the realization of φ\varphi. We can see this as a generalization of RAR^{\mathfrak{A}} for some RR a relation.

More generally, given a subset BB of the universe of a structure AA, for some L(B)L(B)-formula φ\varphi, φ[A]\varphi[\mathfrak{A}] is said to be a BB-definable set. In particular, the realization of φL\varphi\in\mathcal{L} is a 00-definable set (the notation 00 denotes the empty set in set theory).

For a structure A\mathfrak{A}, the atomic diagram is defined as Diag(A)={φ; AAφ and φ is a basic sentence}.\mathrm{Diag}(\mathfrak{A})=\{\varphi;\ \mathfrak{A}_{A}\models\varphi \text{ and }\varphi \text{ is a basic sentence}\}. And the theory of A\mathfrak{A} is defined as Th(A)={φ; Aφ}.Th(\mathfrak{A})=\{\varphi;\ \mathfrak{A}\models \varphi\}. Two models are called elementarily equivalent if they have the same theory, written AB\mathfrak{A} \equiv \mathfrak{B}.

With these, we can characterize the completeness of a theory. Let TT be a consistent theory; then the following are equivalent:

  1. TT is complete.
  2. All models of TT are elementarily equivalent.
  3. There exists a model A\mathfrak{A} with TTh(A)T\equiv Th(\mathfrak{A}). To prove it, we separate it into 3 parts. "121\Rightarrow2": Suppose two models A\mathfrak{A} and B\mathfrak{B} of TT. For any φ\varphi, without loss of generality, suppose TφT\vdash\varphi, then Aφ\mathfrak{A}\models\varphi and Bφ\mathfrak{B}\models\varphi. "232\Rightarrow3": If there is no model satisfying 3, since TT is consistent, there must be some A\mathfrak{A} such that Th(A)Th(\mathfrak{A}) is a proper extension of TT. Let φTh(A)T\varphi\in Th(\mathfrak{A})\setminus T, then T{¬φ}T\cup\{\neg\varphi\} is still consistent, and then the model B\mathfrak{B} of T{¬φ}T\cup\{\neg\varphi\}, which is also a model of TT, is not elementarily equivalent to A\mathfrak{A}, which contradicts 2. "313\Rightarrow1": By definition.

SS-Sorted Structure

As the closure of this chapter, we introduce the so-called SS-sorted structure. Basically, it is something assigning types to a language, which is just like what TypeScript is to JavaScript.

Formally, given a set of sorts SS whose elements are sorts, a language LL is a set of constants assigned to some sorts, functions’ domains and ranges are constrained by some sorts, and relations with sorts. The SS-sorted structure is a structure whose universe is a family (As)sS(A_s)_{s\in S} of non-empty sets, and

  1. cAAsc^{\mathfrak{A}}\in A_s,
  2. fA:As1××AsnAtf^{\mathfrak{A}}: A_{s_1}\times\cdots\times A_{s_n}\rightarrow A_t,
  3. RAAs1××AsnR^{\mathfrak{A}} \subset A_{s_1}\times\cdots\times A_{s_n}.