0% found this document useful (0 votes)
55 views13 pages

Compactness Theorem & Ultraproducts

The document discusses advanced mathematical concepts including the Compactness Theorem, ultraproducts, and definable sets, providing a self-contained account of the material. It explains the Compactness Theorem's implications for infinitesimals in non-standard real structures and introduces ultraproducts as a method to construct such structures. Additionally, it outlines the properties of filters and ultrafilters in the context of product structures in mathematics.

Uploaded by

smithojack03
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
55 views13 pages

Compactness Theorem & Ultraproducts

The document discusses advanced mathematical concepts including the Compactness Theorem, ultraproducts, and definable sets, providing a self-contained account of the material. It explains the Compactness Theorem's implications for infinitesimals in non-standard real structures and introduces ultraproducts as a method to construct such structures. Additionally, it outlines the properties of filters and ultrafilters in the context of product structures in mathematics.

Uploaded by

smithojack03
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 13

Contents

1 The Compactness Theorem v.1 1

2 Ultraproducts 2
2.1 Products . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2

3 Definable sets 6

4 Los’ Theorem 7

5 Some proofs 8

6 A few final comments 10

7 Exercises 12

These notes have things, like details and sentences, which might be missing
from what I write on the board/show on beamer slides. It is meant to be a fairly
complete and self-contained account of the material that I will talk about.

1 The Compactness Theorem v.1

Question 1.1. Is 0.9 = 1.0?


Consider the difference  = 1 − 0.9; what can we say about it? It’s easy to
see that we must have  < 1/n for every positive integer n but why shouldn’t
we be able to add the condition  > 0? We can appeal to the Compactness
Theorem.

Theorem 1.2. (Compactness Theorem v.1) If you want something and there’s
no reason you can’t have it, then you can get it.
(There is some small print; we’ll get to that.)
The Compactness Theorem applies to our question about : let’s consider
the set { < 1/n : n ∈ Z+ } ∪ { > 0} of conditions on . If we take any finitely
many of these, then there is a solution in the reals R, so “there’s no reason we
can’t have” . Admittedly the Completeness Property for the reals does exclude
there being a solution in R but, and this is part of the small print, we might
have to move to get the “something” in the theorem.
What that means in this example is that there is a structure - a non-standard
version, R∗ , of the reals - which has an infinitesimal (a solution to all those
conditions). This structure R∗ will share a great many properties with the reals
and it will have a copy of the reals sitting nicely inside it. But it will contain
an infinitesimal, with all that that implies. For example, since R is an ordered
field, so also will be R∗ , so R∗ will contain elements, such as −1 , greater than
every integer and it will contain many infinitesimals,  + , 2 , 3 , ... But in R∗ ,
we will have 0.9 < 1 −  < 1, giving an alternative to the standard answer to
the original question.

1
The Compactness Theorem, properly stated, says that if we have a set of
conditions (of a certain form, which we will discuss later) such that every finite
subset has a solution in some structure M , then there will be an “elementary
extension” M ∗ of M which contains a simultaneous solution to all the conditions.
Of course it might be that the original structure M already contains a solution
but it may be that, as in our example the original structure contains no solution
and we do have to move to an elementary extension.
The definition is: M ∗ is an elementary extension of M (equally, M is
an elementary substructure of M ∗ ) if, whenever ϕ(x) is a formula (in n
free variables and parameters from M ) then the solution set ϕ(M ) in M is the
intersection of M n with the solution set ϕ(M ∗ ) in M ∗ . I’ve not defined what
is meant by a formula (with parameters) but maybe this gives some flavour of
the idea at this stage. Here’s an example as illustration. Can −1 be a square
in R∗ ? No, because, if so, then it would be in the solution set in R∗ of the
formula ∃y (y 2 = x). So it would be in the intersection of that solution set with
R, hence in the solution set of the same formula in R - which is not the case.
Here, ∃y (y 2 = x) is an example of a “formula”; an example of a formula with
parameters is “x3 = π and ∀y (y 3 = π → y = x)”.
Formulas with no free variables are just statements (or “sentences”) and
if M is an elementary substructure of M ∗ then M and M ∗ must satisfy the
same sentences. For instance R∗ will be densely ordered because R satisfies
the sentence “∀y, z (y < z implies ∃w (y < w and w < z)” which expresses the
“densely ordered” property.
Of course, the “formulas” and “sentences” above mean something precise -
they can’t just be any old statements. We’ll come to that.

2 Ultraproducts
A non-standard version of the reals containing an infinitesimal can be con-
structed using ultraproducts. These can also be used to give a proof of the
Compactness Theorem (which we will state properly after we have seen all the
concepts needed to express it precisely). Let’s introduce ultraproducts via a
simpler kind of example: we will take one copy of each of the finite prime fields
(the integers mod p, Zp = Z/pZ, where p is prime) and produce a new field
from them. But first we need to define products of structures. Likely you have
seen the construction of the product of two groups but maybe not of a possibly
infinite collection of groups (or rings or ...). Just for simplicity of exposition we
will suppose there are countably many structures to put together into a product
- so that we can use the positive integers to index them.

2.1 Products
Suppose that M1 , M2 , . . . , Mi , . . . (that is, {Mi : i ∈ Z+ }) are structures all
of the same kind (all groups, or rings, Q∞ or partially ordered sets, or...). Their
product is, as a set, the product i=1 Mi = M1 × M2 × · · · × Mi × . . . of their
underlying sets. The elements of this set are the sequences a = (ai ) = (ai )i =
(ai )i∈Z+ = (a1 , a2 , . . . , ai . . . ) (to show the plethora of notations that we will use
for the same thing) with ai ∈ Mi for every i. The structure is defined on this set
pointwise. For instance, if we have a binary operation, denoted + say, in each

2
structure (to beQmore precise, +i is the operation in Mi ), then we define the
operation + on i Mi by (ai )i + (bi )i = (ai + bi )i (more precisely, (ai +i bi )i ).1
Applying this to our prime fields (with Q Mi the integers mod pi where pi
is the ith prime), we get the structure p Zp where p ranges over the primes.
The structure on this is that of a ring: there’s an addition and a multiplication,
both defined coordinatewise, an identity 1 = (1p )p for the multiplication and
an identity 0 = (0p )p for the addition, where 1p denotes the image of 1 ∈ Z in
Zp and similarly for 0. For example, 1 + 1 = (02 , 23 , 25 , 27 , 211 , . . . ), 1 + 1 + 1 =
(12 , 03 , 35 , 37 , 311 , . . . ), etc. (using a hopefully self-explanatory notation).
This makes the product into a commutative ring. For example, to check
commutativity of the multiplication: (ai )i × (bi )i = (ai ×i bi )i (by definition
of the structure on the product) = (bi ×i ai )i (since each component structure
is commutative) = (bi )i × (ai )i (by definition)). It is not, however, a field:
e.g. (1, 0, 0, 0, . . . ) × (0, 1, 0, 0, . . . ) = (0, 0, 0, 0, . . . ).
We can get a field from this product by collapsing elements2 . By which I
mean: define an equivalence relation on the product, form the set of equivalence
classes and induce a structure on that set (rather as we do in forming Z5 from
Z). Because the additive group structure is there, so we have cosets, it’s actually
enough to specify which elements get collapsed with 0 but, to keep displaying
the general in the specific, we’ll not use that.
The idea is that we collapse (declare to be equivalent) elements which agree
on a “large” set of coordinates.
What should we mean by a “large” set of coordinates? To recap: suppose
that I is an infinite3 index set (in our examples we always take I = Z+ but the
general case is just as easy). Suppose that, for each i ∈ I, we have a structure Q Mi
(and all these structures are of the same kind). We form the product i∈I Mi
and we will identify/collapse elements which agree on a large set of coordinates.
How do we decide which subsets of I should count as “large”?
Certainly I itself should be a large subset (equal elements should be identi-
fied) and the empty set ∅ should not be (collapsing all elements together would
not give an interesting result). If J ⊆ I is large and J ⊆ K ⊆ I then surely
K should also be large. If we’re going to identify a and b and also identify b
and c then we’re going to have to identify a and c (“identification” will be an
equivalence relation). If J = {i ∈ I : ai = bi } and K = {i ∈ I : bi = ci } are
the, “large”, sets where these pairs of elements agree, then all we can really say
about the set of coordinates where ai = ci is that it contains J ∩ K; so it looks
as if we should require this set to be large. Let’s extract those conditions.
Definition 2.1. If I is a set then a filter on I is a collection F of subsets of
I such that:
• I ∈ F;
•∅∈ / F;
• if J ⊆ K ⊆ I and J ∈ F then K ∈ F;
1 In the example that we’re about to consider, there will be two binary operations, + and

×, so you can see that we really should be a bit more precise about how we specify what we
mean by “structures all of the same kind”. That can be done - see any introduction to model
theory - but we won’t need to do it in the example-based treatment here.
2 If you know some commutative algebra then you will know the process about to be

described is equivalent to factoring by a maximal ideal, which you will likely spot in the
subsequent discussion.
3 Finite index sets will give nothing new.

3
• if J, K ∈ F, then J ∩ K ∈ F.
Note that, as a consequence of these clauses, if a subset J ⊆ I is large then
its complement J c = I \ J cannot be large.
A filter F is, in fact, enough
Q to do a collapsing: we define an equivalence
relation ∼ on the Q product i∈I M i by (ai )i ∼ (bi )i iff {i ∈ I : ai = bi } ∈
F. Denote by i∈I Mi /F the set of Q equivalence classes, writing a/ Q∼ for the
equivalence class of any element a ∈ i∈I Mi . We can then turn i∈I Mi /F
into a structure of the same kind as the Mi , defining operations and relations
pointwise. This structure is called the reduced product of the Mi (with
respect to the filter F). If all the structures Mi are copies of the same structure
M then we use the notation M I /F and refer to this as a reduced power of
M.
Let’s do this with the field example above, taking F to be the set of cofinite
subsets of the set of primes (those with a finite complement in the index set).
We can then define the algebraic operations by setting (ap )p / ∼ +(bp )p / ∼
= (ap + bp )p / ∼ and (ap )p / ∼ ×(bp )p / ∼ = (ap × bp )p / ∼. It has to be
checked that this is well-defined (e.g. that if (a0p )p ∼ (ap )p and (b0p )p ∼ (bp )p
then (a0p + b0p )p ∼ (ap + bp )p ) but the conditions in the definition of a filter
include what we need to do this. We can then check that (0p )p / ∼ is the zero
for addition and (1p )p / ∼ is the identity for multiplication
Q and, indeed, that all
the axioms for a commutative ring are satisfied by p Zp /F. However, it’s still
not a field: split the primes into two infinite disjoint subsets J and K. Define
the element a to be 1 on the indices in J and 0 on those in K; define b vice
versa. Their product is 0 but neither is 0, so this ring is not even an integral
domain, let alone a field.
Q Q
Exercise 2.2. Prove that the map p Zp → p Zp /F is a surjective homomor-
phism of rings and identify its kernel.
So we need to go further, and impose a further condition on a filter.
Definition 2.3. An ultrafilter U on a set I is a filter on I which satisfies the
further equivalent conditions:
• for each J ⊆ I either J ∈ U or I \ J ∈ U;
• if J ∪ K ∈ U then either J ∈ U or K ∈ U;
• U is a maximal filter (meaning that no collection of subsets of I can be a filter
and properly include all the sets in U).
So an ultrafilter splits the subsets of I into “large” ones (those in U) and
“small” ones (those not in U, equivalently, whose complement is in U). “Small”
does not really mean small (say in the sense of cardinality), just small according
to U. But we do have that the union of two “small” sets is still “small” (this is
the second property above).
Let’s continue
Q our example using an ultrafilter and check that the quotient
ring F = p Zp /U, is a field: if a/ ∼ = (ap )p / ∼ =6 0 then Z = {p : ap = 0} ∈ / U.
Since we have an ultrafilter, it follows that Z c = {p : ap 6= 0} ∈ U. But whenever
ap 6= 0, ap has an inverse, bp say. Set b = (bp )p (where, if p ∈ Z, set bp = 0 say).
Then ab/ ∼ = 1 so a/ ∼ has a multiplicative inverse, b/ ∼, as required.
In this case, where we collapse using an ultrafilter rather than any old filter,
we use the terms ultraproduct and ultrapower.
There is one type of ultrafilter that is not interesting. Suppose that i0 ∈ I.
Set U(i0 ) = {J ⊆ I : i0 ∈ J}. Then, exercise, U(i0 ) is an ultrafilter, called the

4
principal ultrafilter generated by i0 . When it comes to the ultraproduct
Q
i∈I Mi /U(i0 ) all that matters is what happens at the coordinate i0 and, an-
other exercise, this ultraproduct is isomorphic to Mi0 , so we got nothing new
from the construction. Therefore we will consider only non-principal ultrafilters.
But first - are there any?
If I is any infinite set then the collection of all cofinite sets is a filter, some-
times called the Fréchet filter F0 . If U is any ultrafilter containing F0 then
U cannot be principal, and conversely (quick exercise). We do need to call on
Zorn’s Lemma to give the existence of a maximal=ultra filter containing any
given filter but that can be proved (and is an exercise for those who have seen
Zorn’s Lemma).
In the special case where all the ‘component’ structures are the same, M , say,
there is a natural map from M to any reduced product, M ∗ = M I /F, given
by taking a ∈ M to (a)i / ∼ (the equivalence class of the constant sequence
(a)i ). It’s easy to check that this is an embedding of M into M ∗ , termed the
diagonal embedding. So M ∗ has a copy of the original structure sitting inside
it. Nothing like this is the case when the components are all different - we saw
an ultraproduct of finite fields giving a field of characteristic 0 (which cannot,
therefore, contain any finite field).
Exercise 2.4. Take the index set I to be the set of positive integers and, for each
i, set Mi to be the ring of integers. Let U be any non-principal ultrafilter on I.
Consider the element p = (pi )i / ∼ where pi is the ith prime. Show that p is a
prime element of Z∗ = ZI /U and is not equal to any standard prime (thinking
of those as sitting inside the diagonal copy of Z in Z∗ ).
In contrast, if we let bi be the product of the first i primes, then b = (bi )i / ∼
has every prime as a factor - so we have an element which has infinitely many
prime divisors (exercise; and, further exercise, show that it has a prime divisor
different from any standard prime).
You might wonder whether every element of Z∗ , apart from ±1, must have
at least one prime divisor. The, not so obvious, answer is “yes”; this follows
directly from Los’ Theorem.
Theorem 2.5. (Los’ Theorem v.1) If all the component structures (or even just
a “large” set of them) have a certain property4 , then their ultraproduct also has
that property.
That’s why the ultraproduct of fields is a field, not just a ring. And also
why our example above is infinite: because, given any number N , all but finitely
many, hence a “large” set of, components have cardinality greater than N . By
Los’ Theorem, the ultraproduct must have cardinality greater than N . That is,
for every N , the ultraproduct is infinite.
Relations on ultraproducts: I said how we define the algebraic structure
on an ultraproduct; what if there are some relations around, for example an
ordering <? We follow the same idea, defining (ai )i / ∼ < (bi )i / ∼ iff
{i : ai < bi } ∈ U (using notation as before). Similarly for other types of
relation - roughly, elements in the ultraproduct are in a relation if the set of
coordinates where their components are in that relation is “large” (meaning, is
in the ultrafilter).
4 There is some, very important, small print here: the property must be expressible in terms

of definable sets, equivalently in some appropriate formal language

5
Finally, let’s come back to infinitesimals. Recall that the requirement on an
infinitesimal is that it should be a solution to the set of conditions:
0 < x and, for every positive integer n, nx < 1.
We’ll take our index set I to be the set of positive integers. For each n ∈ I,
take the structure Mn to be the reals R. Choose (rather, apply Zorn’s Lemma
to get) a non-principal ultrafilter U and form the corresponding ultrapower, R∗ .
This contains the  diagonally-embedded copy of the reals. Consider the element
 = (1/n)n / ∼ ∈ R∗ . Each component is > 0 so, from the way we define the
ordering in the ultraproduct,  > 0. Also, given a positive integer n, for all but
finitely many k, the kth component of  is < 1/n; so, again by the definition
of the relation in the ultraproduct,  < 1/n. Thus we have our infinitesimal, ,
but we had to move to some “non-standard” version R∗ of the reals to get it.

3 Definable sets
We have seen that, in an ultraproduct, equations, like commutativity for the
multiplication in a ring, and conditions built from equations (like having an
inverse with respect to multiplication) somehow reflect those from its component
structures. This is the key to making the precise statement of Los’ Theorem.
So we will look at solution sets of equations - these are examples of ‘definable
sets’. But we will also consider sets of solutions of more complicated (than
equations) conditions - conjunctions of equations, inequations, even projected
(to some components) solution sets of equations. In fact, if we take a structure
M and consider the collection of subsets of powers of M which are solution
sets of equations (in any finite number of unknowns), and then close under the
operations that we just mentioned (forming finite intersections, complements,
and projections - but in general repeatedly, to get increasingly complicated sets),
then we obtain the definable sets. Let’s give a more formal definition.
Definition 3.1. Let M be a structure. Let x1 , . . . , xn be variables (“unknowns”)
and let t, t0 be two terms built up from these variables, using the algebraic op-
erations and also allowing the constants, if there are any, to appear. We write
t(x) to emphasise the variables we chose. We refer to t = t0 as an equation and
we define its solution set to be {a ∈ M n : t(a) = t0 (a}.
Example 3.2. Suppose that we take a field K for our structure (with the ring
operations, +, × and constants 0, 1). Then a term built from variables x =
(x1 , . . . , xn ) is essentially a polynomial, with integer coefficients, in those vari-
ables. So, if p, p0 are such terms/polynomials, then the solution set of the
equation p = p0 is the subset of K n consisting of all a where p and p0 take the
same value. Another way of saying this is that the solution set is the zero-set
of the polynomial p − p0 . (You may know these already as subvarieties of affine
space over K - so these are definable subsets of K or, to say it better, subsets of
K n definable in K. As I’ve said it, these would just be the subvarieties defined
over - that is, zero-sets of polynomials with coefficients in - the prime field, Q
or some Zp . To get more general subvarieties we should allow elements of K to
appear as parameters in our formulas; then they can refer to polynomials with
coefficients in K.)
Definition 3.3. Suppose first that M is a purely algebraic structure (meaning
the “structure” is given by operations and constants - no relations). The de-

6
finable subsets of (the various finite powers of ) M are the sets obtained as
follows:
• the solution set of every equation t = t0 is a definable subset;
• the complement, M n \ D of any definable subset D of M n is definable;
• the intersection of any two definable subsets of M n is definable (therefore, in
view of the previous clause, their union also is definable);
• if D is a definable subset of M n and i is any of {1, . . . , n} then the image of
D under projection along the ith axis, that is {(a1 , . . . , ai−1 , ai+1 , . . . , an ) : ∃a ∈
M with (a1 , . . . , ai−1 , a, ai+1 , . . . , an ) ∈ D}, is a definable subset of M n−1 .
If the structure M also has relations then we just add those in at the begin-
ning, along with the solution sets of equations. This does make sense, since an
n-ary relation on a set M is, formally, a subset of M n . For instance, a partial
order “<” is treated formally as a set of pairs - exactly those pairs (a, b) with
a < b. So {(a, b) ∈ M 2 : a < b} would be one of the basic definable subsets.
Every definable set has a definition. If you have done a course on Predicate
Logic, you will know that the definitions are given by formulas of a formula
language, that is, the definable sets are those which are solution sets of formulas
in the predicate language appropriate for M . Even if not, you might note
that the operations we used are the “boolean” ones, of complement (“not”)
and intersection (“and”) (and hence union (“or”)), together with existential
quantification = projection (“there exists”) (and hence, using complements,
universal quantification (“for all”)).
Exercise 3.4. Convince yourself, through a variety of examples, that if D is a de-
finable subset of a structure M then there is a formula ϕ, with free=unquantified
variables (among) x1 , . . . , xn , such that D is the solution set in M of ϕ. Further-
more, ϕ has the advantage that it can be applied to any structure that is of the
same kind as M and its meaning will be ‘the same’, meaning that it expresses
the same property (though, of course, its solutions will be very different).
So, from now on we will use formulas when we talk about definable sets
in more than one structure. For instance, the centre of a group is defined by
∀x2 (x1 ∗ x2 = x2 ∗ x1 ), where ∗ denotes the operation in the group). So the
centre of any group G is a definable subset of G but the formula ϕ(x1 ) that we
just wrote down is more general - it can be read in any group and its solution
set, which we denote ϕ(G), is exactly the centre of that group.

4 Los’ Theorem
Los’ Theorem is about what is true in an ultraproduct. It says that an element
of an ultraproduct has a certain property (that can be expressed in terms of
definable sets) iff this is true for a “large” set of its components. Furthermore
an ultraproduct has a certain property (which can be expressed by sentences of
a suitable language) iff “most” of its coordinate structures have that property.
Theorem 4.1. (Los’ Theorem v.2.1) Suppose that M ∗ = i∈I Mi /U is an
Q
ultraproduct. Suppose that ϕ is a formula (with free variables x1 , . . . , xn ). Then
a = (a1 , . . . , an ) is in the solution set, ϕ(M ∗ ), of ϕ in M ∗ , iff {i ∈ I : ai ∈
ϕ(Mi )} is in U, where aj = (aji )i / ∼ and ai = (a1j , . . . , anj ).
Returning to infinitesimals, definable subsets of R include solution sets to
equations and inequalities, so the requirements that characterise an infinitesimal

7
are fine (in that each can be expressed by a suitable formula - say in the language
of ordered rings). And any finitely many of these requirements can be satisfied
in R (that is, the intersection of the corresponding finitely many definable sets
is nonempty). So the construction that we gave earlier is just a special case of
Los’ Theorem.
Notice that, if U is a non-principal ultrafilter, hence every cofinite subset
of the index set is in U, an element a of the ultraproduct will have a property
which can be expressed by a formula ϕ(x) provided some ∼-representative of a
has all but finitely many of its components satisfying that property. Although
this condition is sufficient, it is by no means necessary for a to have the given
property but I mention it explicitly because it often is used in particular cases.
We’re not done with stating Los’ Theorem yet. The version above applies
to properties of elements, but what about properties of the whole structure?
For instance, the property that a commutative ring is a field. That particular
property can be expressed by a formula, ∀x (x 6= 0 → ∃y (xy = 1)), which
has no free variables. Such a formula is said to be a sentence. Sentences
express properties of structures, rather than elements. Here’s the version of
Los’ Theorem that applies to them.
Theorem 4.2. (Los’ Theorem v.2.2) Suppose that M ∗ = i∈I Mi /U is an
Q
ultraproduct.
(a) Suppose that ϕ is a formula (with free variables x1 , . . . , xn ). Then a =
(ai )i / ∼ is in the solution set, ϕ(M ∗ ) of ϕ in M ∗ , iff {i ∈ I : ai ∈ ϕ(Mi )} is in
U.
(b) Suppose that σ is a sentence. Then σ is true in M ∗ iff {i ∈ I : σ is true in Mi }
is in U.

To emphasise: “formula” and “sentence” have very precise meanings here -


they refer to formulas constructed from the basic algebraic relations, constants,
and relations which give meaning to the phrase “type of structure” and where
“constructed” means constructed using the boolean operations and quantifiers
(the operations that we use when constructing definable sets). Of course, there
has to be some kind of restriction: we know, for instance, that the condition
“there are only finitely many elements” cannot be expressed by such a sentence
(otherwise we could make an ultraproduct which would contradict Los’ Theo-
rem). (On the other hand, saying “there are no more than N elements” is, for
any particular natural number N , certainly expressible in any language - all we
need is equality to say that.)

5 Some proofs
Let’s sketch the proof of Los’ Theorem and derive the Compactness Theorem
from it. We’ll do the latter first.

Theorem 5.1. (Compactness theorem, v.2a) Suppose that we have a set Σ of


sentences in a language appropriate for some specific type of structure. Suppose
that, for every finite subset S of Σ, there is a structure MS of that kind which
satisfies all the sentences in S. Then there is an ultraproduct of the MS which
satisfies all the sentences in Σ.

8
Proof. We take the index set I to be the set of finite subsets S of Σ, with the
structure being indexed by S being MS . Any old ultrafilter won’t do, so we
first set up a filter (really a ‘filter basis’) by taking, for each S ∈ I, the subset
hSi = {S 0 ∈ I : S ⊆ S 0 } of I. Note that if S, S 0 ∈ I then hSi ∩ hS 0 i ⊇ hS ∪ S 0 i.
It follows (exercise) that the set F = {J ⊆ I : J ⊇ hSi for some S ∈ I} of
subsets of I which contain some set of the form hSi, is a filter. So call on Zorn’s
Lemma to bring an ultrafilter Q U on I (necessarily non-principal) containing F.
Form the ultraproduct M ∗ = S∈I MS /U. I claim that this does the job.
So take any sentence σ ∈ Σ. Then {σ} ∈ I, so h{σ}i = {S ∈ I : σ ∈ S} is in
the filter base, hence in F, hence in U. Note that if S ∈ h{σ}i then MS satisfies
σ. Therefore the set of indices S where the structure MS satisfies σ is in U and
hence, by Los’ Theorem, M ∗ satisfies σ. As required. 

Theorem 5.2. (Compactness theorem, v.2b) Suppose that M is a structure and


that ϕ is a set of formulas with free variables (among) x = (x1 , . . . , xn ) (in a
language appropriate for M ). Suppose that, for every finite subset S of ϕ, there
is a ∈ M n which satisfies all the formulas in S. Then there is an ultrapower
M ∗ of M and c ∈ (M ∗ )n which satisfies all the formulas in ϕ.

Proof. The proof is quite similar to that above. In fact, it can be made into
a special case by introducing n new constant symbols to replace the variables
x1 , . . . , xn , so that a formula ϕ(x) can be replaced by a sentence in this slightly
enriched language, thus replacing ϕ by a set of sentences, which can then be fed
into the version above. 

Sketch proof of Los’ Theorem: So suppose that M ∗ = i∈I Mi /U is an


Q
ultraproduct.
The assertion is that, given a formula ψ,
(∗) for all a, we have a ∈ ψ(M ∗ ) iff {i ∈ I : ai ∈ ψ(Mi )} is in U
where x = (x1 , . . . , xn ), a = (a1 , . . . , an ) ∈ (M ∗ )n and aj = (aji )i / ∼.
This is proved “by induction on the complexity of ψ”. This is “complexity” in
the sense that equations are the least complex formulas (so that starting point of
the induction) and then “complexity” is increased each time we apply a boolean
operation (“and”, “or”, “not”, “implies”) or a quantifier (“there exists”, or “for
all”). Because “not”, “and” and “there exists” are enough to define the others,
we only need to consider those. So we have the base cases - where ψ is an
equation - and three types of induction step. Here are the statements that,
therefore, have to be proved.
If t and t0 are terms built from variables y = (y1 , . . . , ym ) (and perhaps
constants) and b = (b1 , . . . , bm ) ∈ (M ∗ )m where bk = (bki )i / ∼, then t(b) = t0 (b)
iff {i ∈ I : t(bji ) = t0 (bji )} ∈ U. This statement, in turn, has to be proved by
induction on complexity of terms (how they are built up from the variables and
constant symbols by successively applying function symbols). Doing this is left
as an exercise.
That’s the base case; the induction steps have the following (three) forms.
If ψ and ψ 0 are formulas and if each of these satisfies (∗), then so does the
conjunction ψ ∧ ψ 0 [the proof uses the closure of U under intersections].
If ψ is a formula which satisfies (∗), then so does the negation ¬ψ (this proof
of this uses that U is actually an ultrafilter).

9
These two cases are left as an exercise. I’ll do the third here.
If ψ is a formula which satisfies (∗) and y is a variable then ∃y ψ satisfies (∗)
(it doesn’t matter whether or not y actually appears in ψ, though it’s rather
pointless to stick ∃y in front if y doesn’t appear in ψ). Let’s look at this one
more closely (you might guess that this will use the “closed upwards” property
of filters; let’s see). Suppose then that a ∈ (∃y ψ(x, y))(M ∗ ). Then there is
b ∈ M ∗ such that (a, b) ∈ ψ(M ∗ ). So, by the induction hypothesis, {i ∈ I :
(ai , bi ) ∈ ψ(Mi )} ∈ U. Now, this set is certainly contained in {i ∈ I : a ∈
(∃y ψ(x, y))(Mi )}, so this set is in U, proving one direction of (∗) (and, indeed,
using the upwards-closed property). We must check the other direction.
So suppose K = {i ∈ I : a ∈ (∃y ψ(x, y))(Mi )} ∈ U. For each index i in
this set, choose a “witness” to the existential quantifier. That is, choose some
ci ∈ Mi such that (ai , ci ) ∈ ψ(Mi ). Define the element c ∈ M ∗ to be (ci )i / ∼
where, for i ∈ I \ K, you can choose ci to be any element in Mi (since I \ K
is a “small” set, it doesn’t matter what happens on any of the corresponding
components). So, {i ∈ I : (ai , ci ) ∈ ψ(Mi )} ∈ U and so, by the inductive
hypothesis, (a, c) ∈ ψ(M ∗ ). Therefore a ∈ (∃y ψ(x, y))(M ∗ ), as required.

6 A few final comments


In case you were wondering, the term “compactness” does refer to the notion
from topology (which is that a subset of a topological space is compact if every
open cover of it has a finite subcover). The topological space in this case is
the space of “types” (roughly, descriptions of actual or potential elements).
The “conditions” that appear in the Compactness Theorem (and that are given
by formulas) define closed subsets of this space of types, and it follows from
the topological compactness property that, if the intersection of a collection of
closed sets is empty (corresponding to an infinite set of conditions having no
solution anywhere), then already the intersection of some finite subset is empty
(corresponding to a finite reason why there should be no solution).
Ultraproducts were more commonly used in the past than now, having been
replaced by realising types in (perhaps “saturated”) elementary extensions.
These are quicker to use and rather less explicit than ultraproducts - often one
does not need the level of detail which goes into constructing an ultraproduct.
But in some applications the extra detail/control is useful. Also ultraproducts
probably appeal more to people who are not familiar with mathematical logic
since the ultraproduct construction is essentially an algebraic one.
Ultraproducts generally produce large structures: it is the case that, pro-
vided the ultrafilter is (non-principal and) not ω1 -complete, any ultraproduct
using it will be either finite or have cardinality at least 2ℵ0 . An ultrafilter U is
T∞ if, for every countable set I1 , I2 , . . . , In , . . . of sets in U,
said to be ω1 -complete
their intersection n=1 In also is in U. Existence of an ω1 -complete ultrafilter
is independent of (ZFC) set theory. Meaning you’re not going to bump into one.
See [5, p.456, Exercise 4] (which I was led to by the discussion at [7]).
Model theory is a large subject. Very roughly, there is “pure” model the-
ory (where very general, core ideas are developed) and “applied” model theory,
meaning the model theory of this, that and the other, referring, for instance,
to various parts of algebra, analysis (both real and complex), algebraic (and

10
other types of) geometry, number theory, graph theory. (My own work mixes
algebra, model theory and category theory in the algebraic context of mod-
ules/representations of quivers and algebras.)
I have not been very precise about (formal predicate) languages, about what
goes into them and exactly how to define terms and formulas. Any introduction
to model theory or predicate logic should deal with this. For one example
among many, there is the “Appendix on Languages and Structures” which is
part the course notes for the model theory course that’s on my teaching webpage:
www.maths.manchester.ac.uk/∼mprest/teaching.html
I give a few references below, just as some interesting starting points for
reading around.

References
[1] James Ax, The elementary theory of finite fields, Ann. Math., 88 (1968),
239-271.
[2] John Bell and Alan Slomson, Models and Ultraproducts, North-Holland,
1969. [certainly not up-to-date but a readable introduction to various
things]
[3] Chang and Keisler, Model Theory (various editions). [It’s probably illegal
to have a model theory bibliography which omits this classic text. By the
way, Keisler’s name is pronounced as if it were “Kiesler”.]
[4] Greg Cherlin, Model Theoretic Algebra; Selected Topics, Springer Lec-
ture Notes in Mathematics, Vol. 521, 1976.
[5] Wilfrid Hodges, Model Theory, Cambridge University Press, 1993.
[6] Dave Marker, Model Theory; an Introduction, Graduate Texts in Math-
ematics, Vol. 217, 2002.

[7] Mathoverflow question “Can an ultraproduct be infinite countable?”,


https://mathoverflow.net/questions/156004/can-an-ultraproduct-be-
infinite-countable, accessed 15/7/17.
[8] Terry Tao, Ultraproducts as a bridge between discrete and continuous
analysis, https://terrytao.wordpress.com/tag/ultraproducts/, blog arti-
cle accessed 15/7/17.
[9] Katrin Tent and Martin Ziegler, A Course in Model Theory (first couple
of chapters), Cambridge University Press, 2012.
[10] Lou van den Dries and Alex Wilkie, Gromov’s theorem on groups of
polynomial growth and elementary logic, J. Algebra, 89 (1984), 349-374.
[Using model theory, gives a simple proof of Gromov’s theorem on growth
of groups at the same time as extending it to a larger class of groups.]

11
7 Exercises
Some exercises should be straightforward from what has been covered in the
lectures; some will need a little, or maybe more than a little, inspiration; some
are rather hard (a couple might be very difficult without getting some hints or
searching on the web). You are encouraged to work with others, especially on
the less straightforward ones.
Q Q
Exercise 7.1. Prove that the map p Zp → p Zp /F, where F is a filter on the
set of primes, is a surjective homomorphism of rings and identify its kernel.
Exercise 7.2. Show that if, in the ultraproduct construction, we use the principal
ultrafilter generated by an index i0 , then the resulting ultraproduct is isomorphic
to Mi0 . Feel free to do this for a specific type of structure (I’ve not really given
a general definition of “structure”).
Exercise 7.3. Prove that an ultrafilter contains the Fréchet filter iff it is non-
principal.
Exercise 7.4. Use Zorn’s lemma, if you know it or, having found it, can make
sense of it, to prove that every filter is contained in an ultrafilter.
Exercise 7.5. Assuming that every filter is contained in at least one ultrafilter,
show that if F is a filter which is not an ultrafilter, then F is contained in at
least two ultrafilters.
Q Let bi be the product of the first i primes and let b = (bi )i / ∼
Exercise 7.6.
∈ Z∗ = i∈I Z / U where U is a non-principal ultrafilter. Show that every
standard prime of Z is a factor of b. Rather harder: show that b has a prime
divisor different from any standard prime. Describe a prime of Z∗ which does
not divide b.
Exercise 7.7. Show that there are countably many finite graphs up to isomor-
phism.
Exercise 7.8. Take an ultraproduct of infinitely many copies of Z2 together
with infinitely many copies of Z3 . The result is a ring, even a field. What is its
characteristic? Equivalently, how many elements are in it?
The answer is not 2 12 .
Exercise 7.9. (somewhat open-ended) Let Mn (n ≥ 1) be the first n positive
integers {0, 1, . . . , n} regarded as a (totally) ordered set, so with the relation
“<”.
Let F0 be the Fréchet filter on the Q index set I = {1, 2, . . . , n, . . . }. What

can you say about the reduced product n=1 Mn /F0 ?
Also, let U be any ultrafilter containing F0 (thatQ∞is, any non-principal ultra-
filter). What can you say about the ultraproduct n=1 Mn /U?
Exercise 7.10. (very open-ended) Investigate the relation between definable sets
and formulas in some specific example(s). Take a definable set and find a for-
mula which defines it; take a formula and check that it can be expressed using
operations on definable sets (if you’ve not encountered formulas of formal lan-
guages before you might find that some formulas which you try are not allowable
- that is, not formulas in this precise sense). Take some “natural” subset of an
example and try to determine whether it is definable.
Exercise 7.11. Every finite planar graph is 4-colourable. Deduce that the state-
ment also is true for infinite planar graphs.

12
Details of the problem: The 4 Colour Theorem for graphs says that, for any
map which can be drawn on the plane, the regions can be coloured using just 4
colours in such a way that contiguous (sharing a border) regions have different
colours. Replacing each region by a vertex and joining two vertices by an edge
iff they are contiguous, this becomes the simpler (to deal with mathematically)
statement that, for any planar graph, the vertices can be coloured, using only 4
colours, so that vertices which are joined by an edge have different colours. The
statement was (probably) proved for finite graphs. The exercise is to derive the
statement for infinite planar graphs from the statement for finite planar graphs.
The problem is not simple because you can’t just choose a finite subgraph
of G, 4-colour it, add a few more vertices and colour them, et cetera. That’s
because you might have to recolour some vertices every time you do this, so it’s
not obvious that the process can be made to “settle down”. But it follows from
the exercise that we can - the problem is to choose appropriate colourings on
the finite subgraphs of G. We can’t really expect to do that but we can let Zorn
do the choosing for us!
Exercise 7.12. Fill in the remaining steps in the proof of Los’ Theorem.
Exercise 7.13. The basic Ramsey’s Theorem (infinite version) says that if G
is a graph (undirected) which is complete (every pair of distinct vertices has
one edge between them) with each edge being coloured either red or blue, then
there is an infinite monochromatic complete subgraph. Meaning that there is
an infinite set of vertices with every edge between them being red, or every edge
between them being blue. The proof is clever and short. If I have time, I’ll give
it in the lectures. Anyway, your task is to deduce the finite version from this.
The finite version says: “Given an integer k, there is an integer n(k) such
that, if G is a complete graph with at least n(k) vertices and with every edge
coloured either red or blue, then G contains a monochromatic complete subgraph
with at least k vertices.”
You can use the statement of the Compactness Theorem, or build an ultra-
product, as you prefer.
Exercise 7.14. Take the natural numbers N as an ordered set (I leave it to you
to decide whether 0 is natural or not; it makes no difference to the exercise).
What do elementary extensions of this structure look like?
Exercise 7.15. Every polynomial map f : Cn → Cn which is injective is sur-
jective. (More generally, this applies to polynomial maps between algebraic
varieties over an algebraically closed field.) This was proved [1] by James Ax
using the Compactness Theorem (and independently proved by Grothendieck).
You can find an account of this in Greg Cherlin’s book [4] and Dave Marker’s
book [6] (it does use a significant bit of field theory, but not a huge amount).
The key point is to note this over finite fields, then deduce it for the algebraic
closures of finite fields, then use an ultraproduct (or Compactness) to get it for
an algebraically closed field of characteristic 0, then (there are a couple of ways
to do this) deduce it for C.

13

You might also like