DM22
DM22
Departement Informatik
Diskrete
Mathematik
Ueli Maurer
Herbstsemester 2022
Das Teilgebiet der Mathematik, das sich mit diskreten Strukturen befasst,
heisst diskrete Mathematik. Der Begriff “diskret” ist zu verstehen als endlich oder
abzählbar unendlich. Viele Teilbereiche der diskreten Mathematik sind so wichtig,
dass sie vertieft in einer eigenen Vorlesung behandelt werden. Dazu gehören
die Theorie der Berechnung, also die Formalisierung der Begriffe Berechnung
und Algorithmus, welche in der Vorlesung “Theoretische Informatik” behan-
delt wird, sowie die diskrete Wahrscheinlichkeitstheorie. Eine inhaltliche Ver-
Vorwort wandtschaft besteht auch zur Vorlesung über Algorithmen und Datenstruk-
turen.
In dieser Lehrveranstaltung werden die wichtigsten Begriffe, Techniken und
Resultate der diskreten Mathematik eingeführt. Hauptziele der Vorlesung sind
Viele Disziplinen der Wissenschaft, insbesondere der Natur- und Ingenieurwis- nebst der Behandlung der konkreten Themen ebenso die adäquate Model-
senschaften, beruhen in einer zentralen Weise auf der Mathematik. Einerseits lierung von Sachverhalten, sowie das Verständnis für die Wichtigkeit von Ab-
erlaubt die Mathematik, Sachverhalte zu modellieren und damit den Diskurs straktion, von Beweisen und generell der mathematisch-präzisen Denkweise,
von einer intuitiven auf eine präzise und formale Stufe zu heben. Andererseits die auch beim Entwurf von Softwaresystemen enorm wichtig ist. Zudem wer-
erlaubt die Mathematik, wenn Sachverhalte einmal präzise modelliert sind (z.B. den einige Anwendungen diskutiert, z.B. aus der Kryptografie, der Codierungs-
die Statik als Teil der Physik), konkrete Probleme zu lösen (z.B. eine Brücke zu theorie oder der Algorithmentheorie. Diskrete Mathematik ist ein sehr bre-
dimensionieren). ites Gebiet. Entsprechend unterschiedliche Ansätze gibt es auch für den Auf-
bau einer Vorlesung über das Thema. Mein Ziel bei der Konzipierung dieser
Welche mathematischen Disziplinen sind für die Computerwissenschaften
Lehrveranstaltung war es, speziell auf Themen einzugehen, die in der Infor-
(Informatik, Computer Science) speziell relevant? Was muss in der Informatik
matik wichtig sind, sowie dem Anspruch zu genügen, keine zentralen Themen
modelliert werden? Welche Art von Problemen möchte man verstehen und
der diskreten Mathematik auszulassen. Ausnahmen sind die Kombinatorik und
lösen können? Der gemeinsame Nenner der vielen möglichen Antworten ist,
die Graphentheorie, die früher als Kapitel 4 und 5 dieses Skriptes erschienen, in
dass es in der Informatik um diskrete, meist endliche Strukturen geht. Digi-
der letzten Studienplanrevision aber in andere Vorlesungen verschoben wur-
tale Computer haben einen endlichen Zustandsraum, d.h. der Zustand ist exakt
den.
beschreibbar als eine von endlich vielen Möglichkeiten. Zwei Zustände können
nicht, wie in der Physik, beliebig ähnlich sein. Es gibt nicht das Problem, dass Die sechs Kapitel sind
reellwertige Parameter (z.B. die Temperatur) nur approximativ gemessen wer-
den können. In der Informatik im engeren Sinn gibt es keine kontinuierlichen 1. Introduction and Motivation
Grössen.1 2. Mathematical Reasoning and Proofs
Das heisst natürlich nicht, dass sich die Informatik nicht mit Themen be- 3. Sets, Relations, and Functions
fasst, bei denen kontinuierliche Grössen wichtig sind. Die Informatik ist ja auch 4. Number Theory
eine Hilfswissenschaft, z.B. für die Naturwissenschaften, wobei die Grenzen 5. Algebra
zwischen der eigentlichen Wissenschaft und der Hilfswissenschaft in einigen
6. Logic
Bereichen verschwommener werden. In Bereichen wie Computational Biology
oder Computational Chemistry werden wesentliche Beiträge direkt von der In-
Viele Beispiele werden nur an der Tafel oder in den Übungen behandelt. Die
formatik beigesteuert. In diesen Bereichen der Informatik spielen reellwertig
Vorlesung und die Übungen bilden einen integralen Bestandteil der Lehrver-
parametrisierte Systeme eine wichtige Rolle.2
anstaltung und des Prüfungsstoffes. Es gibt kein einzelnes Buch, das den
ganzen Stoff der Lehrveranstaltung behandelt. Aber unten folgt eine Liste guter
1 Die Mathematik der Informatik sollte demnach einfacher verständlich sein als die kontinuier-
Bücher, die als Ergänzung dienen können. Sie decken aber jeweils nur Teile der
liche Mathematik (z.B. Analysis). Sollte dies dem Leser ab und zu nicht so erscheinen, so ist es Vorlesung ab, gehen zum Teil zu wenig tief, oder sind zu fortgeschritten im Ver-
vermutlich lediglich eine Frage der Gewöhnung.
2 Die Numerik befasst sich unter anderem mit dem Thema der (in einem Computer unvermeid- gleich zur Vorlesung.
baren) diskreten Approximation reeller Grössen und den daraus resultierenden Problemen wie z.B.
numerische Instabilitäten. • N. L. Biggs, Discrete Mathematics, Clarendon Press.
iv
• K. H. Rosen, Discrete Mathematics and its Applications, fourth edition,
McGraw-Hill.
• A. Steger, Diskrete Strukturen, Band 1, Springer Verlag.
• M. Aigner, Diskrete Mathematik, Vieweg.
• J. Matousek, J. Nesetril, Discrete Mathematics, Clarendon Press.
• I. Anderson, A First Course in Discrete Mathematics, Springer Verlag.
• U. Schöning, Logik für Informatiker, Spektrum Verlag, 5. Auflage, 2000.
• M. Kreuzer and S. Kühling, Logik für Informatiker, Pearson Studium,
2006.
v
2.3.6 Tautologies and Satisfiability . . . . . . . . . . . . . . . . . 21
2.3.7 Logical Circuits * . . . . . . . . . . . . . . . . . . . . . . . . 22
2.4 A First Introduction to Predicate Logic . . . . . . . . . . . . . . . . 23
2.4.1 Predicates . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.4.2 Functions and Constants . . . . . . . . . . . . . . . . . . . . 24
Contents 2.4.3
2.4.4
The Quantifiers ∃ and ∀ . . . . . . . . . . . . . . . . . . . .
Nested Quantifiers . . . . . . . . . . . . . . . . . . . . . . .
24
25
2.4.5 Interpretation of Formulas . . . . . . . . . . . . . . . . . . . 26
2.4.6 Tautologies and Satisfiability . . . . . . . . . . . . . . . . . 26
2.4.7 Equivalence and Logical Consequence . . . . . . . . . . . . 27
Vorwort iii
2.4.8 Some Useful Rules . . . . . . . . . . . . . . . . . . . . . . . 27
1 Introduction and Motivation 1 2.5 Logical Formulas vs. Mathematical Statements . . . . . . . . . . . 28
1.1 Discrete Mathematics and Computer Science . . . . . . . . . . . . 1 2.5.1 Fixed Interpretations and Formulas as Statements . . . . . 28
1.2 Discrete Mathematics: A Selection of Teasers . . . . . . . . . . . . 2 2.5.2 Mathematical Statements about Formulas . . . . . . . . . . 29
1.3 Abstraction: Simplicity and Generality . . . . . . . . . . . . . . . . 4 2.6 Some Proof Patterns . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.6.1 Composition of Implications . . . . . . . . . . . . . . . . . 29
2 Math. Reasoning, Proofs, and a First Approach to Logic 7 2.6.2 Direct Proof of an Implication . . . . . . . . . . . . . . . . . 30
2.1 Mathematical Statements . . . . . . . . . . . . . . . . . . . . . . . . 7 2.6.3 Indirect Proof of an Implication . . . . . . . . . . . . . . . . 30
2.1.1 The Concept of a Mathematical Statement . . . . . . . . . . 7 2.6.4 Modus Ponens . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.1.2 Composition of Mathematical Statements . . . . . . . . . . 8 2.6.5 Case Distinction . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.1.3 Mathematical Statements in Computer Science . . . . . . . 9 2.6.6 Proofs by Contradiction . . . . . . . . . . . . . . . . . . . . 32
2.2 The Concept of a Proof . . . . . . . . . . . . . . . . . . . . . . . . . 10 2.6.7 Existence Proofs . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.2.1 Examples of Proofs . . . . . . . . . . . . . . . . . . . . . . . 10 2.6.8 Existence Proofs via the Pigeonhole Principle . . . . . . . . 34
2.2.2 Examples of False Proofs . . . . . . . . . . . . . . . . . . . 11 2.6.9 Proofs by Counterexample . . . . . . . . . . . . . . . . . . 36
2.2.3 Two Meanings of the Symbol =⇒ . . . . . . . . . . . . . . . 12 2.6.10 Proofs by Induction . . . . . . . . . . . . . . . . . . . . . . 36
2.2.4 Proofs Using Several Implications . . . . . . . . . . . . . . 12
2.2.5 An Informal Understanding of the Proof Concept . . . . . 13 3 Sets, Relations, and Functions 38
2.2.6 Informal vs. Formal Proofs . . . . . . . . . . . . . . . . . . 13 3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
2.2.7 The Role of Logic . . . . . . . . . . . . . . . . . . . . . . . . 14 3.1.1 An Intuitive Understanding of Sets . . . . . . . . . . . . . 38
2.2.8 Proofs in this Course . . . . . . . . . . . . . . . . . . . . . . 15 3.1.2 Russell’s Paradox . . . . . . . . . . . . . . . . . . . . . . . . 39
2.3 A First Introduction to Propositional Logic . . . . . . . . . . . . . 15 3.2 Sets and Operations on Sets . . . . . . . . . . . . . . . . . . . . . . 40
2.3.1 Logical Constants, Operators, and Formulas . . . . . . . . 16 3.2.1 The Set Concept . . . . . . . . . . . . . . . . . . . . . . . . . 40
2.3.2 Formulas as Functions . . . . . . . . . . . . . . . . . . . . . 18 3.2.2 Set Equality and Constructing Sets From Sets . . . . . . . . 41
2.3.3 Logical Equivalence and some Basic Laws . . . . . . . . . 18 3.2.3 Subsets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
2.3.4 Logical Consequence (for Propositional Logic) . . . . . . . 20 3.2.4 Union and Intersection . . . . . . . . . . . . . . . . . . . . . 43
2.3.5 Lifting Equivalences and Consequences to Formulas . . . 21 3.2.5 The Empty Set . . . . . . . . . . . . . . . . . . . . . . . . . . 44
vii viii
3.2.6 Constructing Sets from the Empty Set . . . . . . . . . . . . 45 4.2.2 Division with Remainders . . . . . . . . . . . . . . . . . . . 72
3.2.7 A Construction of the Natural Numbers . . . . . . . . . . . 46 4.2.3 Greatest Common Divisors . . . . . . . . . . . . . . . . . . 73
3.2.8 The Power Set of a Set . . . . . . . . . . . . . . . . . . . . . 47 4.2.4 Least Common Multiples . . . . . . . . . . . . . . . . . . . 75
3.2.9 The Cartesian Product of Sets . . . . . . . . . . . . . . . . . 47 4.3 Factorization into Primes . . . . . . . . . . . . . . . . . . . . . . . . 75
3.3 Relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48 4.3.1 Primes and the Fundamental Theorem of Arithmetic . . . 75
3.3.1 The Relation Concept . . . . . . . . . . . . . . . . . . . . . 48 4.3.2 Proof of the Fundamental Theorem of Arithmetic * . . . . 76
3.3.2 Representations of Relations . . . . . . . . . . . . . . . . . 49 4.3.3 Expressing gcd and lcm . . . . . . . . . . . . . . . . . . . . 77
3.3.3 Set Operations on Relations . . . . . . . . . . . . . . . . . . 50 4.3.4 Non-triviality of Unique Factorization * . . . . . . . . . . . 77
3.3.4 The Inverse of a Relation . . . . . . . . . . . . . . . . . . . . 50 4.3.5 Irrationality of Roots * . . . . . . . . . . . . . . . . . . . . . 78
3.3.5 Composition of Relations . . . . . . . . . . . . . . . . . . . 51 4.3.6 A Digression to Music Theory * . . . . . . . . . . . . . . . . 78
3.3.6 Special Properties of Relations . . . . . . . . . . . . . . . . 52 4.4 Some Basic Facts About Primes * . . . . . . . . . . . . . . . . . . . 79
3.3.7 Transitive Closure . . . . . . . . . . . . . . . . . . . . . . . 54 4.4.1 The Density of Primes . . . . . . . . . . . . . . . . . . . . . 79
3.4 Equivalence Relations . . . . . . . . . . . . . . . . . . . . . . . . . 54 4.4.2 Remarks on Primality Testing . . . . . . . . . . . . . . . . . 80
3.4.1 Definition of Equivalence Relations . . . . . . . . . . . . . 54 4.5 Congruences and Modular Arithmetic . . . . . . . . . . . . . . . . 81
3.4.2 Equivalence Classes Form a Partition . . . . . . . . . . . . 55 4.5.1 Modular Congruences . . . . . . . . . . . . . . . . . . . . . 81
3.4.3 Example: Definition of the Rational Numbers . . . . . . . 56 4.5.2 Modular Arithmetic . . . . . . . . . . . . . . . . . . . . . . 82
3.5 Partial Order Relations . . . . . . . . . . . . . . . . . . . . . . . . . 57 4.5.3 Multiplicative Inverses . . . . . . . . . . . . . . . . . . . . . 84
3.5.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 4.5.4 The Chinese Remainder Theorem . . . . . . . . . . . . . . 85
3.5.2 Hasse Diagrams . . . . . . . . . . . . . . . . . . . . . . . . . 58 4.6 Application: Diffie-Hellman Key-Agreement . . . . . . . . . . . . 86
3.5.3 Combinations of Posets and the Lexicographic Order . . . 59
5 Algebra 89
3.5.4 Special Elements in Posets . . . . . . . . . . . . . . . . . . . 60
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
3.5.5 Meet, Join, and Lattices . . . . . . . . . . . . . . . . . . . . 61
5.1.1 What Algebra is About . . . . . . . . . . . . . . . . . . . . . 89
3.6 Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
5.1.2 Algebraic Structures . . . . . . . . . . . . . . . . . . . . . . 89
3.7 Countable and Uncountable Sets . . . . . . . . . . . . . . . . . . . 64
5.1.3 Some Examples of Algebras . . . . . . . . . . . . . . . . . . 90
3.7.1 Countability of Sets . . . . . . . . . . . . . . . . . . . . . . . 64
5.2 Monoids and Groups . . . . . . . . . . . . . . . . . . . . . . . . . . 90
3.7.2 Between Finite and Countably Infinite . . . . . . . . . . . . 64
5.2.1 Neutral Elements . . . . . . . . . . . . . . . . . . . . . . . . 91
3.7.3 Important Countable Sets . . . . . . . . . . . . . . . . . . . 65
5.2.2 Associativity and Monoids . . . . . . . . . . . . . . . . . . 91
3.7.4 Uncountability of {0, 1}∞ . . . . . . . . . . . . . . . . . . . 67
5.2.3 Inverses and Groups . . . . . . . . . . . . . . . . . . . . . . 92
3.7.5 Existence of Uncomputable Functions . . . . . . . . . . . . 68
5.2.4 (Non-)minimality of the Group Axioms . . . . . . . . . . . 93
4 Number Theory 70 5.2.5 Some Examples of Groups . . . . . . . . . . . . . . . . . . . 94
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 5.3 The Structure of Groups . . . . . . . . . . . . . . . . . . . . . . . . 95
4.1.1 Number Theory as a Mathematical Discipline . . . . . . . 70 5.3.1 Direct Products of Groups . . . . . . . . . . . . . . . . . . . 95
4.1.2 What are the Integers? . . . . . . . . . . . . . . . . . . . . . 71 5.3.2 Group Homomorphisms . . . . . . . . . . . . . . . . . . . . 96
4.2 Divisors and Division . . . . . . . . . . . . . . . . . . . . . . . . . . 72 5.3.3 Subgroups . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
4.2.1 Divisors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72 5.3.4 The Order of Group Elements and of a Group . . . . . . . 97
ix x
xii
xi
1.2. Discrete Mathematics: A Selection of Teasers 2
the form of calculations (e.g. calculating the wing profile for an airplane).
In contrast, in Computer Science, mathematical reasoning often happens
in the form of a derivation (or, more mathematically stated, a proof). For
example, understanding a computer program means to understand it as
a well-defined discrete mathematical object, and making a desirable state-
ment about the program (e.g. that it terminates within a certain number
Chapter 1 of steps) means to prove (or derive) this statement. Similarly, the state-
ment that a system (e.g. a block-chain system) is secure is a mathematical
statement that requires a proof.
3. Mathematical derivations. Mathematical reasoning is essential in any en- for this condition, read as “k 2 is congruent to 1 modulo 3.” This condition is
gineering discipline, and especially in Computer Science. In many disci- equivalent to
plines (e.g.2 mechanical engineering), mathematical reasoning happens in k ≡3 1 or k ≡3 2.
1 We also refer to the preface to these lecture notes where the special role of mathematics for 3 The reader should not worry too much if he or she is not familiar with some of the concepts
Computer Science is mentioned. discussed in this section, for example the interpolation of a polynomial, computation modulo a
2 “e.g.”, the abbreviation of the Latin “exempli gratia” should be read as “for example”. number n, Euclid’s algorithm for computing greatest common divisors, or matrices.
1
3 Chapter 1. Introduction and Motivation 1.3. Abstraction: Simplicity and Generality 4
Hence we have P (k) = 0 for all k with k ≡3 0 (i.e.,4 the k divisible by 3).5 a(x)b(x) = a(x)c(x) imply b(x) = c(x) if a(x) 6= 0? Does it hold for the integers
The case k = 4 can be solved easily by finding a solution for each of the three modulo m, i.e., does ab ≡m ac imply b ≡m c if a 6= 0? Does it hold for the
types of squares (corner, edge, interior of board) that could be marked. Hence permutations, when multiplication is defined as composition of permutations?
we have proved P (4) = 1. This proof type will later be called a proof by case What does the condition a 6= 0 mean in this case? Which abstraction lies behind
distinction. the cancellation law? This is a typical algebraic question (see Chapter 5).
For the case k = 5 one can prove that P (5) = 0 by showing that there is (at Example 1.4. It is well-known that one can interpolate a polynomial a(x) =
least) a square which, when marked, leaves an area not coverable by L-shapes. ad xd + ad−1 xd−1 + · · · a1 x + a0 of degree d with real coefficients from any d + 1
Namely, if one marks a square next to the center square, then it is impossible to values a(αi ), for distinct α1 , . . . , αd+1 . Can we also construct polynomials over a
cover the remaining area by L-shapes. This proof type will later be called a proof finite domain (which is of more interest and use in Computer Science), keeping
by counterexample. this interpolation property?
We have P (6) = 0 because 6 is divisible by 3, and hence the next interesting For example, consider computation modulo 5. There are 53 = 125 polyno-
case is k = 7. The reader can prove as an exercise that P (7) = 1. (How many mials a2 x2 + a1 x + a0 of degree 2 because we can freely choose three coefficients
cases do you have to check?) from {0, 1, 2, 3, 4}. It is straight-forward (though cumbersome) to verify that if
The question of interest is, for a general k, whether P (k) = 1 or P (k) = 0. we fix any three evaluation points (for example 0, 2, and 3), then the polyno-
But one can prove (explained in the lecture) that mial is determined by the values at these points. In other words, two different
polynomials p and q result in distinct lists (p(0), p(2), p(3)) and (q(0), q(2), q(3))
P (k) = 1 =⇒ P (2k) = 1, of polynomial values. What is the general principle explaining this? For the
answer and applications, see Chapter 5.
i.e., that if the statement is true for some k, then it is also true for two times k.
This implies that P (2i ) = 1 for any i and also that P (7 · 2i ) = 1 for any i. Hence
we have P (8) = 1, and P (9) = 0, leaving P (10) and P (11) as the next open 1.3 Abstraction: Simplicity and Generality
cases. One can also prove the following generalization of the above-stated fact:
A main theme of this course is abstraction. In everyday life, the term “abstract”
P (k) = 1 and P (ℓ) = 1 =⇒ P (kℓ) = 1. has a negative meaning. It stands for non-intuitive and difficult-to-understand.
For us, abstraction will have precisely the opposite meaning. It will stand for
We point out that, already in this first example, we understand the reasoning simplicity and generality. I hope to be able to convey the joy and importance of
leading to the conclusion P (k) = 0 or P (k) = 1 as a proof. simplification and generalization by abstraction.
Example 1.2. Consider the following simple method for testing primality. Prove Indeed, abstraction is probably the most important principle in program-
or disprove that an odd number n is a prime if and only if 2n−1 divided by n ming and the design of information systems. Computers and computer pro-
yields remainder 1, i.e., if grams are highly (perhaps unimaginably) complex systems. For a computer sys-
2n−1 ≡n 1. tem with only 1000 bits of storage, the number 21000 of system states is greater
than the number of atoms in the known universe. The immense complexity
One can easily check that 2n−1 ≡n 1 holds for the primes n = 3, 5, 7, 11, 13 (and of software systems is usually grossly underestimated, resulting in potentially
many more). Moreover, one can also easily check that 2n−1 6≡n 1 for the first odd catastrophic software failures. For typical commercial software, failures are the
composite numbers n = 9, 15, 21, 25, etc. But is the formula a general primality rule rather than the exception.
test? The solution to this problem will be given in Chapter 4.
In order to manage the complexity, software systems are divided into com-
Example 1.3. The well-known cancellation law for real numbers states that if ponents (called modules, layers, objects, or abstract data types) that interact
ab = ac and a 6= 0, then b = c. In other words, one can divide both sides by with each other and with the environment in a well-defined manner. For ex-
a. How general is this law? Does it hold for the polynomials over R, i.e., does ample, the Internet communication software is divided into a number of lay-
4 “i.e.”,
ers, each with a dedicated set of tasks. The IP layer transports packets between
the abbreviation of the Latin “id est”, should be read as “that is” (German: “das heisst”).
5 Thefact that the equation k 2 ≡p 1 has two solutions modulo p, for any prime p, not just for
computers, and the TCP layer manages reliable connections. The potential com-
p = 3, will be obvious once we understand that computing modulo p is a field (see Chapter 5) and plexity of the interaction between subsystems is channeled into clearly specified
that every element of a field has either two square roots or none. interfaces. The behavior of a subsystem is described by a manageable number
5 Chapter 1. Introduction and Motivation 1.3. Abstraction: Simplicity and Generality 6
of rules. This is abstraction. Without abstraction, writing good software is im- replaced by the pair consisting of the smaller integer and the remainder of the
possible. division. This step is repeated until the remainder is 0. The greatest common
Abstraction means simplification. By an abstraction one ignores all aspects divisor is the last non-zero remainder.
of a system that are not relevant for the problem at hand, concentrating on the Essentially the same algorithm works for two polynomials a(x) and b(x), say
properties that matter. with integer (or real) coefficients, where the size of a polynomial is defined to
Abstraction also means generalization. If one proves a property of a system be its degree. In which sense are integer numbers and polynomials similar? At
described at an abstract level, then this property holds for any system with the which level of abstraction can they be seen as instantiations of the same abstract
same abstraction, independently of any details. concept? As we will see in Chapter 5, the answer is that they are both so-called
Euclidean domains, which is a special type of a so-called integral domain, which in
Example 1.5. A standard Swiss chocolate consists of 6 rows of 4 pieces each. We turn is a special case of a ring.
would like to break it into its 24 pieces using the minimal number of breaking
operations. The first breaking operation can break the chocolate in any of the
5 ways parallel to the short side, or in any of the 3 ways parallel to the long
side. Afterwards, a breaking operation consists of taking an arbitrary piece of
chocolate and breaking it along one of the marked lines. What is the minimal
number of breaking operations needed to break the chocolate into its 24 pieces?
Is it better to first break the chocolate into two equal pieces or to break off one
row? Is it better to first break along a short or a long line? Which abstraction
explains the answer? Find a similar problem with the same abstraction.
Example 1.6. Can the shape in Figure 1.1 be cut into 9 identical pieces? If not,
why? If yes, what is the abstraction that explains this? What would more gen-
eral examples with the same abstraction look like? Why would it be easier to
see the answer in such generalized examples?
a a
• 71 is a prime number.
• If p is a prime number, then 2p − 1 is also a prime number.
• Every natural number is the sum of at most four square numbers. (Exam-
ple: 22 = 42 + 22 + 12 + 12 and 74 = 62 + 52 + 32 + 22 .)
Chapter 2 • Every even natural number greater than 2 can be expressed as the sum of
two primes.3 For example, 108 = 37 + 71 and 162 = 73 + 89.
• Any n lines ℓ1 , . . . , ℓn in the plane, no two of which are parallel, intersect
in one point (see Example 2.4).
Mathematical Reasoning, • For the chess game there exists a winning strategy for the player making
the first move (playing “white”).
Proofs, and a First Approach The first statement is easily shown to be true. The second statement is false,
and this can be proved by giving a counter-example: 11 is prime but 211 − 1 =
to Logic 2047 = 23 ·89 is not prime.4 The third statement is true but by no means obvious
(and requires a sophisticated proof). The fourth statement is not known to be
true (or false). The fifth statement is false. The sixth statement is not known to
be true (or false).
Example 2.1. Consider the following statement which sounds like a statement
2.1 Mathematical Statements of interest in Computer Science: “There is no algorithm for factoring any n-
bit integer in n3 steps”. This is not a precise mathematical statement because
its truth (namely the complexity of the best algorithm) generally depends on
2.1.1 The Concept of a Mathematical Statement the particular computational model one is considering. A goal of Theoretical
Computer Science (TCS) is therefore to define precise models of computation
People make many statements in life, like “I love you”, “tomorrow it will rain”,
and the complexity of algorithms, allowing to make such claims precise.
“birds can fly”, or “Roger Federer is the best tennis player”. By making the
statement, the person making it intends to claim that it is true. However, most If one makes a statement, say S, for example in the context of these lecture
such statements are not sufficiently precise to be considered true or false, and notes, there can be two different meanings. The first meaning is that by stating
often they are subjective. This is in contrast to mathematical statements. it one claims that S is true, and the second meaning is simply to discuss the
statement itself (for example as an assumption), independently of whether it is
Definition 2.1. A mathematical statement (also called proposition) is a statement true or not. We should try to distinguish clearly between these two meanings.
that is true or false in an absolute, indisputable sense, according to the laws of
mathematics.
2.1.2 Composition of Mathematical Statements
We often simply say “statement” instead of “mathematical statement”. A
mathematical statement that is known to be true is often called a theorem, a A mathematical statement can be composed of several mathematical statements.
lemma, or a corollary.1 A mathematical statement not known to be true or false Consider the following examples:
is often called a conjecture or an assumption. Sometimes, before a proof of a true
mathematical statement is given, it is also called assertion2 or claim. Examples of • 4 is an even number and 71 is a prime number.
• 5 is an even number and 71 is a prime number.
1 The term “theorem” is usually used for an important result, whereas a lemma is an interme-
diate, often technical result, possibly used in several subsequent proofs. A corollary is a simple 3 This statement is called the Goldbach conjecture and is one of the oldest unproven conjectures
7
9 Chapter 2. Math. Reasoning, Proofs, and a First Approach to Logic 2.2. The Concept of a Proof 10
• 5 is an even number or 71 is a prime number. • Cryptocurrency system C operates correctly as long as a majority of the
involved nodes behave honestly, even if all the other nodes behave arbi-
The first statement is true because both statements “4 is an even number” and trarily maliciously.
“71 is a prime number” are true. In contrast, the second statement is false be- • Database system D provides data privacy (for a suitable definition of pri-
cause the statement “5 is an even number” is false. However, the third statement vacy).
is again true because “71 is a prime number” is true and hence it is irrelevant
whether “5 is an even number” is true or false. Programs, algorithms, encryption schemes, etc., are (complex) discrete
A slightly more involved combination of two statements S and T is implica- mathematical objects, and proving statements like those mentioned above is
tion, where in mathematics one often writes highly non-trivial. This course is not about programs or algorithms, let alone
encryption schemes, but it provides the foundations so that later courses can
S =⇒ T. reason mathematically about these objects.
It stands for “If S is true, then T is true”. One also says “S implies T ”. The
statement S =⇒ T is false if S is true and T is false, and in all other three cases
it is true. In other words, the first three statements below are true while the last
2.2 The Concept of a Proof
one is false.
The purpose of a proof is to demonstrate (or prove) a mathematical statement S.
• 4 is an even number =⇒ 71 is a prime number. In this section we informally discuss the notion of a proof. We also discuss
several proof strategies. In Chapter 6 about logic, the notion of a proof in a
• 5 is an even number =⇒ 71 is a prime number.
proof calculus will be formalized.
• 5 is an even number =⇒ 70 is a prime number.
• 4 is an even number =⇒ 70 is a prime number.
2.2.1 Examples of Proofs
We point out that S =⇒ T does not express any kind of causality like “because S
is true, T is also true” (but see the discussion in Section 2.2.3). We already gave examples of proofs in Chapter 1. We give one more simple
Similarly, S ⇐⇒ T means that S is true if and only if T is true. This can example.
equivalently be stated as “S implies T and T implies S.”
Example 2.2. Claim: The number n = 2143 − 1 is not a prime.
Proof: n is divisible by 2047, as one can check by a (for a computer) simple
2.1.3 Mathematical Statements in Computer Science calculation.
That this is true can even be easily seen without doing a calculation, by prov-
Many statements relevant in Computer Science are mathematical statements ing a more general claim of which the above one is a special case:
which one would like to prove. We give a few examples of such statements:
Claim: n is not prime =⇒ 2n − 1 is not prime.5
• Program P terminates (i.e., does not enter an infinite loop) for all inputs. Proof: If n is not a prime, then (by definition of prime numbers) n = ab with
• Program P terminates within k computation steps for all inputs. a > 1 and a < n. Now we observe that 2a − 1 divides 2ab − 1:
• Program P computes f (x) for every input x, where f is a function of in- b−1
X
terest. 2ab − 1 = (2a − 1) 2ia = (2a − 1) 2(b−1)a + 2(b−2)a + · · · + 2a + 1
• Algorithm A solves problem S within accuracy ǫ. i=0
• The error probability of file transmission system F in a file transmission is
at most p (where p can be a function of the file length). as can easily be verified by a simple calculation. Since 2a − 1 > 1 and 2a − 1 <
2ab − 1, i.e., 2a − 1 is a non-trivial divisor of 2ab − 1, this means that 2ab − 1 is not
• The computer network C has the property that if any t links are deleted,
a prime, concluding the proof of the claim.
every node is still connected with every other node.
• Encryption scheme E is secure (for a suitable definition of security). 5 It is understood that this statement is meant to hold for an arbitrary n.
11 Chapter 2. Math. Reasoning, Proofs, and a First Approach to Logic 2.2. The Concept of a Proof 12
Let us state a warning. Recall from the previous section that Something must be wrong! (What?) This example illustrates that proofs
must be designed with care. Heuristics and intuition, though essential in any
n is prime =⇒ 2n − 1 is prime engineering discipline as well as in mathematics, can sometimes be wrong.
is a false statement, even though it may appear at first sight to follow from the Example 2.5. In the lecture we present a “proof” for the statement 2 = 1.
above claim. However, we observe that if S =⇒ T is true, then generally it does
not follow that if S is false, then T is false. 2.2.3 Two Meanings of the Symbol =⇒
Example 2.3. An integer n is called a square if n = m · m for some integer m.
It is important to note that the symbol =⇒ is used in the mathematical literature
Prove that if a and b are squares, then so is a · b.
for two different (but related) things:
a and b are squares • to express composed statements of the form S =⇒ T (see Section 2.1.2),
.
=⇒ a = u · u and b = v · v for some u and v (def. of squares)
.
• to express a derivation step in a proof, as above.
=⇒ a · b = (u · u) · (v · v) (replace a by u · u and b by v · v)
. .
=⇒ a · b = (u · v) · (u · v). (commutative and associative laws) To make this explicit and avoid confusion, we use a slightly different symbol =⇒
.
.
=⇒ a · b is a square (def. of squares) for the second meaning.8 Hence S =⇒ T means that T can be obtained from S
by a proof step, and in this case we also know that the statement S =⇒ T is true.
However, conversely, if S =⇒ T is true for some statements S and T , there may
The above proof follows a standard pattern of proofs as a sequence of im- .
not exist a proof step demonstrating this, i.e. S =⇒ T may not hold.
.
plications, each step using the symbol =⇒. Such a proof step requires that the .
justification for doing the step is clear. Often one justifies the proof step either in An analogous comment applies to the symbol ⇐⇒, i.e., S ⇐⇒ T can be used
the accompanying text or as a remark on the same line as the implication state- express that T follows from S by a simply proof step, and also S follows from T
ment (as in the above proof). But even more often the justification for the step by a simply proof step.
is simply assumed to be understood from the context and not explicitly stated,
which can sometimes make proofs hard to follow or even ambiguous. 2.2.4 Proofs Using Several Implications
Example 2.3 showed a proof of a statement of the form S =⇒ T using a se-
2.2.2 Examples of False Proofs . . .
quence of several implications of the form S =⇒ S2 , S2 =⇒ S3 , S3 =⇒ S4 , and
.
S4 =⇒ T .
As a next motivating example, let us prove a quite surprising assertion.6
A proof based on several implications often has a more general form: The
Example 2.4. Claim: Any n lines ℓ1 , . . . , ℓn in the plane, no two of which are implications do not form a linear sequence but a more general configuration,
parallel, intersect in one point (i.e., have one point in common). where each implication can assume several of the already proved statements.
For example, one can imagine that in order to prove a given statement T , one
Proof: The proof proceeds by induction.7 The induction basis is the case n = 2:
starts with two (known to be) true statements S1 and S2 and then, for some
Any two non-parallel lines intersect in one point. The induction hypothesis is
statements S3 , . . . , S7 , proves the following six implications:
that any n lines intersect in one point. The induction step states that then this
must be true for any n+1 lines. The proof goes as follows. By the hypothesis, the .
• S1 =⇒ S3 ,
n lines ℓ1 , . . . , ℓn intersect in a point P . Similarly, the n lines ℓ1 , . . . , ℓn−1 , ℓn+1 .
intersect in a point Q. The line ℓ1 lies in both groups, so it contains both P and • S1 =⇒ S4 ,
.
Q. The same is true for line ℓn−1 . But ℓ1 and ℓn−1 intersect at a single point, • S2 =⇒ S5 ,
hence P = Q. This is the common point of all lines ℓ1 , . . . , ℓn+1 . .
• S3 and S5 =⇒ S6 ,
6 This .
8 This notation is not standard and only used in these lecture notes. The symbol =⇒ is intention-
example is taken from the book by Matousek and Nesetril.
7 Here
we assume some familiarity with proofs by induction; in Section 2.6.10 we discuss them ally chosen very close to the symbol =⇒ to allow someone not used to this to easily overlook the
in depth. difference.
13 Chapter 2. Math. Reasoning, Proofs, and a First Approach to Logic 2.2. The Concept of a Proof 14
.
• S1 and S4 =⇒ S7 , as well as • Proof complexity and automatic verification. Certain proofs in Computer Sci-
.
• S6 and S7 =⇒ T . ence, like proving the correctness of a safety-critical program or the se-
curity of an information system, are too complex to be carried out and
Example 2.6. In the lecture we demonstrate the proof of Example 2.2 in the verified “by hand”. A computer is required for the verification. A com-
above format, making every intermediate statement explicit. puter can only deal with rigorously formalized statements, not with semi-
precise common language, hence a formal proof is required.9
2.2.5 An Informal Understanding of the Proof Concept • Precision and deeper understanding. Informal proofs often hide subtle steps.
A formal proof requires the formalization of the arguments and can lead
There is a common informal understanding of what constitutes a proof of a to a deeper understanding (also for the author of the proof).
mathematical statement S. Informally, we could define a proof as follows:
Definition 2.2. (Informal.) A proof of a statement S is a sequence of simple, eas- There is a trade-off between mathematical rigor and an intuitive, easy-to-
ily verifiable, consecutive steps. The proof starts from a set of axioms (things read (for humans) treatment. In this course, our goal is to do precise mathemat-
postulated to be true) and known (previously proved) facts. Each step corre- ical reasoning, but at the same time we will try to strike a reasonable balance
sponds to the application of a derivation rule to a few already proven state- between formal rigor and intuition. In Chapters 3 to 5, our proofs will be in-
ments, resulting in a newly proved statement, until the final step results in S. formal, and the Chapter 6 on logic is devoted to understanding the notion of a
formal proof.
Concrete proofs vary in length and style according to A main problem in teaching mathematical proofs (for example in this course)
is that it is hard to define exactly when an informal proof is actually a valid
• which axioms and known facts one is assuming, proof. In most scientific communities there is a quite clear understanding of
• what is considered to be easy to verify, what constitutes a valid proof, but this understanding can vary from commu-
nity to community (e.g. from physics to Computer Science). A student must
• how much is made explicit and how much is only implicit in the
learn this culture over the years, starting in high school where proof strategies
proof text, and
.
like proofs by induction have probably been discussed. There is no quick and
• to what extent one uses mathematical symbols (like =⇒) as opposed to easy path to understanding exactly what constitutes a proof.
just writing text. The alternative to a relatively informal treatment would be to do everything
rigorously, in a formal system as discussed in Chapter 6, but this would proba-
2.2.6 Informal vs. Formal Proofs bly turn away most students and would for the most parts simply not be man-
ageable. A book that tries to teach discrete mathematics very rigorously is A
Most proofs taught in school, in textbooks, or in the scientific literature are in- logical approach to discrete math by Gries and Schneider.
tuitive but quite informal, often not making the axioms and the proof rules ex-
plicit. They are usually formulated in common language rather than in a rigor-
ous mathematical language. Such proofs can be considered completely correct 2.2.7 The Role of Logic
if the reasoning is clear. An informal proof is often easier to read than a pedantic
Logic is the mathematical discipline laying the foundations for rigorous mathe-
formal proof.
matical reasoning. Using logic, every mathematical statement as well as a proof
However, a proof, like every mathematical object, can be made rigorous and for it (if a proof exists) can, in principle, be formalized rigorously. As mentioned
formally precise. This is a major goal of logic (see Section 2.2.7 and Chapter 6). above, rigorous formalization, and hence logic, is especially important in Com-
There are at least three (related) reasons for using a more rigorous and formal puter Science where one sometimes wants to automate the process of proving
type of proof. or verifying certain statements like the correctness of a program.
• Prevention of errors. Errors are quite frequently found in the scientific lit- Some principle tasks of logic (see Chapter 6) are to answer the following
erature. Most errors can be fixed, but some can not. In contrast, a com- three questions:
pletely formal proof leaves no room for interpretation and hence allows to 9 A crucial issue is that the translation of an informal statement to a formal statement can be
1. What is a mathematical statement, i.e., in which language do we write 2.3.1 Logical Constants, Operators, and Formulas
statements?
2. What does it mean for a statement to be true? Definition 2.3. The logical values (constants) “true” and “false” are usually de-
noted as 1 and 0, respectively.10
3. What constitutes a proof for a statement from a given set of axioms?
One can define operations on logical values:
Logic (see Chapter 6) defines the syntax of a language for expressing statements
and the semantics of such a language, defining which statements are true and
Definition 2.4.
which are false. A logical calculus allows to express and verify proofs in a purely
syntactic fashion, for example by a computer. (i) The negation (logical NOT) of a propositional symbol A, denoted as ¬A, is
true if and only if A is false.
(ii) The conjunction (logical AND) of two propositional symbol A and B, de-
2.2.8 Proofs in this Course noted A ∧ B, is true if and only if both A and B are true.
(iii) The disjunction (logical OR) of two propositional symbols A and B, de-
As mentioned above, in the literature and also in this course we will see proofs noted A ∨ B, is true if and only if A or B (or both) are true.11
at different levels of detail. This may be a bit confusing for the reader, especially
in the context of an exam question asking for a proof. We will try to be always The logical operators are functions, where ¬ is a function {0, 1} → {0, 1} and
clear about the level of detail that is expected in an exercise or in the exam. For ∧ and ∨ are functions {0, 1} × {0, 1} → {0, 1}. These functions can be described
this purpose, we distinguish between the following three levels: by function tables, as follows:
• Proof sketch or proof idea: The non-obvious ideas used in the proof are A B A∧B A B A∨B
described, but the proof is not spelled out in detail with explicit reference A ¬A 0 0 0 0 0 0
to all definitions that are used. 0 1 0 1 0 0 1 1
1 0 1 0 0 1 0 1
• Complete proof: The use of every definition is made explicit. Every proof
step is justified by stating the rule or the definition that is applied. 1 1 1 1 1 1
• Formal proof: The proof is entirely phrased in a given proof calculus. Logical operators can also be combined, in the usual way of combining func-
tions. For example, the formula
Proof sketches are often used when the proof requires some clever ideas and
the main point of a task or example is to describe these ideas and how they fit A ∨ (B ∧ C)
together. Complete proofs are usually used when one systematically applies the
definitions and certain logical proof patterns, for example in our treatments of has function table
relations and of algebra. Proofs in the resolution calculus in Chapter 6 can be A B C A ∨ (B ∧ C)
considered to be formal proofs. 0 0 0 0
0 0 1 0
0 1 0 0
2.3 A First Introduction to Propositional Logic 0 1 1 1
1 0 0 1
We give a brief introduction to some elementary concepts of logic. We point out 1 0 1 1
that this section is somewhat informal and that in the chapter on logic (Chap- 1 1 0 1
ter 6) we will be more rigorous. In particular, we will there distinguish between 1 1 1 1
the syntax of the language for describing mathematical statements (called for- 10 These values 1 and 0 are not meant to be the corresponding numbers, even though the same
mulas) and the semantics, i.e., the definition of the meaning (or validity) of a symbols are used.
formula. In this section, the boundary between syntax and semantics is (inten- 11 Sometimes ¬A, A ∧ B, and A ∨ B are also denoted as NOT(A), A AND B, and A OR B, respec-
A slightly more complicated example is (A ∧ (¬B)) ∨ (B ∧ (¬C)) with function Note that A ↔ B is equivalent to (A → B) ∧ (B → A) in the sense that the two
table formulas have the same function table.
A B C (A ∧ (¬B)) ∨ (B ∧ (¬C)) We now discuss a few notational simplifications. We have already seen that
0 0 0 0 parentheses can sometimes be dropped in a formula without changing its mean-
0 0 1 0 ing. For example we can write A ∨ B ∨ C instead of A ∨ (B ∨ C) or (A ∨ B) ∨ C.
0 1 0 1
There are also precedence rules for logical operators which allow to simplify
0 1 1 0
the notation, in the same sense as in algebra one can write ab + c rather than
1 0 0 1
(a · b) + c because · binds stronger than +. However, to keep things simple and
1 0 1 1
avoid confusion, we will generally not make use of such rules, except that we
1 1 0 1
adopt the convention that ¬ binds stronger than anything else. For example,
1 1 1 0
we can write ¬A ∧ B instead of (¬A) ∧ B, or we can write A → ¬B instead of
A → (¬B).
Definition 2.5. A correctly formed expression involving the propositional sym-
bols A, B, C, . . . and logical operators is called a formula (of propositional logic).
2.3.2 Formulas as Functions
We introduce a new, logical operator, implication, denoted as A → B and
defined by the function table An arithmetic expression such as (a+b)·c can be understood as a function. If we
consider as domain the natural numbers N, the arithmetic expression (a + b) · c
A B A→B corresponds to the function N3 → N assigning to every triple (a, b, c) the value
0 0 1 (a + b) · c, for example the value 42 to the triple (4, 2, 7) (because (4 + 2) · 7 = 42).
0 1 1 Analogously, a logical formula such as (A ∨ B) ∧ C can be interpreted as a
1 0 0 function from the set of truth assignments for the proposition symbols A, B, and
1 1 1 C to truth values, i.e., as a function {0, 1}3 → {0, 1}. For example, the function
evaluates to 1 for A = 0, B = 1, and C = 1.
Note that A → B is true if and only if A implies B. This means that when A Since in propositional logic12 the domain is finite, a function can be com-
is true, then also B is true. Note that A → B is false if and only if A is true and pletely characterized by a function table. For example, the function table of the
B is false, or, stated differently, if B is false despite that A is true. A → B can be function {0, 1}3 → {0, 1} corresponding to the formula (A ∧ (¬B)) ∨ (B ∧ (¬C))
understood as an alternative notation for ¬A ∨ B, which has the same function is shown in the previous section.
table.
Example 2.7. Consider the following sentence: If student X reads the lecture 2.3.3 Logical Equivalence and some Basic Laws
notes every week and does the exercises (A), then student X will get a good
grade in the exam (B). This is an example of an implication A → B. Saying that Different arithmetic expressions can correspond to the same function. For ex-
A → B is true does not mean that A is true and it is not excluded that B is true ample, the expressions (a + b) · c and (c · a) + (b · c) denote the same functions.
even if A is false, but it is excluded that B is false when A is true. Let’s hope the Analogously, different logical formulas can correspond to the same function.
statement A → B is true for you : -) .
Definition 2.6. Two formulas F and G (in propositional logic) are called equiva-
Two-sided implication, denoted A ↔ B, is defined as follows: lent, denoted as F ≡ G, if they correspond to the same function, i.e., if the truth
values are equal for all truth assignments to the propositional symbols appear-
A B A↔B ing in F or G.
0 0 1
0 1 0 For example, it is easy to see that ∧ and ∨ are commutative and associative,
1 0 0
12 but
1 1 1 not for other logics such as predicate logic
19 Chapter 2. Math. Reasoning, Proofs, and a First Approach to Logic 2.3. A First Introduction to Propositional Logic 20
which also means that A ∨ B ≡ ¬(¬A ∧ ¬B). In fact, ¬ and ∧ are sufficient to Definition 2.7. A formula G is a logical consequence13 of a formula F , denoted
express every logical function (of propositional logic). Similarly we have
F |= G,
¬(A ∧ B) ≡ ¬A ∨ ¬B. if for all truth assignments to the propositional symbols appearing in F or G,
the truth value of G is 1 if the truth value of F is 1.
Example 2.8. A ↔ B ≡ (A → B) ∧ (B → A) ≡ (A ∧ B) ∨ (¬A ∧ ¬B).
Intuitively, if we would interpret the truth values 0 and 1 as the numbers 0
Example 2.9. Here is a more complicated example which the reader can verify and 1 (which we don’t!), then F |= G would mean F ≤ G (as functions).
as an exercise:
Example 2.11. A ∧ B |= A ∨ B.
(A ∧ (¬B)) ∨ (B ∧ (¬C)) ≡ (A ∨ B) ∧ ¬(B ∧ C). Example 2.12. Comparing the truth tables of the two formulas (A ∧ B) ∨ (A ∧ C)
and ¬B → (A ∨ C) one can verify that
The following example shows a distributive law for ∧ and ∨. Such laws will
be discussed more systematically in Chapter 6. (A ∧ B) ∨ (A ∧ C) |= ¬B → (A ∨ C).
Example 2.10. (A ∧ B) ∨ C ≡ (A ∨ C) ∧ (B ∨ C). Note that the two formulas are not equivalent.
We summarize the basic equivalences of propositional logic: 13 German: (logische) Folgerung, logische Konsequenz
21 Chapter 2. Math. Reasoning, Proofs, and a First Approach to Logic 2.3. A First Introduction to Propositional Logic 22
Example 2.13. The following logical consequence, which the reader can prove Example 2.15. The formulas A∨(¬A) and (A∧(A → B)) → B are tautologies.
as an exercise, captures a fact intuitively known to us, namely that implication One often wants to make statements of the form that some formula F is a
is transitive:14 tautology. As stated in Definition 2.8, one also says “F is valid” instead of “F is
(A → B) ∧ (B → C) |= A → C. a tautology”.
We point out (see also Chapter 6) that two formulas F and G are equivalent
if and only if each one is a logical consequence of the other, i.e.,15 Definition 2.9. A formula F (in propositional logic) is called satisfiable18 if it is
true for at least one truth assignment of the involved propositional symbols, and
F ≡G ⇐⇒ F |= G and G |= F. it is called unsatisfiable otherwise.
2.3.5 Lifting Equivalences and Consequences to Formulas The symbol ⊤ is sometimes used to denote a tautology, and the symbol ⊥
is sometimes used to denote an unsatisfiable formula. One sometimes writes
Logical equivalences and consequences continue to hold if the propositional F ≡ ⊤ to say that F is a tautology, and F ≡ ⊥ to say that F is unsatisfiable. For
symbols A, B, C . . . are replaced by other propositional symbols or by formulas example, for any formula F we have
F, G, H . . .. At this point, we do not provide a proof of this intuitive fact. For
F ∨ ¬F ≡ ⊤ and F ∧ ¬F ≡ ⊥.
example, because of the logical consequences stated in the previous section we
have Example 2.16. The formula (A ∧ ¬A) ∧ (B ∨ C) is unsatisfiable, and the formula
F ∧G ≡ G∧F and F ∨G ≡ G∨F A ∧ B is satisfiable.
as well as
F ∧ (G ∧ H) ≡ (F ∧ G) ∧ H The following lemmas state two simple facts that follow immediately from
the definitions. We only prove the second one.
for any formulas F , G, and H.
The described lifting is analogous to the case of arithmetic expressions. For Lemma 2.2. A formula F is a tautology if and only if ¬F is unsatisfiable.
example, we have
(a + b) · c = (a · c) + (b · c)
for any real numbers a, b, and c. Therefore, for any arithmetic expressions f , g, Lemma 2.3. For any formulas F and G, F → G is a tautology if and only if F |= G.
and h, we have
(f + g) · h = (f · h) + (g · h).
Proof. The lemma has two directions which we need to prove. To prove the
Example 2.14. We give a more complex example of such a lifting. Because of first direction (=⇒), assume that F → G is a tautology. Then, for any truth
the logical consequence stated in Example 2.13, we have assignment to the propositional symbols, the truth values of F and G are either
(F → G) ∧ (G → H) |= F → H both 0, or 0 and 1, or both 1 (but not 1 and 0). In each of the three cases it
holds that G is true if F is true, i.e., F |= G. To prove the other direction (⇐=),
for any formulas F , G, and H.
assume F |= G. This means that for any truth assignment to the propositional
symbols, the truth values of G is 1 if it is 1 for F . In other words, there is no
2.3.6 Tautologies and Satisfiability truth assignment such that the truth value of F is 1 and that of G is 0. This
means that the truth value of F → G is always 1, which means that F → G is a
tautology.
Definition 2.8. A formula F (in propositional logic) is called a tautology16 or
valid17 if it is true for all truth assignments of the involved propositional sym- 2.3.7 Logical Circuits *
bols. One often writes |= F to say that F is a tautology.
A logical formula as discussed above can be represented as a tree where the leaves correspond
to the propositions and each node corresponds to a logical operator. Such a tree
can be implemented as a digital circuit where the operators correspond to the logical
gates. This topic will be discussed in a course on the design of digital circuits¹⁹. The two
main components of digital circuits in computers are such logical circuits and memory
cells.

2.4 A First Introduction to Predicate Logic

The elements of logic we have discussed so far belong to the realm of so-called
propositional logic²⁰. Propositional logic is not sufficiently expressive to capture
most statements of interest in mathematics in terms of formulas. For example,
the statement “There are infinitely many prime numbers” cannot be expressed as
a formula in propositional logic (though it can of course be expressed as a sentence
in common language). We need quantifiers²¹, predicates, and functions. The
corresponding extension of propositional logic is called predicate logic²² and is
substantially more involved than propositional logic. Again, we refer to Chapter 6
for a more thorough discussion.

2.4.1 Predicates

Let us consider a non-empty set U as the universe in which we want to reason.
For example, U could be the set N of natural numbers, the set R of real numbers,
the set {0, 1}∗ of finite-length bit-strings, or a finite set like {0, 1, 2, 3, 4, 5, 6}.

Definition 2.10. A k-ary predicate²³ P on U is a function Uᵏ → {0, 1}.

A k-ary predicate P assigns to each list (x₁, . . . , xₖ) of k elements of U the
value P(x₁, . . . , xₖ) which is either true (1) or false (0).
For example, for U = N we can consider the unary (k = 1) predicate prime(x)
defined by prime(x) = 1 if x is prime, and prime(x) = 0 else.
Similarly, one can naturally define the unary predicates even(x) and odd(x).
For any universe U with an order relation ≤ (e.g. U = N or U = R), the
binary (i.e., k = 2) predicate less(x, y) can be defined as less(x, y) = 1 if x < y,
and less(x, y) = 0 else.

14 The term “transitive” will be discussed in Chapter 3.
15 Note that we DO NOT write F |= G ∧ G |= F because the symbol ∧ is used only between two
formulas in order to form a new (combined) formula, and F |= G and G |= F are not formulas.
16 German: Tautologie
17 German: allgemeingültig
18 German: erfüllbar
19 German: Digitaltechnik
20 German: Aussagenlogik
21 German: Quantoren
22 German: Prädikatenlogik
23 German: Prädikat
24 In the literature one also finds the notations ∀x: P(x) and ∀x. P(x) instead of ∀x P(x), and
similarly for ∃.
25 But note that ∀x (x ≥ 0) is false for the universe U = R.

However, in many cases we write binary predicates in a so-called “infix” notation,
i.e., we simply write x < y instead of less(x, y).
For the universe of all human beings, we can define a binary predicate child
as follows: child(x, y) = 1 if and only if x is y’s child. One can similarly define
predicates parent, grandparent, etc.

2.4.2 Functions and Constants

In predicate logic one can also use functions on U and constants (i.e., fixed elements)
in U. For example, if the universe is U = N, we can use the functions
add (addition) and mult (multiplication) and the constants 3 and 5. The formula

less(add(x, 3), add(x, 5))

can also be written in infix notation as

x + 3 < x + 5.

This is a true statement for every value x in U. In the next section we see how
we can express this as a formula.

2.4.3 The Quantifiers ∃ and ∀

Definition 2.11. For a universe U and predicate P(x) we define the following
logical statements:²⁴

∀x P(x) stands for: P(x) is true for all x in U.

∃x P(x) stands for: P(x) is true for some x in U, i.e.,
there exists an x ∈ U for which P(x) is true.

More generally, for a formula F with a variable x, which for each value x ∈ U is
either true or false, the formula ∀x F is true if and only if F is true for all x in
U, and the formula ∃x F is true if and only if F is true for some x in U.

Example 2.17. Consider the universe U = N. Then ∀x (x ≥ 0) is true.²⁵ Also,
∀x (x ≥ 2) is false, and ∃x (x + 5 = 3) is false.

The name of the variable x is irrelevant. For example, the formula ∃x (x + 5 =
3) is equivalent to the formula ∃y (y + 5 = 3). The formula could be stated in
words as: “There exists a natural number (let us call it y) which, if 5 is added
to it, the result is 3.” How the number is called, x or y or z, is irrelevant for the Example 2.21. The statement “for every natural number there is a larger prime”
truth or falsity of the statement. (Of course the statement is false; it would be can be phrased as
true if the universe were the integers Z.) ∀x ∃y y > x ∧ prime(y)
Sometimes one wants to state only that a certain formula containing x is and means that there is no largest prime and hence that there are infinitely many
true for all x that satisfy a certain condition. For example, to state that x2 ≥ 25 primes.
whenever x ≥ 5, one can write
If the universe is N, then one sometimes uses m, n, or k instead of x and y.
∀x (x ≥ 5) → (x2 ≥ 25) . The above formula could hence equivalently be written as
A different notation sometimes used to express the same statement is to state ∀m ∃n n > m ∧ prime(n) .
the condition on x directly after the quantifier, followed by “:”26
Example 2.22. Let U = R. What is the meaning of the following formula, and
∀x ≥ 5 : (x2 ≥ 25). does it correspond to a true statement?
∀x x = 0 ∨ ∃y (xy = 1)
2.4.4 Nested Quantifiers
Example 2.23. What is the meaning of the following formula, and for which
Quantifiers can also be nested27 . For example, if P (x) and Q(x, y) are predicates, universes is it true (or false)?
then
∀x P (x) ∨ ∃y Q(x, y) ∀x ∀y (x < y) → ∃z((x < z) ∧ (z < y))
is a logical formula.
2.4.5 Interpretation of Formulas
Example 2.18. The formula
∀x ∃y (y < x) A formula generally has some “free parts” that are left open for interpretation.
To begin with, the universe is often not fixed, but it is also quite common to
states that for every x there is a smaller y. In other words, it states that there is write formulas when the universe is understood and fixed. Next, we observe
no smallest x (in the universe under consideration). This formula is true for the that the formula
universe of the integers or the universe of the real numbers, but it is false for the
∀x P (x) → Q(x)
universe U = N.
contains the predicate symbols P and Q which can be interpreted in different
Example 2.19. For the universe of the natural numbers, U = N, the predicate ways. Depending on the choice of universe and on the interpretation of P and
prime(x) can be defined as follows:28 Q, the formula can either be true or false. For example let the universe be N and
let P (x) mean that “x is divisible by 4”. Now, if Q(x) is interpreted as “x is odd”,
prime(x) ⇐⇒ x > 1 ∧ ∀y ∀z ((yz = x) → ((y = 1) ∨ (z = 1))).
then ∀x (P(x) → Q(x)) is false, but if Q(x) is interpreted as “x is even”, then
∀x (P (x) → Q(x)) is true. However, the precise definition of an interpretation
Example 2.20. Fermat’s last theorem can be stated as follows: For universe is quite involved and deferred to Chapter 6.
N \ {0},29
¬ ∃x ∃y ∃z ∃n (n ≥ 3 ∧ xⁿ + yⁿ = zⁿ).
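For a bounded part of the universe, the nested-quantifier characterization of prime(x) in Example 2.19 above can be checked mechanically by exhaustive search. The following small Python sketch is only an illustration under that assumption (the quantifiers over all of N cannot be evaluated exhaustively, so they are bounded here, which is sound because any divisor pair of x lies between 0 and x).

# Minimal sketch (illustration only): checking the formula of Example 2.19,
#   prime(x)  <=>  x > 1  and  for all y, z: (y*z = x) -> (y = 1 or z = 1),
# with the quantified variables y, z restricted to the range 0..x.

def prime_by_formula(x: int) -> bool:
    # ∀y ∀z ((y·z = x) → (y = 1 ∨ z = 1)), evaluated by exhaustive search
    return x > 1 and all(
        (y * z != x) or (y == 1 or z == 1)
        for y in range(x + 1)
        for z in range(x + 1)
    )

def prime_naive(x: int) -> bool:
    # the usual definition, for comparison
    return x > 1 and all(x % d != 0 for d in range(2, x))

assert all(prime_by_formula(x) == prime_naive(x) for x in range(50))
print([x for x in range(50) if prime_by_formula(x)])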
2.4.6 Tautologies and Satisfiability
26 We don’t do this.
27 German: verschachtelt
The concepts interpretation, tautology, and satisfiability for predicate logic will
be defined in Chapter 6.
28 We use the symbol “⇐⇒” (with “def” written above it) if the object on the left side is defined as
being equivalent to the object on the right side.
29 In formulas with sequences of quantifiers of the same type one sometimes omits parentheses or Informally, a formula is satisfiable if there is an interpretation of the involved
even multiple copies of the quantifier. For example one writes ∃xyz instead of ∃x ∃y ∃z. We will symbols that makes the formula true. Hence ∀x (P (x) → Q(x)) is satisfiable
not use such a convention in this course. as shown above. Moreover, a formula is a tautology (or valid) if it is true for all
interpretations, i.e., for all choices of the universe and for all interpretations of since if P (x) is true for all x and also Q(x) is true for all x, then P (x) ∧ Q(x) is
the predicates.30 true for all x, and vice versa. Also,31
We will use the terms “tautology” and “valid” interchangeably. For example,
∃x P (x) ∧ Q(x) |= ∃x P (x) ∧ ∃x Q(x)
∀x (P (x) ∧ Q(x)) → (P (x) ∨ Q(x)) since, no matter what P and Q actually mean, any x that makes P (x)∧Q(x) true
(according to the left side) also makes P (x) and Q(x) individually true. But, in
is a tautology, or valid. contrast, ∃x (P (x) ∧ Q(x)) is not a logical consequence of ∃x P (x) ∧ ∃x Q(x), as
the reader can verify. We can write
2.4.7 Equivalence and Logical Consequence ∃x P (x) ∧ ∃x Q(x) 6|= ∃x (P (x) ∧ Q(x)).
One can define the equivalence of formulas and logical consequence for predi- We also have:
cate logic analogously to propositional logic, but again the precise definition is ¬∀x P (x) ≡ ∃x ¬P (x)
quite involved and deferred to Chapter 6. Intuitively, two formulas are equiva- and
lent if they evaluate to the same truth value for any interpretation of the symbols ¬∃x P (x) ≡ ∀x ¬P (x).
in the formula.
Example 2.24. Recall Example 2.22. The formula can be written in an equivalent
form, as:
∀x (x = 0 ∨ ∃y (xy = 1)) ≡ ∀x (¬(x = 0) → ∃y (xy = 1)).
The reader can prove as an exercise that
∃y ∀x P(x, y) |= ∀x ∃y P(x, y)
but that
∀x ∃y P(x, y) ⊭ ∃y ∀x P(x, y).
The order of identical quantifiers does not matter, i.e., we have for example:
2.5 Logical Formulas vs. Mathematical Statements
∃x∃y P (x, y) ≡ ∃y∃x P (x, y) and ∀x∀y P (x, y) ≡ ∀y∀x P (x, y).
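For a fixed finite universe and fixed predicates, the rules stated above can be spot-checked by exhaustive evaluation. The Python sketch below is only an illustration: the universe U and the predicates P and Q are assumptions chosen for the example, not taken from the script.

# Minimal sketch (illustration only): evaluating quantified statements over a
# small finite universe. U, P, Q below are arbitrary choices for illustration.

U = range(10)
P = lambda x: x % 2 == 0          # "x is even"
Q = lambda x: x % 2 == 1          # "x is odd"

# ¬∀x P(x) ≡ ∃x ¬P(x)   and   ¬∃x P(x) ≡ ∀x ¬P(x)
assert (not all(P(x) for x in U)) == any(not P(x) for x in U)
assert (not any(P(x) for x in U)) == all(not P(x) for x in U)

# ∃x (P(x) ∧ Q(x)) |= ∃x P(x) ∧ ∃x Q(x): whenever the left side is true for
# this interpretation, so is the right side.
left  = any(P(x) and Q(x) for x in U)
right = any(P(x) for x in U) and any(Q(x) for x in U)
assert (not left) or right

# The converse is not a logical consequence: for this interpretation the right
# side is true while the left side is false (a counterexample interpretation).
print(right, left)   # True False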
A logical formula is generally not a mathematical statement because the symbols
A simple example of a logical consequence is in it can be interpreted differently, and depending on the interpretation, the
formula is true or false. Without fixing an interpretation, the formula is not a
∀x P (x) |= ∃x P (x). mathematical statement.
It holds because if P (x) is true for all x in the universe, then it is also true for
some (actually an arbitrary) x. (Recall that the universe is non-empty.) 2.5.1 Fixed Interpretations and Formulas as Statements
Some more involved examples of equivalences and logical consequences are
stated in the next section. If for a formula F the interpretation (including the universe and the meaning
of the predicate and function symbols) is fixed, then this can be a mathematical
statement that is either true or false. Therefore, if an interpretation is under-
2.4.8 Some Useful Rules stood, we can use formulas as mathematical statements, for example in a proof
with implication steps. In this case (but only if a fixed interpretation is under-
We list a few useful rules for predicate logic. This will be discussed in more stood) it is also meaningful to say that a formula is true or that it is false.
detail in Chapter 6. We have
Example 2.25. For the universe N and the usual interpretation of < and >, the
formula ∃n (n < 4 ∧ n > 5) is false and the formula ∀n (n > 0 → (∃m m < n))
∀x P (x) ∧ ∀x Q(x) ≡ ∀x P (x) ∧ Q(x)
is true.
30 We will see in Chapter 6 that predicate logic also involves function symbols, and an interpreta- 31 We point out that defining logical consequence for predicate logic is quite involved (see Chap-
tion also instantiates the function symbols by concrete functions. ter 6), but intuitively it should be quite clear.
2.5.2 Mathematical Statements about Formulas The soundness of this principle is explained by the following lemma of
propositional logic which was already stated in Example 2.13.
As mentioned, logical formulas are often not mathematical statements them-
selves, but one makes mathematical statements about formulas. Examples of Lemma 2.5. (A → B) ∧ (B → C) |= A → C.
such mathematical statements are:
Proof. One writes down the truth tables of the formulas (A → B) ∧ (B → C)
• F is valid (i.e., a tautology, also written as |= F ), and A → C and checks that whenever the first evaluates to true, then also the
• F is unsatisfiable, second evaluates to true.
• F |= G.
as a mathematical statement about the formulas F and G. This statement is Definition 2.13. A direct proof of an implication S =⇒ T works by assuming S
different from the statement F |= G. In fact, for any formulas F and G, the and then proving T under this assumption.
statement F |= G implies statement (2.1), but the converse is generally false:
Lemma 2.4. For any two formulas F and G, if F |= G, then (2.1) is true.
2.6.3 Indirect Proof of an Implication
Proof. F |= G states that for every interpretation, if F is true (for that interpre-
tation), then also G is true (for that interpretation). Therefore, if F is true for Definition 2.14. An indirect proof of an implication S =⇒ T proceeds by assum-
every interpretation, then also G is true for every interpretation, which is state- ing that T is false and proving that S is false, under this assumption.
ment (2.1).
The soundness of this principle is explained by the following simple lemma
of propositional logic, where A stands for “statement S is true” and B stands
2.6 Some Proof Patterns for “statement T is true”.
In this section we discuss a few important proof patterns (which we could also Lemma 2.6. ¬B → ¬A |= A → B.
call proof methods or proof techniques). Such a proof pattern can be used to
prove one step within a longer proof, or sometimes also to directly prove a the- Proof. One can actually prove the stronger statement, namely that ¬B → ¬A ≡
orem of interest. Many proof patterns correspond to logical deduction rules. A → B, simply by examination of the truth table which is identical for both
One can define a logical calculus consisting of such deduction rules, but we will formulas ¬B → ¬A and A → B.
defer the discussion of this topic to Chapter 6. Often, a given statement can be √
proved in different ways, i.e., by using different proof patterns. Example 2.26. Prove the following claim: If x > 0 is irrational,
√ then also x is
irrational. The indirect proof proceeds by assuming that x is not irrational and
showing that then x is also not irrational. Here “not irrational” means rational,
2.6.1 Composition of Implications i.e., we prove
√
x is rational =⇒ x is rational
We first explain why the composition of implications, as occurring in many √ √
proofs, is sound. Assume hence that x is rational, i.e., that x = m/n for m, n ∈ N. This means
that x = m2 /n2 , i.e., x is the quotient of two natural numbers (namely m2 and
Definition 2.12. The proof step of composing implications is as follows: If S =⇒ T n2 ) and thus is rational. This completes the proof of the claim.
and T =⇒ U are both true, then one concludes that S =⇒ U is also true. 32 Recall Section 2.1.2.
2.6.4 Modus Ponens To prove the statement, let n = 5k + c, where 1 ≤ c ≤ 4. Using the usual
binomial formula (a + b)4 = a4 + 4a3 b + 6a2 b2 + 4ab3 + b4 we obtain:
Definition 2.15. A proof of a statement S by use of the so-called modus ponens
n4 = (5k + c)4 = 54 k 4 + 4 · 53 k 3 c + 6 · 52 k 2 c2 + 4 · 5kc3 + c4 .
proceeds in three steps:
1. Find a suitable mathematical statement R. Each summand is divisible by 5, except for the last term c4 . The statement S is
hence equivalent to the statement that c4 is one more than a multiple of 5, for
2. Prove R. 1 ≤ c ≤ 4.
3. Prove R =⇒ S. This statement S can be proved by case distinction, i.e., by considering all
The soundness of this principle is explained by the following lemma of four choices for c. For c = 1 we have c4 = 1, which is trivially one more than a
propositional logic. Again, the proof is by a simple comparison of truth tables. multiple of 5. For c = 2 we have c4 = 16, which is also one more than a multiple
of 5. The same is true for c = 3 where c4 = 81 and for c = 4 where c4 = 256.
Lemma 2.7. A ∧ (A → B) |= B. This concludes the proof.
With a few insights from number theory and algebra we will see later that
Examples will be discussed in the lecture and the exercises. the above statement holds when 5 is replaced by any odd prime number.
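The case distinction above (c = 1, 2, 3, 4) is small enough to be checked mechanically. The following Python sketch is only an illustration; the final loop spot-checks one natural reading of the remark that the statement also holds with 5 replaced by an odd prime, for a few small primes — this reading is an assumption made here, not a claim of the script.

# Minimal sketch (illustration only): the four cases of the proof, and a
# finite-range check of the statement n^4 ≡ 1 (mod 5) for n not divisible by 5.

# The case distinction: c^4 is one more than a multiple of 5 for 1 <= c <= 4.
assert all(c**4 % 5 == 1 for c in range(1, 5))

# Consequently n^4 ≡ 1 (mod 5) whenever 5 does not divide n (checked on a range).
assert all(n**4 % 5 == 1 for n in range(1, 200) if n % 5 != 0)

# Spot-check of the generalization (assumed here to mean n^(p-1) ≡ 1 (mod p)
# for p an odd prime and p not dividing n).
for p in (3, 5, 7, 11, 13):
    assert all(n**(p - 1) % p == 1 for n in range(1, 100) if n % p != 0)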
Definition 2.16. A proof of a statement S by case distinction proceeds in three Definition 2.17. A proof by contradiction of a statement S proceeds in three steps:
steps:
1. Find a suitable mathematical statement T .
1. Find a finite list R1 , . . . , Rk of mathematical statements (the cases).
2. Prove that T is false.
2. Prove that one of the Ri is true (one case occurs).
3. Assume that S is false and prove (from this assumption) that T is true (a
3. Prove Ri =⇒ S for i = 1, . . . , k.
contradiction).
More informally, one proves for a complete list of cases that the statement S
In many cases, the proof steps appear in a different order: One starts from
holds in all the cases.
assuming that S is false, derives statements from it until one observes that one
The soundness of this principle is explained by the following lemma of of these statements is false (i.e., is the statement T in the above description). In
propositional logic. this case, the fact that T is false (step 2) is obvious and requires no proof.
Lemma 2.8. For every k we have The soundness of this principle is explained by the following lemma of
propositional logic which can again be proved by comparing the truth tables
(A1 ∨ · · · ∨ Ak ) ∧ (A1 → B) ∧ · · · ∧ (Ak → B) |= B. of the involved formulas.
that a number a is rational if and only if a = m/n (i.e., m = an) for two relatively
prime³³ integers m and n (i.e., with gcd(m, n) = 1).³⁴
The proof by contradiction starts by assuming that S is false and deriving,
from this assumption, a false statement T. In the following derivation we may
use formulas as a compact way of writing statements, but the derivation itself
is “normal” mathematical reasoning and is not to be understood as a formula-based
logical reasoning.³⁵

S is false ⇐⇒ √2 is rational
           ⇐⇒ ∃m ∃n (m² = 2n² ∧ gcd(m, n) = 1)

We now consider, in isolation, the statement m² = 2n² appearing in the above
formula, derive from it another statement (namely gcd(m, n) ≥ 2), and then
plug this into the above formula. Each step below is easy to verify. For arbitrary
m and n we have

m² = 2n² =⇒ m² is even
          =⇒ m is even
          =⇒ 4 divides m²
          =⇒ 4 divides 2n² (also using m² = 2n²)
          =⇒ 2 divides n²
          =⇒ n is even
          =⇒ gcd(m, n) ≥ 2 (also using that m is even)

Hence we have

∃m ∃n (m² = 2n² ∧ gcd(m, n) = 1)
    =⇒ ∃m ∃n (m² = 2n² ∧ gcd(m, n) ≥ 2 ∧ gcd(m, n) = 1),

where the part gcd(m, n) ≥ 2 ∧ gcd(m, n) = 1 is false for arbitrary m and n, and hence
the right-hand side is the statement T, which is false.
This concludes the proof by contradiction.

2.6.7 Existence Proofs

Definition 2.18. Consider a set X of parameters and for each x ∈ X a statement,
denoted Sx. An existence proof is a proof of the statement that Sx is true for at
least one x ∈ X. An existence proof is constructive if it exhibits an a for which Sa
is true, and otherwise it is non-constructive.

Example 2.29. Prove that there exists a prime³⁶ number n such that n − 10 and
n + 10 are also primes, i.e., prove

∃n (prime(n) ∧ prime(n − 10) ∧ prime(n + 10)),

where the statement in parentheses is denoted Sn.
A constructive proof is obtained by giving the example n = 13 and verifying
that S13 is true.

Example 2.30. We prove that there are infinitely many primes by involving a
non-constructive existence proof.³⁷ This statement can be rephrased as follows:
For every number m there exists a prime p greater than m; as a formula:

∀m ∃p (prime(p) ∧ p > m),

where the statement prime(p) ∧ p > m is denoted Sp.
To prove this, consider an arbitrary but fixed number m and consider the statements
Sp parameterized by p: There exists a prime p greater than m, i.e., such
that prime(p) ∧ p > m is true.
To prove this, we use the known fact (which has been proved) that every
natural number n ≥ 2 has at least one prime divisor. We consider the specific
number m! + 1 (where m! = 2 · 3 · · · (m − 1) · m). We observe that for all k in
the range 2 ≤ k ≤ m, k does not divide m! + 1. In particular, no prime smaller
than m divides m! + 1. Because m! + 1 has at least one prime divisor, there exists
a prime p greater than m which divides m! + 1. Hence there exists a prime p
greater than m.³⁸

2.6.8 Existence Proofs via the Pigeonhole Principle

The following simple but powerful fact is known as the pigeonhole principle³⁹.
This principle is used as a proof technique for certain existence proofs.⁴⁰

Theorem 2.10. If a set of n objects is partitioned into k < n sets, then at least one of
these sets contains at least ⌈n/k⌉ objects.⁴¹

33 German: teilerfremd
34 gcd(m, n) denotes the greatest common divisor of m and n (see Section 4.2.3).
35 We can write ⇐⇒ if the implication holds in both directions, but it would be sufficient to always
replace ⇐⇒ by =⇒.
36 Recall that prime(n) is the predicate that is true if and only if n is a prime number.
37 See also Example 2.21, where different variable names are used.
38 Note that p is not known explicitly, it is only known to exist. In particular, p is generally not
equal to m! + 1.
39 German: Schubfachprinzip
40 This principle is often described as follows: If there are more pigeons than pigeon holes, then
there must be at least one pigeon hole with more than one pigeon in it. Hence the name of the
principle.
41 In the literature, the pigeon hole principle often states only that there must be a set containing
at least two elements.
Proof. The proof is by contradiction. Suppose that all sets in the partition have
at most ⌈n/k⌉ − 1 objects. Then the total number of objects is at most k(⌈n/k⌉ − 1),
which is smaller than n because

k (⌈n/k⌉ − 1)  <  k ((n/k + 1) − 1)  =  k · (n/k)  =  n.

2.6.9 Proofs by Counterexample

Proofs by counterexample are a specific type of constructive existence proof,
namely the proof that a counterexample exists.
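Theorem 2.10 above can be illustrated computationally: however n objects are split into k < n sets, some set receives at least ⌈n/k⌉ of them. The Python sketch below is only an illustration; the random assignment and the instance n = 100, k = 12 (people and birth months, as in the example that follows) are assumptions of this sketch.

# Minimal sketch (illustration only): checking the pigeonhole bound of
# Theorem 2.10 on randomly generated partitions of n objects into k sets.
import math
import random

def max_part_size(n: int, k: int) -> int:
    # Assign each of n objects to one of k sets uniformly at random; the
    # labels induce a partition of the objects.
    counts = [0] * k
    for _ in range(n):
        counts[random.randrange(k)] += 1
    return max(counts)

for _ in range(1000):
    n, k = 100, 12
    assert max_part_size(n, k) >= math.ceil(n / k)   # i.e., at least 9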
Definition 2.19. Consider a set X of parameters and for each x ∈ X a statement,
Example 2.31. Claim: Among 100 people, there are at least nine who were born denoted Sx . A proof by counterexample is a proof of the statement that Sx is not
in the same month. The claim can be equivalently stated as an existence claim: true for all x ∈ X , by exhibiting an a (called counterexample) such that Sa is false.
Considering any 100 people, there exists a month in which at least nine of them
Note that a proof by counterexample corresponds to an existence proof.
have their birthday.
Example 2.34. Prove or disprove that for every integer n, the number n2 −n+41
Proof. Set n = 100 and k = 12, and observe that ⌈100/12⌉ = 9.
is prime, i.e., prove
Example 2.32. Claim: In any subset A of {1, 2, . . . , 2n} of size |A| = n + 1, there ∀n prime(n2 − n + 41).
exist distinct a, b ∈ A such that a | b (a divides b).42 One can verify the quite surprising fact that prime(n2 − n + 41) is true for
For example, in the set {2, 3, 5, 7, 9, 10} we see that 3 | 9. n = 1, 2, 3, 4, 5, 6, . . ., for as long as the reader has the patience to continue to do
the calculation. But is it true for all n? To prove that the assertion ∀n prime(n2 −
Proof. We write every ai ∈ A as 2ei ui with ui odd. There are only n possible
n + 41) is false, i.e., to prove
values {1, 3, 5, . . . , 2n − 1} for ui . Thus there must exist two numbers ai and aj
with the same odd part (ui = uj ). Therefore one of them has fewer factors 2 ¬∀n prime(n2 − n + 41),
than the other and must hence divide it. it suffices to exhibit a counterexample, i.e., an a such that ¬prime(a2 − a + 41).
The smallest such a is a = 41; note that 41² − 41 + 41 = 41² is not a prime.
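The counterexample of Example 2.34 can also be found by brute force. The following Python sketch (an illustration, not part of the argument above) searches for the smallest n for which n² − n + 41 is not prime.

# Minimal sketch (illustration only): searching for the smallest counterexample
# to the claim that n^2 - n + 41 is prime for every integer n >= 1.

def is_prime(m: int) -> bool:
    if m < 2:
        return False
    return all(m % d != 0 for d in range(2, int(m**0.5) + 1))

n = 1
while is_prime(n * n - n + 41):
    n += 1
print(n, n * n - n + 41)   # 41 1681, and 1681 = 41^2 is not prime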
Example 2.33. Let a1 , a2 , . . . , an be a sequence of numbers (real or inte-
ger). A subsequence of length k of this sequence is a sequence of the form Example 2.35. Prove or disprove that every positive integer ≥ 10 can be written
ai1 , ai2 , . . . , aik , where 1 ≤ i1 < i2 < · · · < ik ≤ n. A sequence is called strictly as the sum of at most three squares (e.g. 10 = 32 + 12 , 11 = 32 + 12 + 12 ,
increasing (decreasing) if each term is strictly greater (smaller) than the preced- 12 = 22 + 22 + 22 , 13 = 32 + 22 , and 14 = 32 + 22 + 12 .). The statement can be
ing one. For example, the sequence 3, 8, 2, 11, 1, 5, 7, 4, 14, 9 contains the increas- written as
ing subsequences 3, 5, 7, 9 and 2, 5, 7, 14 and the decreasing subsequences 3, 2, 1 ∀n n ≥ 10 → ∃ x ∃y ∃z (n = x2 + y 2 + z 2 ) .
and 8, 5, 4. The statement is false because n = 15 is a counterexample.
Claim: Every sequence of m2 + 1 distinct numbers (real or integer) contains ei-
ther an increasing or a decreasing subsequence of length m + 1. (Note that in
the above example, m = 3 and m2 + 1 = 10, and there is indeed an increasing
2.6.10 Proofs by Induction
subsequence of length 4.) One of the most important proof technique in discrete mathematics are proofs
Proof. We can associate with every position (1 ≤ ℓ ≤ m2 + 1) a pair (iℓ , dℓ ), by induction, which are used to prove statements of the form ∀n P (n), where
where iℓ (dℓ ) is the length of the longest increasing (decreasing) subsequence the universe U is the set N = {0, 1, 2, 3, . . .} of natural numbers. Alternatively,
beginning at position ℓ. The proof is by contradiction. Suppose the claim is it can also be the set {1, 2, 3, . . .} of positive integers, in which case P (0) below
false, i.e., 1 ≤ iℓ ≤ m and 1 ≤ dℓ ≤ m for all ℓ. Then there are at most m2 pairs must be replaced by P (1). More generally, it can be the set {k, k + 1, k + 2, . . .}
(iℓ , dℓ ) that can occur. Thus the pigeonhole principle guarantees that there must for some k.
be two indices s < t with (is , ds ) = (it , dt ). But this leads to a contradiction. A proof by induction consists of two steps:
Because the numbers are distinct, either as < at or as > at . If as < at , then,
since is = it , an increasing subsequence of length it + 1 can be built starting Proof by induction:
at position s, taking as followed by the increasing subsequence beginning at at . 1. Basis step.43 Prove P (0).
This is a contradiction. A similar contradiction is obtained if as > at .
2. Induction step. Prove that for any arbitrary n we have P (n) =⇒ P (n+1).
42 Note that this is tight. If we lower |A| from n + 1 to n, then the set A = {n, n + 1, . . . , 2n − 1}
Theorem 2.11. For the universe N and an arbitrary unary predicate P we have
• {a, b} ∪ {b, c} = {a, b, c}, The problem with Cantor’s intuitive set theory is that, because it was not
clearly axiomatized, it makes the following apparently innocent (yet false) as-
as well as with simple definitions like the following: sumption. Whenever one specifies a precise condition (i.e., a logical predicate
P ), allowing to distinguish between objects that satisfy the predicate and ob-
Definition 3.1. (Informal.) The number of elements of a finite set A is called its
jects that don’t, then {x | P (x)}, the set of objects satisfying the predicate is
cardinality and is denoted |A|.
well-defined. Russell proposed the set
Also, facts like
R = {A | A ∉ A}
A ⊆ B ∧ B ⊆ C =⇒ A ⊆ C
or of sets that are not elements of themselves. Note that there seems to be nothing
A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C) wrong with a set being an element of itself. For example, the set of sets contain-
ing at least 10 elements seems to be an element of itself, as it contains more than
are well-known and seem obvious if one draws a figure with intersecting cir-
10 elements. Similarly, the set of sets containing at most 10 elements is not an
cles representing sets (so-called Venn-diagrams). However, many issues do not
element of itself.
seem to be clear mathematically, for example:
Either R ∈ R or R ∈ / R. If R ∈ R, then by the definition of R, this implies
• Which objects can one use as elements of a set? R∈ / R, a contradiction. Thus the only alternative is R ∈ / R. But again, by the
• Can a set itself be an element of a set? definition of R, this implies R ∈ R, again a contradiction. In other words, R ∈ R
• Can a set be an element of itself? if and only if R ∈ / R, a paradox that requires an explanation.
• How is set intersection or set union defined? The problem, subsequently addressed by Zermelo’s axiomatization, is the
• How should the elements of a set be counted? following: While for any set B and predicate P , {x ∈ B | P (x)} is actually a
• Do the above-stated facts require a proof, or are they just “obvious” in an well-defined set, {x | P (x)} is not. We must have a set to begin with before being
informal sense? able to create new sets by selecting the elements satisfying a certain condition. In
other words, the universe U of objects one has in mind when writing {x | P (x)}
This calls for a precise mathematical treatment with clear definitions, lemmas, is not itself a set.4
and proofs. The need for such a precise treatment also becomes obvious when
considering Russell’s paradox discussed below.
3.2 Sets and Operations on Sets
3.1.2 Russell’s Paradox
3.2.1 The Set Concept
The set concept introduced by Cantor and axiomatized further by Frege seemed
very innocent and safe to work with. But in 1903, Bertrand Russell3 (1872-1970) In set theory one postulates that there is a universe of possible sets and a uni-
showed that set theory as understood at that point in time is inherently con- verse of objects which can be elements of sets. Nothing prevents us from think-
tradictory. This came as a shock to the mathematics community. As a con- ing that the two universes are the same, i.e., the elements of sets are also sets.
sequence, set theory had to be based on much more rigorous grounds, on an We further postulate a binary predicate E to be given, and if E(x, y) = 1 we say
axiomatic foundation, a process started by Ernst Zermelo. It is still an active that x is an element of y. We can call E the elementhood predicate. Instead of the
area of research in mathematics which axioms can and should be used as the predicate E we use an infix notation and write x ∈ y rather than E(x, y) = 1.
foundation of set theory. The most widely considered set of axioms is called We also use the short-hand x 6∈ y for ¬(x ∈ y), i.e., if x is not an element of y.
Zermelo-Fraenkel (ZF) set theory. Axiomatic set theory is beyond the scope of Now we can postulate certain properties that the elementhood predicate E
this course. should satisfy, capturing the essence of set theory. This makes explicit that E is
3 Russell was a very remarkable person. He was not only an outstanding philosopher and math- not some arbitrary predicate, but that it really captures natural properties of sets.
ematician, but also politically active as a pacifist. Because of his protests against World War I he In a systematic mathematical approach, one carefully chooses a list of axioms
was dismissed from his position at Trinity College in Cambridge and imprisoned for 6 months. In
1961, at the age of 89, he was arrested again for his protests against nuclear armament. In 1950 he 4 In fact, in Zermelo-Fraenkel (ZF) set theory, the axioms exclude that a set can be an element of
and develops a theory (set theory) based on these axioms. There are indeed element, namely a, that is contained in the first set, but not in the second. Thus
several different (but related) axiom systems for set theory, and it is beyond the we have proved that a 6= b =⇒ {a} 6= {b}. According to Definition 2.14, this
scope of this course to discuss set theory in a formal sense.5 However, we will proves {a} = {b} =⇒ a = b.
informally introduce some of these properties/axioms in order to arrive at a
sufficiently precise treatment of sets. Note that, in contrast, {a, b} = {c, d} neither implies that a = c nor that b = d.
When writing formulas, it will often be convenient to not only use the usual In a set, say {a, b}, there is no order of the elements, i.e.,
logical variable symbols x, y, etc., but to use in the same formula symbols like A,
B, etc. This is convenient because it makes the formulas look closer to how set {a, b} = {b, a}.
theory is usually informally discussed. However, whether we use the symbol x
However, in mathematics one wants to also define the concept of an (ordered)
or A for a set is not mathematically relevant.
list of objects. Let us consider the special case of ordered pairs. For the operation
of forming an ordered pair of two objects a and b, denoted (a, b), we define
3.2.2 Set Equality and Constructing Sets From Sets def
(a, b) = (c, d) ⇐⇒ a = c ∧ b = d.
6
A set is completely specified by its elements, regardless of how it is described.
There is no other relevant information about a set than what its elements are. In Example 3.1. This example shows that one can model ordered pairs by using
other words, two sets A and B are equal (A = B) if (and only if) they contain the only (unordered) sets?8 This means that the sets corresponding to two ordered
same elements, independently of how A and B are described. In other words, pairs must be equal if and only if the ordered pairs are equal. A first approach is
def
there can not be two different sets A and B which contain exactly the same to define (a, b) = {a, {b}}. However, this definition of an ordered pair fails be-
elements. This is called the axiom of extensionality in set theory. Since we are cause one could not distinguish whether the set {{b}, {c}} denotes the ordered
not aiming for an axiomatic treatment of set theory, we state this simply as a pair ({b}, c) or the ordered pair ({c}, b). The reader can verify as an exercise that
definition. the following definition is correct:
def def
Definition 3.2. A = B ⇐⇒ ∀x (x ∈ A ↔ x ∈ B). (a, b) = {{a}, {a, b}}.
We postulate7 that if a is a set, then the set containing exactly (and only) a
exists, and is usually denoted as {a}. Similarly, for any finite list of sets, say a, b, 3.2.3 Subsets
and c, the set containing exactly these elements exists and is usually denoted as
{a, b, c}. Definition 3.3. The set A is a subset of the set B, denoted A ⊆ B, if every
Since a set is specified by its elements, we can conclude that if two sets, each element of A is also an element of B, i.e.,
containing a single element, are equal, then the elements are equal. This can be def
stated as a lemma (in set theory), and it actually requires a proof. A ⊆ B ⇐⇒ ∀x (x ∈ A → x ∈ B).
Lemma 3.1. For any (sets) a and b, {a} = {b} =⇒ a = b. The following lemma states an alternative way for capturing the equality of
sets, via the subset relation. In fact, this is often the best way to prove that two
sets are equal.
Proof. Consider any fixed a and b. The statement is an implication, which we
Lemma 3.2. A = B ⇐⇒ (A ⊆ B) ∧ (B ⊆ A).
prove indirectly. Assume that a 6= b. Then {a} =
6 {b} because there exists an
5 Indeed, mathematicians are still working on fundamental questions regarding the theory of sets
(but not questions relevant to us). Proof. The proof first makes use (twice) of Definition 3.3, then uses the fact from
6 For example, the set containing exactly the three natural numbers 1, 2, and 3 has many different
predicate logic that ∀F ∧ ∀G ≡ ∀(F ∧ G), then uses the fact from propositional
descriptions, including {1, 2, 3}, {3, 1, 2}, {1, 1 + 1, 1 + 1 + 1}, etc. All these descriptions refer to
the same set. 8 We briefly address this question, although we will not make use of this later and will continue
7 In axiomatic set theory this is guaranteed by appropriate axioms. to think about ordered pairs and lists in a conventional sense and with conventional notation.
logic that (C → D)∧(D → C) ≡ C ↔ D,9 and then makes use of Definitions 3.2. Example 3.2. Consider the set of sets
For any sets A and B we have the following equivalences of statements about A
and B: A = {a, b, c, d}, {a, c, e}, {a, b, c, f }, {a, c, d} .
(A ⊆ B) ∧ (B ⊆ A) ⇐⇒ ∀x (x ∈ A → x ∈ B) ∧ ∀x (x ∈ B → x ∈ A)
                   ⇐⇒ ∀x ((x ∈ A → x ∈ B) ∧ (x ∈ B → x ∈ A))
                   ⇐⇒ ∀x (x ∈ A ↔ x ∈ B)
                   ⇐⇒ A = B

Then we have ⋃A = {a, b, c, d, e, f} and ⋂A = {a, c}.
Typically, the sets (elements) in a set A of sets are indexed by some index
set I: A = {Ai | i ∈ I}. In this case, one also writes {Ai}i∈I, and for the intersection
and union one writes ⋂i∈I Ai and ⋃i∈I Ai, respectively.
The next lemma states that the subset relation is transitive (a term discussed Definition 3.5. The difference of sets B and A, denoted B\A is the set of elements
later). The proof is left as an exercise. of B without those that are elements of A:
def
Lemma 3.3. For any sets A, B, and C, B \ A = {x ∈ B| x ∈
/ A}.
Lemma 3.5. There is only one empty set (which is often denoted as ∅ or {}).10 ∅, i.e., {∅} 6= ∅. Note that |{∅}| = 1 while |∅| = 0. One can thus define a whole
sequence of such sets:
Proof. Let ∅ and ∅′ both be arbitrary empty sets. Since both are empty, every
∅, {∅}, {{∅}}, {{{∅}}}, . . . .
element that is in ∅ (namely none) is also in ∅′ , and vice versa. This means
according to Definition 3.2 that ∅ = ∅′ , which means that there is only one Note that, except for the empty set, all these sets have cardinality 1.
empty set.
Example 3.3. A few other sets constructed from the empty set are:
A = {∅, {∅}},
Lemma 3.6. The empty set is a subset of every set, i.e., ∀A (∅ ⊆ A) B = {{∅, {∅}}}, and
C = {∅, {∅}, {∅, {∅}}}.
Their cardinalities are |A| = 2, |B| = 1, and |C| = 3. Also, A ⊆ C and B ⊆ C.
Proof. The proof is by contradiction: Assume that there is a set A for which Example 3.4. We have considered three relations between sets: ∈, =, and ⊆.
∅ 6⊆ A. This means that there exists an x for which x ∈ ∅ but x ∈ / A. But such Which of these relations hold for the following sets?
an x cannot exist because ∅ contains no element, which is a contradiction. A = {{∅}},
The above is a valid proof. Just to illustrate (as an example) that the same proof B = {{∅}, {∅, ∅}},
could be made more formal and more precise we can write the proof as follows, C = {∅, {∅}}, and
making use of logical transformation rules for formulas with quantifiers. Let A D = {∅, {∅, {∅}}}.
be an arbitrary (but fixed) set. The proof is by contradiction (see Definition 2.17), The answer is: B = A ⊆ C ∈ D.
where the statement S to be proved is ∅ ⊆ A and as the statement T we choose
¬∀x (x ∈/ ∅), which is false because it is the negation of the definition of ∅. The
proof that the negation of S implies T (step 3 in Definition 2.17) is as follows: 3.2.7 A Construction of the Natural Numbers
¬(∅ ⊆ A) ⇐⇒
.
¬∀x (x ∈ ∅ → x ∈ A) (def. of ∅ ⊆ A) We briefly discuss a way to define the natural numbers from basic set theory.
. We use bold-face font to denote objects that we define here as special sets, and
⇐⇒ ∃x ¬(x ∈ ∅ → x ∈ A) (¬∀x F ≡ ∃x ¬F ) then can observe that they can be seen as corresponding to the natural numbers
.
⇐⇒ ∃x ¬(¬(x ∈ ∅) ∨ x ∈ A) (def. of →) with the same symbol (but written in non-bold font). We define the sets
.
⇐⇒ ∃x (x ∈ ∅ ∧ ¬(x ∈ A)) (de Morgan’s rule)
. def def def def
=⇒ ∃x (x ∈ ∅) ∧ ∃x ¬(x ∈ A) (∃x (F ∧ G) |= (∃xF ) ∧ (∃xG)) 0 = ∅, 1 = {∅}, 2 = {∅, {∅}}, 3 = {∅, {∅}, {∅, {∅}}}, . . .
.
=⇒ ∃x (x ∈ ∅) (F ∧ G implies F )
. The successor of a set n, which we can denote by s(n), is defined as
⇐⇒ ¬∀x ¬(x ∈ ∅). (¬∀x F ≡ ∃x ¬F )
.
⇐⇒ ∅ is not the empty set (Definition 3.6) def
s(n) = n ∪ {n}.
which is false, and hence we have arrived at a contradiction. For example, we have 1 = s(0) and 2 = s(1). We note that |0| = 0, |1| = 1,
|2| = 2, |3| = 3, . . ., and, more generally, that |s(n)| = |n| + 1.
An operation + on these sets 0, 1, 2, 3, . . ., which corresponds to addition of
3.2.6 Constructing Sets from the Empty Set numbers, can be defined recursively as follows:
At this point, the only set we know to exist, because we have postulated it, is def def
the empty set. We can hence construct new sets ∅. The set {∅} is a set with a m+0 = m and m+s(n) = s(m+n).
single element (namely ∅). It is important to note that {∅} is not the empty set
One can also define a multiplication operation and prove that these operations
10 Wetake it for granted that ∅ is actually a set. But in an axiomatic treatment of set theory, this satisfy the usual laws of the natural numbers (commutative, associative, and
must be stated as an axiom. distributive laws).
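The construction of Section 3.2.7 can be mirrored directly with Python's frozenset. The sketch below is only an illustration of the definitions (0 = ∅, s(n) = n ∪ {n}, and the recursive addition); the helper for extracting the predecessor is an assumption of this sketch, relying on the fact that in this representation the elements of s(n) have pairwise distinct cardinalities.

# Minimal sketch (illustration only): the sets 0, 1, 2, 3, ... of Section 3.2.7
# modelled with frozensets, together with the successor s(n) = n ∪ {n}.

ZERO = frozenset()

def successor(n: frozenset) -> frozenset:
    return n | {n}          # n ∪ {n}

nats = [ZERO]
for _ in range(5):
    nats.append(successor(nats[-1]))

# |s(n)| = |n| + 1, so the cardinality of n is the number it represents.
for i, n in enumerate(nats):
    assert len(n) == i

# Recursive addition: m + 0 = m and m + s(n) = s(m + n).
def add(m: frozenset, n: frozenset) -> frozenset:
    if n == ZERO:
        return m
    pred = max(n, key=len)   # n = s(pred); pred is the largest element of n
    return successor(add(m, pred))

assert add(nats[2], nats[3]) == nats[5]   # 2 + 3 = 5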
3.2.8 The Power Set of a Set We point out that the Cartesian product is not associative, and in particular
Definition 3.7. The power set of a set A, denoted P(A), is the set of all subsets
×3i=1 Ai 6= (A1 × A2 ) × A3 .
of A:11
def
P(A) = {S| S ⊆ A}. 3.3 Relations
For a finite set with cardinality k, the power set has cardinality 2k (hence the
name ‘power set’ and the alternative notation 2A ). Relations are a fundamental concept in discrete mathematics and Computer Sci-
ence. Many special types of relations (e.g., equivalence relations, order relations,
Example 3.5. P({a, b, c}) = {∅, {a}, {b}, {c}, {a, b}, {a, c}, {b, c}, {a, b, c}} and and lattices) capture concepts naturally arising in applications. Functions are
|P({a, b, c})| = 8. also a special type of relation.
Example 3.6. We have
P(∅) = {∅}, 3.3.1 The Relation Concept
P({∅}) = {∅, {∅}},
P({∅, {∅}}) = {∅, {∅}, {{∅}}, {∅, {∅}}},
{1, 7, 9} ∈ P(N). Definition 3.9. A (binary) relation ρ from a set A to a set B (also called an (A, B)-
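For a small finite set the power set can simply be enumerated. The Python sketch below is only an illustration; it lists P({a, b, c}) as in Example 3.5 and confirms that the cardinality is 2 to the power |A|.

# Minimal sketch (illustration only): enumerating the power set of a finite set
# and checking that its cardinality is 2^|A|.
from itertools import combinations

def power_set(A: set) -> list:
    elems = list(A)
    return [frozenset(c) for r in range(len(elems) + 1)
                         for c in combinations(elems, r)]

A = {"a", "b", "c"}
P = power_set(A)
assert len(P) == 2 ** len(A)                  # 8 subsets, as in Example 3.5
assert frozenset() in P and frozenset(A) in P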
relation) is a subset of A × B. If A = B, then ρ is called a relation on A.
3.2.9 The Cartesian Product of Sets Instead of (a, b) ∈ ρ one usually writes
Recall that two ordered pairs are equal if and only if both components agree, a ρ b,
i.e., and sometimes we write a6 ρ b if (a, b) ∈
/ ρ.
def
(a, b) = (c, d) ⇐⇒ a = c ∧ b = d.
Example 3.8. Let S be the set of students at ETH and let C be the set of courses
More generally, we denote an (ordered) list of k objects a1 , . . . , ak as (a1 , . . . , ak ). taught at ETH. Then a natural relation from S to C is the “takes” relation. If
Two lists of the same length are equal if they agree in every component. s ∈ S is a student and c ∈ C is a course, then (s, c) is in the relation if and only if
s takes course c. If we denote the relation by takes, we can write (s, c) ∈ takes
Definition 3.8. The Cartesian product A × B of two sets A and B is the set of or s takes y.12 We can also consider the set P of professors at ETH and the
all ordered pairs with the first component from A and the second component natural relation from P to C.
from B: Example 3.9. Let H be the set of all human beings (alive or dead). Then
A × B = (a, b) a ∈ A ∧ b ∈ B .
“child of” is a relation on H. If we denote the relation by childof, then
For finite sets, the cardinality of the Cartesian product of some sets is the (x, y) ∈ childof (or equivalently x childof y) means that x is y’s child. Other
product of their cardinalities: |A × B| = |A| · |B|. relations on H are “parent of”, “grandparent of”, “cousin of”, “ancestor of”,
“married to”, etc.
Example 3.7. Prove or disprove the following statements:
Example 3.10. On the integers Z we already know a number of very natural
(i) ∅ × A = ∅ . relations: =, 6=, ≤, ≥, <, >, | (the ‘divides’ relation), and 6 | (does not divide).
(ii) A × B = B × A . Example 3.11. The relation ≡m on Z is defined as follows:
More generally, the Cartesian product of k sets A1 , . . . , Ak is the set of all lists def
a ≡m b ⇐⇒ a − b = km for some k,
of length k (also called k-tuples) with the i-th component from Ai :
i.e., a ≡m b if and only if a and b have the same remainder when divided by m.
×ki=1 Ai = {(a1 , . . . , ak )| ai ∈ Ai for 1 ≤ i ≤ k} (See Section 4.2.)
11 In axiomatic set theory, the existence of the power set of every set must be postulated as an 12 Note that the relation takes can change over time, and in such an example we consider the
Example 3.12. The relation {(x, y)| x2 + y 2 = 1} on R is the set of points on the An alternative representation of a relation ρ from A to B is by a directed
unit circle, which is a subset of R × R. graph with |A| + |B| vertices14 labeled by the elements of A and B. The graph
contains the edge15 from a to b if and only if a ρ b. For a relation on a set A, the
Example 3.13. For any set S, the subset relation (⊆) is a relation on P(S). graph contains only |A| vertices, but it can contain loops (edges from a vertex to
itself).
Example 3.14. Two special relations from A to B are the empty relation (i.e., the
empty set ∅) and the complete relation A × B consisting of all pairs (a, b).
3.3.3 Set Operations on Relations
Definition 3.10. For any set A, the identity relation on A, denoted idA (or simply
id), is the relation idA = {(a, a)| a ∈ A}. Relations from A to B are sets, and therefore we can apply any operation defined
2
on sets: union, intersection, and complement. In the matrix representation of
Relations on a finite set are of special interest. There are 2n different rela- relations, these operations correspond to the position-wise logical OR, AND,
tions on a set of cardinality n. (Why?) and negation, respectively. A relation can also be a subset of another relation.
The relation concept can be generalized from binary to k-ary relations for Example 3.17. On the set Z, the relation ≤ ∪ ≥ is the complete relation,
given sets A1 , . . . , Ak . A k-ary relation is a subset of A1 × · · ·× Ak . Such relations ≤ ∩ ≥ is the identity relation, and the complement of ≤ is the relation >.
play an important role in modeling relational databases. Here we only consider Moreover, we have < ⊆ ≤ and = ⊆ ≥.
binary relations.
Example 3.18. For any relatively prime integers m and n, the relation ≡m ∩ ≡n
is ≡mn , as will be shown in Chapter 4. More generally, For general m and n,
3.3.2 Representations of Relations the relation ≡m ∩ ≡n is ≡lcm(m,n) , where lcm(m, n) denotes the least common
multiple of m and n.
For finite sets A and B, a (binary) relation ρ from A to B can be represented as a
Boolean |A|× |B| matrix M ρ with the rows and columns labeled by the elements
ρ 3.3.4 The Inverse of a Relation
of A and B, respectively. For a ∈ A and b ∈ B, the matrix entry Ma,b is 1 if a ρ b,
and 0 otherwise.
Definition 3.11. The inverse of a relation ρ from A to B is the relation ρb from B
Example 3.15. Let A = {a, b, c, d} and B = {q, r, s, t, u}, and consider the re- to A defined by
lation ρ = {(a, r), (a, s), (a, u), (b, q), (b, s), (c, r), (c, t), (c, u), (d, s), (d, u)}. The
matrix representation is

            q  r  s  t  u
        a   0  1  1  0  1
M^ρ  =  b   1  0  1  0  0
        c   0  1  0  1  1
        d   0  0  1  0  1

where the rows and columns are labeled by the elements of A and B, respectively.

ρ̂ = {(b, a) | (a, b) ∈ ρ}.

Note that for all a and b we have b ρ̂ a ⇐⇒ a ρ b. An alternative notation
for the inverse of ρ is ρ⁻¹.

Example 3.19. Let H be the set of people, O the set of objects, and π the ownership
relation from H to O. The inverse π̂ is the “owned by” relation determining
who owns an object.

Example 3.20. If φ is the parenthood relation on the set H of humans (i.e., a φ b
if a is a parent of b), then the inverse relation φ̂ is the childhood relation.
tively.
Example 3.21. On the set Z, the inverse of the relation ≤ is the relation ≥. The
For relations on a set A, the matrix is an |A| × |A| square matrix. inverse of id is again id.
In the matrix representation, taking the inverse of a relation corresponds to
Example 3.16. For the set A = {1, 2, 3, 4, 5}, the relations =, ≥, and ≤ correspond
the transposition of the matrix. In the graph representation, taking the inverse
to the identity matrix,13 the lower triangular matrix, and the upper triangular
corresponds to inverting the direction of all edges.
matrix, respectively.
14 German: Knoten
13 The identity relation (=) on any finite set corresponds to the identity matrix. 15 German: Kante
3.3.5 Composition of Relations Example 3.23. If φ is the parenthood relation on the set H of humans, then the
relation φ2 is the grand-parenthood relation.16
Definition 3.12. Let ρ be a relation from A to B and let σ be a relation from B In the matrix representation, composing two relations corresponds to a spe-
to C. Then the composition of ρ and σ, denoted ρ ◦ σ (or also ρσ), is the relation cial type of matrix multiplication. If the matrices are considered as integer ma-
from A to C defined by trices (with 0 and 1 entries), then after the multiplication all entries > 1 are set
to 1.17 In the graph representation the composition corresponds to the natural
def
ρ◦σ = (a, c) ∃b (a, b) ∈ ρ ∧ (b, c) ∈ σ . composition of the graphs, where a ρσ c if and only if there is a path from a
(over some b) to c.
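Composition of relations (Definition 3.12) is easy to express as a set comprehension. The Python sketch below is only an illustration: it uses small randomly chosen relations (an assumption of the sketch) and checks, on these instances, the associativity and the inverse-of-composition statements of the two lemmas nearby.

# Minimal sketch (illustration only): composition and inverse of relations on
# small random examples.
import random

def compose(rho, sigma):
    # ρ ∘ σ = {(a, c) | ∃b: (a, b) ∈ ρ and (b, c) ∈ σ}
    return {(a, c) for (a, b) in rho for (b2, c) in sigma if b == b2}

def inverse(rho):
    return {(b, a) for (a, b) in rho}

U = range(6)
def random_relation():
    return {(a, b) for a in U for b in U if random.random() < 0.3}

for _ in range(100):
    rho, sigma, phi = random_relation(), random_relation(), random_relation()
    # associativity of composition
    assert compose(rho, compose(sigma, phi)) == compose(compose(rho, sigma), phi)
    # the inverse of ρσ is the composition of the inverses in reversed order
    assert inverse(compose(rho, sigma)) == compose(inverse(sigma), inverse(rho))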
The n-fold composition of a relation ρ on a set A with itself is denoted ρn . The proof of the following lemma is left as an exercise.
Lemma 3.7. The composition of relations is associative, i.e., we have ρ ◦ (σ ◦ φ) = Lemma 3.8. Let ρ be a relation from A to B and let σ be a relation from B to C. Then
(ρ ◦ σ) ◦ φ. the inverse ρc bρb.
σ of ρσ is the relation σ
Proof. We use the short notation ρσ instead of ρ ◦ σ. The claim of the lemma, 3.3.6 Special Properties of Relations
ρ(σφ) = (ρσ)φ, states an equality of sets, which can be proved by proving that
each set is contained in the other (see Section 3.2.3). We prove ρ(σφ) ⊆ (ρσ)φ;
Definition 3.13. A relation ρ on a set A is called reflexive if
the other inclusion is proved analogously. Suppose that (a, d) ∈ ρ(σφ). We
need to prove that (a, d) ∈ (ρσ)φ. For illustrative purposes, We provide two aρa
formulations of this proof, first as a text and then in logical formulas.
is true for all a ∈ A, i.e., if
Because (a, d) ∈ ρ(σφ), there exists b such that (a, b) ∈ ρ and (b, d) ∈ σφ. id ⊆ ρ.
The latter implies that there exists c such that (b, c) ∈ σ and (c, d) ∈ φ. Now,
(a, b) ∈ ρ and (b, c) ∈ σ imply that (a, c) ∈ ρσ, which together with (c, d) ∈ φ In other words, a relation is reflexive if it contains the identity relation id. In
implies (a, d) ∈ (ρσ)φ. the matrix representation of relations, reflexive means that the diagonal is all 1.
Now comes the more formal version of the same proof, where the justifica- In a graph representation, reflexive means that every vertex has a loop (an edge
tion for each step is omitted but should be obvious: from the vertex to itself).
. Example 3.24. The relations ≤, ≥, and | (divides) on Z \ {0} are reflexive, but
(a, d) ∈ ρ(σφ) =⇒ ∃b (a, b) ∈ ρ ∧ (b, d) ∈ σφ the relations < and > are not.
.
=⇒ ∃b (a, b) ∈ ρ ∧ ∃c (b, c) ∈ σ ∧ (c, d) ∈ φ
. Definition 3.14. A relation ρ on a set A is called irreflexive if a6 ρ a for all a ∈ A,
=⇒ ∃b∃c (a, b) ∈ ρ ∧ (b, c) ∈ σ ∧ (c, d) ∈ φ
. i.e., if ρ ∩ id = ∅.18
=⇒ ∃b∃c (a, b) ∈ ρ ∧ (b, c) ∈ σ ∧ (c, d) ∈ φ
.
=⇒ ∃c∃b (a, b) ∈ ρ ∧ (b, c) ∈ σ ∧ (c, d) ∈ φ
. Definition 3.15. A relation ρ on a set A is called symmetric if
=⇒ ∃c ∃b (a, b) ∈ ρ ∧ (b, c) ∈ σ ∧ (c, d) ∈ φ
. a ρ b ⇐⇒ b ρ a
=⇒ ∃c (a, c) ∈ ρσ ∧ (c, d) ∈ φ
.
=⇒ (a, d) ∈ (ρσ)φ. is true for all a, b ∈ A, i.e., if
ρ = ρb.
16 Note that the notation φ2 is actually ambiguous; it could also denote the Cartesian product
In the matrix representation of relations, symmetric means that the matrix is 3.3.7 Transitive Closure
symmetric (with respect to the diagonal).
A symmetric relation ρ on a set A can be represented as an undirected graph, The reader can verify as an exercise that for a transitive relation ρ we have ρn ⊆ ρ
for all n > 1.
possibly with loops from a node to itself.
Example 3.25. The relation ≡m on Z is symmetric. Definition 3.18. The transitive closure of a relation ρ on a set A, denoted ρ∗ , is
ρ∗ = ⋃_{n ∈ N\{0}} ρⁿ.

Example 3.26. The “married to” relation on the set H of humans is symmetric.
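On a finite set the union in Definition 3.18 stabilizes after finitely many steps, so ρ∗ can be computed by iterating composition until a fixed point is reached. The Python sketch below is only an illustration; the example relation is an assumption of the sketch.

# Minimal sketch (illustration only): the transitive closure ρ* of a relation
# on a finite set, computed by a fixed-point iteration.

def compose(rho, sigma):
    return {(a, c) for (a, b) in rho for (b2, c) in sigma if b == b2}

def transitive_closure(rho):
    closure = set(rho)
    while True:
        new = closure | compose(closure, rho)
        if new == closure:
            return closure
        closure = new

# Example: the "successor" relation on {0, ..., 5}; its closure is the relation <.
rho = {(i, i + 1) for i in range(5)}
star = transitive_closure(rho)
assert star == {(i, j) for i in range(6) for j in range(6) if i < j}
# ρ* is transitive: composing it with itself yields no new pairs (cf. Lemma 3.9).
assert compose(star, star) <= star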
Definition 3.16. A relation ρ on a set A is called antisymmetric if In the graph representation of a relation ρ on A, we have a ρk b if and only
if there is a walk of length k from a to b in the graph, where a walk may visit a
a ρ b ∧ b ρ a =⇒ a = b
node multiple times. The transitive closure is the reachability relation, i.e., a ρ∗ b
is true for all a, b ∈ A, i.e., if if and only if there is a path (of arbitrary finite length) from a to b.
ρ ∩ ρb ⊆ id. Example 3.30. Consider the set P of all convex polygons. We can think of them
In a graph representation, antisymmetric means that there is no cycle of as being given as pieces of paper. By cutting a piece into two pieces with a
length 2, i.e., no distinct vertices a and b with edges both from a to b and from b straight cut one can obtain new polygons. Let be the relation defined as fol-
to a. Note that antisymmetric is not the negation of symmetric. lows: a b if and only if with a single straight-line cut (or no cut) one can ob-
tain b from a. Moreover, consider the covering relation ⊒, where a ⊒ b if and
Example 3.27. The relations ≤ and ≥ are antisymmetric, and so is the division only if a can completely cover b (if appropriately positioned). It is easy to see
relation | on N: If a | b and b | a, then a = b. But note that the division relation that ⊒ is reflexive, anti-symmetric, and transitive20 whereas is only reflexive
| on Z is not antisymmetric. Why? and antisymmetric. Note that ⊒ is the transitive closure of .
Example 3.28. The relations ≤, ≥, <, >, |, and ≡m on Z are transitive. Definition 3.19. An equivalence relation is a relation on a set A that is reflexive,
symmetric, and transitive.
Example 3.29. Let ρ be the ancestor relation on the set H of humans, i.e., a ρ b if
a is an ancestor of b. This relation is transitive. Example 3.31. The relation ≡m is an equivalence relation on Z.
Lemma 3.9. A relation ρ is transitive if and only if ρ2 ⊆ ρ. Definition 3.20. For an equivalence relation θ on a set A and for a ∈ A, the set
of elements of A that are equivalent to a is called the equivalence class of a and is
denoted as [a]θ :21
Proof. The “if” part of the theorem (⇐=) follows from the definition of compo-
def
sition: If a ρ b and b ρ c, then a ρ2 c. Therefore also a ρ c since ρ2 ⊆ ρ.19 This [a]θ = {b ∈ A| b θ a}.
means transitivity.
Two trivial equivalence relations on a set A are the complete relation A × A,
Proof of the “only if” part (=⇒): Assume that ρ is transitive. To show that ρ2 ⊆ ρ,
for which there is only one equivalence class A, and the identity relation for
assume that a ρ2 b for some a and b. We must prove that a ρ b. The definition
which the equivalence classes are all singletons22 {a} for a ∈ A.
of a ρ2 b states that there exists c such that a ρ c and c ρ b. Transitivity of ρ
20 Such a relation will be defined below as a partial order relation.
thus implies that a ρ b, which concludes the proof.
21 When the relation θ is understood, we can drop the subscript θ.
19 In set-theoretic notation: (a, c) ∈ ρ2 ∧ ρ2 ⊆ ρ =⇒ (a, c) ∈ ρ. 22 A singleton is a set with one element.
Example 3.32. The equivalence classes of the relation ≡3 are the sets and
[0] = {. . . , −6, −3, 0, 3, 6, . . .}, a 6 θ b =⇒ [a] ∩ [b] = ∅.
[1] = {. . . , −5, −2, 1, 4, 7, . . .}, To prove the first statement we consider an arbitrary c ∈ [a] and observe that
[2] = {. . . , −4, −1, 2, 5, 8, . . .}. .
c ∈ [a] ⇐⇒ c θ a (def. of [a])
2
Example 3.33. Consider the set R of points (x, y) in the plane, and consider the .
=⇒ c θ b (use a θ b and transitivity)
relation ρ defined by (x, y) ρ (x′ , y ′ ) ⇐⇒ x + y = x′ + y ′ . Clearly, this relation is .
⇐⇒ c ∈ [b] (def. of [b].)
reflexive, symmetric, and transitive. The equivalence classes are the set of lines
in the plane parallel to the diagonal of the second quadrant. Note that c ∈ [a] =⇒ c ∈ [b] (for all c ∈ A) is the definition of [a] ⊆ [b]. The state-
The proof of the following theorem is left as an exercise. ment [b] ⊆ [a] is proved analogously but additionally requires the application of
symmetry. (This is an exercise.) Together this implies [a] = [b].
Lemma 3.10. The intersection of two equivalence relations (on the same set) is an The second statement is proved by contradiction. Suppose it is false23 , i.e.,
equivalence relation. a 6 θ b and [a] ∩ [b] 6= ∅, i.e., there exists some c ∈ [a] ∩ [b], which means that c θ a
and c θ b. By symmetry we have a θ c and thus, by transitivity, we have a θ b,
Example 3.34. The intersection of ≡5 and ≡3 is ≡15 . a contradiction. (As an exercise, the reader can write this proof as a sequence of
implications.)
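For a finite range of Z, the equivalence classes of ≡m can be listed directly and are seen to form a partition, as made precise in the next subsection. The Python sketch below is only an illustration (the chosen range is an assumption); it also checks Example 3.34 on that range: there, a pair is related by both ≡5 and ≡3 exactly if it is related by ≡15.

# Minimal sketch (illustration only): equivalence classes of ≡m on a finite range,
# and a finite check of Example 3.34.

def classes_mod(m: int, universe) -> list:
    # the class of a consists of all b with b ≡m a; grouping by a mod m
    groups = {}
    for a in universe:
        groups.setdefault(a % m, set()).add(a)
    return list(groups.values())

U = range(-30, 31)
parts = classes_mod(3, U)
# The classes are pairwise disjoint and cover U, i.e., they form a partition.
assert sum(len(p) for p in parts) == len(set(U))
assert set().union(*parts) == set(U)

# ≡5 ∩ ≡3 agrees with ≡15 on U (as relations, i.e., on pairs from U).
equiv = lambda m, a, b: (a - b) % m == 0
assert all((equiv(5, a, b) and equiv(3, a, b)) == equiv(15, a, b)
           for a in U for b in U)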
3.4.2 Equivalence Classes Form a Partition
3.4.3 Example: Definition of the Rational Numbers
Definition 3.21. A partition of a set A is a set of mutually disjoint subsets of A
that cover A, i.e., a set {Si |i ∈ I} of sets Si (for some index set I) satisfying We consider the set A = Z × (Z\{0}) and define the equivalence relation ∼ on
[ A as follows:
Si ∩ Sj = ∅ for i 6= j and Si = A. def
i∈I
(a, b) ∼ (c, d) ⇐⇒ ad = bc.
This relation is reflexive ((a, b) ∼ (a, b) since ab = ba), symmetric (since ad =
Consider any partition of a set A and define the relation ≡ such that two
bc =⇒ cb = da), and transitive. For the latter, assume (a, b) ∼ (c, d) and
elements are ≡-related if and only if they are in the same set of the partition. It
(c, d) ∼ (e, f ). Then ad = bc and cf = de, and thus adcf = bcde. Canceling d
is easy to see that this relation is an equivalence relation. The following theorem
(which is 6= 0) gives
states that the converse also holds. In other words, partitions and equivalence
relations capture the same (simple) abstraction. acf = bce.
Definition 3.22. The set of equivalence classes of an equivalence relation θ, de- We have to consider the cases c 6= 0 and c = 0 separately. If c 6= 0, then c can be
noted by canceled, giving af = be. If c = 0, then a = 0 since d 6= 0 but ad = bc. Similarly,
def e = 0, and thus we also have af = be. Therefore ∼ is transitive and hence an
A/θ = [a]θ a ∈ A , equivalence relation.
is called the quotient set of A by θ, or simply A modulo θ, or A mod θ. To every equivalence class [(a, b)] we can associate the rational number a/b
(b 6= 0). It is easy to see that all pairs (u, v) ∈ [(a, b)] correspond to the same ra-
Theorem 3.11. The set A/θ of equivalence classes of an equivalence relation θ on A is tional number, and two distinct rational numbers correspond to distinct equiv-
a partition of A. alence classes. Thus24
def
Q = Z × (Z\{0}) ∼.
Proof. Since a ∈ [a] for all a ∈ A (reflexivity of θ), the union of all equivalence
classes is equal to A. It remains to prove that equivalence classes are disjoint. 23 Recall that ¬(A → B) ≡ A ∧ ¬B.
24 This
This is proved by proving, for any fixed a and b, that is a more fancy way of saying that two rational numbers a/b and c/d are the same number
if and only if the ratio is the same. But actually, this is the definition of the rational numbers. If the
a θ b =⇒ [a] = [b] reader is surprised, he or she is challenged to come up with a simpler definition.
3.5 Partial Order Relations

3.5.1 Definition

Taking the definition of an equivalence relation and simply replacing the symmetry condition by the anti-symmetry condition results in a completely different, but even more interesting type of relation.

Definition 3.23. A partial order (or simply an order relation25) on a set A is a relation that is reflexive, antisymmetric, and transitive. A set A together with a partial order ⪯ on A is called a partially ordered set (or simply poset) and is denoted as (A; ⪯).26

In a graph representation of relations, a partial order has no cycles (but this is of course not a complete characterization).

Example 3.35. The relations ≤ and ≥ are partial orders on Z, Q, or R. The relations < and > are not partial orders because they are not reflexive (though they are both transitive and, in fact, also antisymmetric because a < b ∧ b < a is never true, i.e., < ∩ <̂ = ∅).

Example 3.36. The division relation (|) is a partial order on N \ {0} or any subset of N \ {0}.

Example 3.37. The subset relation on the power set of a set A is a partial order. In other words, for any set A, (P(A); ⊆) is a poset.

Example 3.38. The covering relation on convex polygons (see Example 3.30) is a partial order.

For a partial order relation ⪯ we can define the relation a ≺ b similar to how the relation < is obtained from ≤:

a ≺ b  def⇐⇒  a ⪯ b ∧ a ≠ b.

Definition 3.24. For a poset (A; ⪯), two elements a and b are called comparable27 if a ⪯ b or b ⪯ a; otherwise they are called incomparable.

Definition 3.25. If any two elements of a poset (A; ⪯) are comparable, then A is called totally ordered (or linearly ordered) by ⪯.

25 German: Ordnungsrelation
26 Partial orders are often denoted by ≤ or by a similar symbol like ⪯ or ⊑.
27 German: vergleichbar

Example 3.39. (Z; ≤) and (Z; ≥) are totally ordered posets (or simply totally ordered sets), and so is any subset of Z with respect to ≤ or ≥. For instance, ({2, 5, 7, 10}; ≤) is a totally ordered set.

Example 3.40. The poset (P(A); ⊆) is not totally ordered if |A| ≥ 2, nor is the poset (N; |).

3.5.2 Hasse Diagrams

Definition 3.26. In a poset (A; ⪯) an element b is said to cover28 an element a if a ≺ b and there exists no c with a ≺ c and c ≺ b (i.e., between a and b).

Example 3.41. In a hierarchy (say of a company), if a ≺ b means that b is superior to a, then b covers a means that b is the direct superior of a.

Definition 3.27. The Hasse diagram of a (finite) poset (A; ⪯) is the directed graph whose vertices are labeled with the elements of A and where there is an edge from a to b if and only if b covers a.

The Hasse diagram is a graph with directed edges. It is usually drawn such that whenever a ≺ b, then b is placed higher than a. This means that all arrows are directed upwards and therefore can be omitted.

Example 3.42. The Hasse diagram of the poset ({2, 3, 4, 5, 6, 7, 8, 9}; |) is shown in Figure 3.1 on the left.

Example 3.43. A nicer diagram is obtained when A is the set of all divisors of an integer n. The Hasse diagram of the poset ({1, 2, 3, 4, 6, 8, 12, 24}; |) is shown in Figure 3.1 in the middle.

Example 3.44. The Hasse diagram of the poset (P({a, b, c}); ⊆) is shown in Figure 3.1 on the right.

Example 3.45. For the two Hasse diagrams in Figure 3.2, give a realization as the divisibility poset of a set of integers.

Example 3.46. Consider the covering29 relation on the convex polygons discussed in Example 3.30. A polygon a is covered by a polygon b if b can be placed on top of a such that a disappears completely. Are there sets of six polygons resulting in a poset with the left (or right) Hasse diagram in Figure 3.2?

28 German: überdecken
29 The term “cover” is used here in a physical sense, not in the sense of Definition 3.26.
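As a small illustration of Definitions 3.26 and 3.27, the following Python sketch (helper names are ours) computes the edges of the Hasse diagram of a finite poset directly from the order relation, e.g. for the divisibility posets of Examples 3.42 and 3.43.

```python
def hasse_edges(elements, leq):
    """Edges (a, b) of the Hasse diagram: b covers a, i.e. a is strictly below b
    and no element c lies strictly between them (Definition 3.26)."""
    lt = lambda x, y: x != y and leq(x, y)
    return [(a, b)
            for a in elements for b in elements
            if lt(a, b) and not any(lt(a, c) and lt(c, b) for c in elements)]

divides = lambda x, y: y % x == 0
print(hasse_edges([2, 3, 4, 5, 6, 7, 8, 9], divides))
# [(2, 4), (2, 6), (3, 6), (3, 9), (4, 8)]
print(hasse_edges([1, 2, 3, 4, 6, 8, 12, 24], divides))
# [(1, 2), (1, 3), (2, 4), (2, 6), (3, 6), (4, 8), (4, 12), (6, 12), (8, 24), (12, 24)]
```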
Figure 3.1: The Hasse diagrams of the posets ({2, 3, 4, 5, 6, 7, 8, 9}; |), ({1, 2, 3, 4, 6, 8, 12, 24}; |), and (P({a, b, c}); ⊆).

Figure 3.2: Two Hasse diagrams.

30 Recall that for a partial order ⪯ we can define the relation ≺ as a ≺ b def⇐⇒ a ⪯ b ∧ a ≠ b.

The relation ≤lex is the well-known lexicographic order of pairs, usually considered when both posets are identical. The lexicographic order ≤lex is useful because if both (A; ⪯) and (B; ⊑) are totally ordered (e.g. the alphabetical order of the letters), then so is the lexicographic order on A × B (prove this!).

The lexicographic order can easily be generalized to the k-tuples over some alphabet Σ (denoted Σ^k) and more generally to the set Σ∗ of finite-length strings over Σ. The fact that a total order on the alphabet results in a total order on Σ∗ is well-known: The telephone book has a total order on all entries.

3.5.4 Special Elements in Posets

We define a few types of special elements that a poset can have.

Definition 3.29. Let (A; ⪯) be a poset, and let S ⊆ A be some subset of A. Then
1. a ∈ A is a minimal (maximal) element of A if there exists no b ∈ A with b ≺ a (b ≻ a).31
2. a ∈ A is the least (greatest) element of A if a ⪯ b (a ⪰ b) for all b ∈ A.32
3. a ∈ A is a lower (upper) bound33 of S if a ⪯ b (a ⪰ b) for all b ∈ S.34
4. a ∈ A is the greatest lower bound (least upper bound) of S if a is the greatest (least) element of the set of all lower (upper) bounds of S.35

… bound can be outside of the considered subset S (and therefore need not be unique).
35 Note that for a poset (A; ⪯) and a subset S ⊆ A, restricting ⪯ to S results in a poset (S; ⪯).
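A brief sketch of the lexicographic order on pairs discussed above, assuming two comparison functions leq_A and leq_B are given (the names are ours): (a1, b1) ≤lex (a2, b2) holds iff a1 ≺ a2, or a1 = a2 and b1 ⊑ b2.

```python
def lex_leq(leq_A, leq_B):
    """Lexicographic order on A x B built from an order on A and an order on B."""
    def leq(p, q):
        (a1, b1), (a2, b2) = p, q
        if a1 != a2:
            return leq_A(a1, a2)   # first components differ: they decide
        return leq_B(b1, b2)       # first components equal: compare the second ones
    return leq

leq_int = lambda x, y: x <= y
pair_leq = lex_leq(leq_int, leq_int)
print(pair_leq((1, 9), (2, 0)))   # True:  1 < 2 decides
print(pair_leq((2, 3), (2, 1)))   # False: equal first components, 3 > 1
```

If both input orders are total, the resulting order on pairs is again total, as claimed in the text.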
poset is special in that any set of elements has a greatest lower bound and a least having introduced relations, because functions are a special type of relation, and
upper bound. How can glb(S) and lub(S) be defined? several concepts defined for relations (e.g. inversion and composition) apply to
functions as well.
Example 3.49. The poset (P({a, b, c}); ⊆) shown in Figure 3.1 has both a least
element, namely ∅, and a greatest element, namely {a, b, c}. Definition 3.33. A function f : A → B from a domain39 A to a codomain40 B is
+
Example 3.50. In the poset (Z ; |), 1 is a least element but there is no greatest a relation from A to B with the special properties (using the relation notation
element. a f b):41
1. ∀a ∈ A ∃b ∈ B af b (f is totally defined),
Definition 3.30. A poset (A; ) is well-ordered36 if it is totally ordered and if
′ ′ ′
every non-empty subset of A has a least element.37 2. ∀a ∈ A ∀b, b ∈ B af b ∧ af b → b=b (f is well-defined).
Note that every totally ordered finite poset is well-ordered. The property of As the reader certainly knows, a function f can be understood as a mapping
being well-ordered is of interest only for infinite posets. The natural numbers N from A to B, assigning to every a ∈ A a unique element in B, usually denoted
are well-ordered by ≤. Any subset of the natural numbers is also well-ordered. as f (a). One writes f : A → B to indicate the domain and codomain of f , and
More generally, any subset of a well-ordered set is well-ordered (by the same
order relation). f : a 7→ “expression in a”
One can generalize the function concept by dropping the first condition (to-
Definition 3.31. Let (A; ) be a poset. If a and b (i.e., the set {a, b} ⊆ A) have a tally defined), i.e., allowing that there can exist elements a ∈ A for which f (a) is
greatest lower bound, then it is called the meet of a and b, often denoted a ∧ b. not defined.
If a and b have a least upper bound, then it is called the join of a and b, often
denoted a ∨ b. Definition 3.35. A partial function A → B is a relation from A to B such that
condition 2. above holds.
Definition 3.32. A poset (A; ) in which every pair of elements has a meet and
a join is called a lattice38 . Two (partial) functions with common domain A and codomain B are equal
if they are equal as relations (i.e., as sets). f = g is equivalent to saying that the
Example 3.51. The posets (N; ≤), (N \ {0}; | ), and (P(S); ⊆) are lattices, as the function values of f and g agree for all arguments (including, in case of partial
reader can verify. functions, whether or not it is defined).
Example 3.52. The poset ({1, 2, 3, 4, 6, 8, 12, 24}; |) shown in Figure 3.1 is a lat- Definition 3.36. For a function f : A → B and a subset S of A, the image43 of S
tice. The meet of two elements is their greatest common divisor, and their join under f , denoted f (S), is the set
is the least common multiple. For example, 6 ∧ 8 = 2, 6 ∨ 8 = 24, 3 ∧ 4 = 1, and def
3 ∨ 4 = 12. In contrast, the poset ({2, 3, 4, 5, 6, 7, 8, 9}; |) is not a lattice. f (S) = {f (a) | a ∈ S}.
Definition 3.37. The subset f (A) of B is called the image (or range) of f and is
3.6 Functions also denoted Im(f ).
The concept of a function is perhaps the second most fundamental concept in 39 German: Definitionsbereich
40 German: Bildbereich, Wertebereich
mathematics (after the concept of a set). We discuss functions only now, after 41 Here we use the convenient notation ∀a ∈ A and ∃b ∈ B.
36 German: wohlgeordnet 42 This notation is motivated by the fact that if A and B are finite, then there are |B||A| such
37 The least element is defined naturally (see Definition 3.29). functions.
38 German: Verband 43 German: Bild
Example 3.53. Consider the function f : R → R defined by f (x) = x2 . The 3.7 Countable and Uncountable Sets
image of the interval [2, 3] is the interval [4, 9]. The range of f is the set R≥0 of
non-negative real numbers. 3.7.1 Countability of Sets
Countability is an important concept in Computer Science. A set that is count-
Definition 3.38. For a subset T of B, the preimage44 of T , denoted f −1 (T ), is the
able can be enumerated by a program (even though this would take unbounded
set of values in A that map into T :
time), while an uncountable set can, in principle, not be enumerated.
def
f −1 (T ) = {a ∈ A| f (a) ∈ T }. Definition 3.42.
(i) Two sets A and B equinumerous47 , denoted A ∼ B, if there exists a bijection
Example 3.54. Consider again the function f (x) = x2 . The preimage of the A → B.
interval [4, 9] is [−3, −2] ∪ [2, 3].
(ii) The set B dominates the set A, denoted A B, if A ∼ C for some subset
C ⊆ B or, equivalently, if there exists an injective function A → B.
Definition 3.39. A function f : A → B is called
(iii) A set A is called countable48 if A N, and uncountable49 otherwise.50
1. injective (or one-to-one or an injection) if for a 6= b we have f (a) 6= f (b), i.e.,
no two distinct values are mapped to the same function value (there are Example 3.56. The set Z = {. . . , −2, −1, 0, 1, 2, . . .} of integers is countable, and
no “collisions”). Z ∼ N. A bijection f : N → Z is given by f (n) = (−1)n ⌈n/2⌉.
2. surjective (or onto) if f (A) = B, i.e., if for every b ∈ B, b = f (a) for some
a ∈ A (every value in the codomain is taken on for some argument). Lemma 3.15.
3. bijective (or a bijection) if it is both injective and surjective. (i) The relation is transitive: A B ∧ B C =⇒ A C.
(ii) A ⊆ B =⇒ A B.
Definition 3.40. For a bijective function f , the inverse (as a relation, see Defini-
tion 3.11) is called the inverse function45 of f , usually denoted as f −1 . Proof. Proof of (i). If there is an injection from A to B and also an injection from
B to C, then their composition is an injection from A to C. (We omit the proof
Definition 3.41. The composition of a function f : A → B and a function g : B → of this statement.)
C, denoted by g ◦f or simply gf , is defined by (g ◦f )(a) = g(f (a)).46 Proof of (ii). If A ⊆ B, then the identity function on A is an injection from A
to B.
Example 3.55. Consider again the function f (x) = x3 + 3 and g(x) = 2x2 + x.
Then g ◦ f (x) = 2(f (x))2 + f (x) = 2x6 + 13x3 + 21. A non-trivial theorem, called the Bernstein-Schröder theorem, is stated with-
out proof.51 It is not needed in this course.
Lemma 3.14. Function composition is associative, i.e., (h ◦ g) ◦ f = h ◦ (g ◦ f ). Theorem 3.16. A B ∧ B A =⇒ A ∼ B.
Proof. This is a direct consequence of the fact that relation composition is asso- 3.7.2 Between Finite and Countably Infinite
ciative (see Lemma 3.7).
For finite sets A and B, we have A ∼ B if and only if |A| = |B|. A finite set has
44 German: Urbild never the same cardinality as one of its proper subsets. Somewhat surprisingly,
45 It is easy to see that this is a function
46 Note that the composition of functions is the same as the composition of relations. However, 47 German: gleich mächtig
48 German: abzählbar
unfortunately, different notation is used: The composition of relations f and g is denoted f ◦g while,
49 German: überabzählbar
if considered as functions, the same resulting composition is denoted as g ◦ f . (The reason is that
50 Recall that N = {0, 1, 2, 3, . . .}.
one thinks of functions as mapping “from right to left”.) Because of this ambiguity one must make
explicit whether the symbol ◦ refers to function or relation composition. 51 An elegant proof of this theorem is given in Proofs from THE BOOK by M. Aigner and G. Ziegler.
for infinite sets this is possible.

Example 3.57. Let O = {1, 3, 5, . . .} be the set of odd natural numbers. Of course, O is countable since the identity function is a (trivial) injection from O to N. Actually, there is even a bijection f : N → O, namely f(n) = 2n + 1. Indeed, Theorem 3.17 below states a more general fact.

Theorem 3.17. A set A is countable if and only if it is finite or if A ∼ N.

The theorem can be restated as follows: There is no cardinality level between finite and countably infinite.

Proof. A statement of the form “if and only if” has two directions. To prove the direction ⇐=, note that if A is finite, then it is countable, and also if A ∼ N, then A is countable.
To prove the other direction (=⇒), we prove that if A is countable and infinite, then A ∼ N. According to the definition, A ⪯ N means that there is a bijection f : A → C for a set C ⊆ N. For any infinite subset of N, say C, one can define a bijection g : C → N as follows. According to the well-ordering principle, there exists a least element of C, say c0. Define g(c0) = 0. Define C1 = C \ {c0}. Again, according to the well-ordering principle, there exists a least element of C1, say c1. Define g(c1) = 1. This process can be continued, defining inductively a bijection g : C → N. Now g ◦ f is a bijection A → N, which proves A ∼ N.

3.7.3 Important Countable Sets

Theorem 3.18. The set {0, 1}∗ def= {ε, 0, 1, 00, 01, 10, 11, 000, 001, . . .} of finite binary sequences is countable.52

Proof. We could give an enumeration of the set {0, 1}∗, i.e., a bijection between {0, 1}∗ and N, but to prove the theorem it suffices to provide an injection {0, 1}∗ → N, which we define as follows. We put a “1” at the beginning of the string and then interpret it as a natural number using the usual binary representation of the natural numbers. For example, the string 0010 is mapped to the number 18.53

Theorem 3.19. The set N × N (= N^2) of ordered pairs of natural numbers is countable.

52 Here ε denotes the empty string.
53 Note that without prepending a 1, different strings (e.g. 0010 and 00010) would result in the same integer and hence the mapping would not be an injection.

Proof. A possible bijection f : N → N^2 is given by f(n) = (k, m), where k and m are determined using the following equations: k + m = t − 1 and m = n − \binom{t}{2}, where t > 0 is the smallest integer such that \binom{t+1}{2} > n. This corresponds to the enumeration of the pairs (k, m) along diagonals with constant sum k + m. More concretely, we enumerate the pairs as follows: (0, 0), (1, 0), (0, 1), (2, 0), (1, 1), (0, 2), (3, 0), (2, 1), (1, 2), (0, 3), (4, 0), (3, 1), · · ·. It is easy to see that this is a bijection N → N^2, hence N^2 is countable.
An alternative proof works as follows. We have N ∼ {0, 1}∗ and hence N × N ∼ {0, 1}∗ × {0, 1}∗. Hence it suffices to show a bijection {0, 1}∗ × {0, 1}∗ → {0, 1}∗. The concatenation of the two sequences is not a bijection because a bit-string can be split in many ways into two strings. One possibility to map a pair (a, b) of bit-strings to a single bit-string is as follows:

(a, b) ↦ 0^|a| || 1 || a || b,

where we first encode the length |a| of a and then append a and b.

Corollary 3.20. The Cartesian product A × B of two countable sets A and B is countable, i.e., A ⪯ N ∧ B ⪯ N =⇒ A × B ⪯ N.

Proof. We first prove A × B ⪯ N × N by exhibiting an injection g : A × B → N × N, namely g(a, b) = (f1(a), f2(b)). That g is an injection can be proved as follows:

(a, b) ≠ (a′, b′) =⇒ a ≠ a′ ∨ b ≠ b′   (definition of pairs)
=⇒ f1(a) ≠ f1(a′) ∨ f2(b) ≠ f2(b′)   (f1 and f2 are injections)
=⇒ (f1(a), f2(b)) ≠ (f1(a′), f2(b′))   (definition of pairs).

Using A × B ⪯ N × N (just proved) and N × N ⪯ N (Theorem 3.19) now gives A × B ⪯ N because ⪯ is transitive (Lemma 3.15(i)).

Corollary 3.21. The rational numbers Q are countable.

Proof. Every rational number can be represented uniquely as a pair (m, n) where m ∈ Z, n ∈ N \ {0}, and where m and n are relatively prime. Hence Q ⪯ Z × N. According to Example 3.56, Z is countable, i.e., Z ⪯ N. Thus, according to Corollary 3.20, Z × N ⪯ N. Hence, using transitivity of ⪯, we have Q ⪯ N (i.e., Q is countable).

The next theorem provides some other important sets that are countable.

Theorem 3.22. Let A and Ai for i ∈ N be countable sets.
(i) For any n ∈ N, the set A^n of n-tuples over A is countable.
(ii) The union ∪i∈N Ai of a countable list A0, A1, A2, . . . of countable sets is countable.
(iii) The set A∗ of finite sequences of elements from A is countable.
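The diagonal enumeration used in the proof of Theorem 3.19, and the bijection N → Z of Example 3.56, can be made concrete in a few lines of Python (a sketch; the function names are ours):

```python
from math import comb

def n_to_z(n):
    """Example 3.56: a bijection N -> Z, f(n) = (-1)^n * ceil(n/2)."""
    return (-1) ** n * ((n + 1) // 2)

def unpair(n):
    """Proof of Theorem 3.19: the n-th pair (k, m) of the enumeration along diagonals,
    where t > 0 is the smallest integer with C(t+1, 2) > n, m = n - C(t, 2), k = (t-1) - m."""
    t = 1
    while comb(t + 1, 2) <= n:
        t += 1
    m = n - comb(t, 2)
    return ((t - 1) - m, m)

def pair(k, m):
    """The inverse direction: the position of (k, m) in the enumeration."""
    return comb(k + m + 1, 2) + m

print([n_to_z(n) for n in range(7)])       # [0, -1, 1, -2, 2, -3, 3]
print([unpair(n) for n in range(6)])       # [(0, 0), (1, 0), (0, 1), (2, 0), (1, 1), (0, 2)]
print(all(pair(*unpair(n)) == n for n in range(1000)))   # True
```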
Proof. Statement (i) can be proved by induction. The (trivial) induction basis is Theorem 3.23. The set {0, 1}∞ is uncountable.
that A1 = A is countable. The induction step shows that if An is countable, then
also An+1 ∼ An × A is countable. This follows from Corollary 3.20 because both
An and A are countable. Proof. This is a proof by contradiction. To arrive at a contradiction, assume that
We omit the proof of (ii). a bijection f : N → {0, 1}∞ exists.55 Let βi,j be the jth bit in the i-th sequence
We now prove (iii), which implies (i), and hence gives an alternative proof f (i), where for convenience we begin numbering the bits with j = 0:
for (i). We define an injection A∗ → {0, 1}∗. This is achieved by using an ar-
def
bitrary injection f : A → {0, 1}∗ and defining the natural injection g : A∗ → f (i) = βi,0 , βi,1 , βi,2 , βi,3 , . . . .
({0, 1}∗)∗ as follows: For a sequence of length n in A∗ , say (a1 , . . . , an ), we let
Let b be the complement of a bit b ∈ {0, 1}. We define a new semi-infinite binary
g(a1 , . . . , an ) = f (a1 ), . . . , f (an ) , sequence α as follows:
i.e., each element in the sequence is mapped separately using f . Now it only def
α = β0,0 , β1,1 , β2,2 , β3,3 , . . . .
remains to demonstrate an injection ({0, 1}∗)∗ → {0, 1}∗, which can be achieved
as follows.54 We replace every 0-bit in a sequence by 00 and every 1-bit by 01, Obviously, α ∈ {0, 1}∞, but there is no n ∈ N such that α = f (n) since α
which defines a (length-doubling) injection {0, 1}∗ → {0, 1}∗. Then we con- is constructed so as to disagree in at least one bit (actually the nth bit) with
catenate all obtained expanded sequences, always separated by 11. This is an every sequence f (n) for n ∈ N. This shows that f cannot be a bijection, which
injection because the separator symbols 11 can be detected and removed and concludes the proof.
the extra 0’s can also be removed. Hence a given sequence can be uniquely de-
composed into the component sequences, and hence no two sequences of binary
This proof technique is known as Cantor’s diagonalization argument; it has
(component) sequences can result in the same concatenated sequence.
also other applications.
Example 3.58. We illustrate the above injection ({0, 1}∗)∗ → {0, 1}∗ by an ex- By interpreting the elements of {0, 1}∞ as the binary expansion of a real
ample. Consider the sequence (0100, 10111, 01, 1) of bit-sequences. Now 0100 number in the interval [0, 1], and vice versa, one can show that the interval [0, 1]
is mapped to 00010000, 10111 is mapped to 0100010101, etc. and the final con- (and hence R itself), is uncountable.56
catenated sequence is 000100001101000101011100011101, which can uniquely be
decomposed into the original four sequences.
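The injection ({0, 1}∗)∗ → {0, 1}∗ used in the proof (replace 0 by 00 and 1 by 01 in every component, then join with the separator 11) can be sketched as follows, together with the corresponding decoding; the function names are ours:

```python
def encode(seqs):
    """Injection from tuples of bit-strings into bit-strings: expand 0 -> 00 and 1 -> 01
    in every component, then join the expanded components with the separator 11."""
    expand = lambda s: "".join("0" + b for b in s)
    return "11".join(expand(s) for s in seqs)

def decode(code):
    """Read the string two characters at a time: 11 is a separator, otherwise the
    second character of the pair is the original bit."""
    parts, current = [], []
    for i in range(0, len(code), 2):
        chunk = code[i:i + 2]
        if chunk == "11":
            parts.append("".join(current))
            current = []
        else:
            current.append(chunk[1])
    parts.append("".join(current))
    return tuple(parts)

seqs = ("0100", "10111", "01", "1")
c = encode(seqs)
print(c)                  # 000100001101000101011100011101, as in Example 3.58
print(decode(c) == seqs)  # True
```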
3.7.5 Existence of Uncomputable Functions
The above theorem states that there are uncountably many functions N → {0, 1}.
3.7.4 Uncountability of {0, 1}∞ On the other hand, every computer program, regardless of the programming
language it is written in, corresponds to a finite string of symbols. Without loss
We now consider semi-infinite binary sequences (s0 , s1 , s2 , s3 , . . .). One can in-
of generality, one can think of a program as a finite binary sequence p ∈ {0, 1}∗).
terpret such a binary sequence as specifying a subset of N: If si = 1, then
Hence the set of programs is countable, whereas the set of functions N → {0, 1}
i is in the set, otherwise it is not. Equivalently, we can understand a semi-
is uncountable. If every program computes at most one function, there must be
infinite sequence (s0 , s1 , s2 , s3 , . . .) as a function N → {0, 1}, i.e., as a predi-
functions N → {0, 1} not computed by a program. This for Computer Science
cate on N. For example, the primality predicate prime : N → {0, 1} (where
fundamental consequence of Theorem 3.23 is stated below.
prime(n) = 1 if and only if n is prime) corresponds to the semi-infinite se-
quence 001101010001010001010001000001010000001 . . ..
Definition 3.44. A function f : N → {0, 1} is called computable if there is a
program that, for every n ∈ N, when given n as input, outputs f (n).
Definition 3.43. Let {0, 1}∞ denote the set of semi-infinite binary sequences or,
equivalently, the set of functions N → {0, 1}. 55 Here we make use of Theorem 3.17 which implies that {0, 1}∞ is countable if and only if such
a bijection exists.
54 Note that a simple concatenation of the sequences does not work because the concatenated 56 A subtlety, which is not a problem in the proof, is that some real numbers have two representa-
sequences can not uniquely be decomposed into the original sequences, i.e., this is not an injection. tions as bit-strings. For example, the number 0.5 has representations 10000000 · · · and 0111111 · · ·.
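The diagonal construction in the proof of Theorem 3.23 can be illustrated on finitely many sequences (a toy sketch; in the proof the list and the sequences are of course infinite):

```python
def diagonal_complement(rows):
    """Given finitely many 0/1 sequences (row i standing for f(i)), return a sequence
    that differs from row i in position i -- the construction from the proof of Theorem 3.23."""
    return [1 - rows[i][i] for i in range(len(rows))]

rows = [
    [0, 0, 1, 1],
    [1, 1, 1, 1],
    [0, 1, 0, 1],
    [1, 0, 0, 0],
]
alpha = diagonal_complement(rows)
print(alpha)                                        # [1, 0, 1, 1]
print([alpha[i] != rows[i][i] for i in range(4)])   # [True, True, True, True]
```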
In fact, essentially all such functions are uncomputable. Those that are com-
putable are rare exceptions. For example, the function prime : N → {0, 1} is
computable.
Is there a specific uncomputable function? A prominent example is the so-
called Halting problem defined as follows: Given as input a program (encoded Chapter 4
as a bit-string or natural number) together with an input (to the program), de-
termine whether the program will eventually stop (function value 1) or loop
forever (function value 0) on that input. This function is uncomputable. This is
usually stated as: The Halting problem is undecidable.
This theorem can also be proved by a diagonalization argument similar to
Number Theory
the one above. The theorem has far-reaching consequences in theoretical and
practical Computer Science. It implies, for example, that it is impossible to write
a program that can verify (in the most general case) whether a given program
satisfies its specification, or whether a given program contains malicious parts. 4.1 Introduction
Number theory is one of the most intriguing branches of mathematics. For a
long time, number theory was considered one of the purest areas of mathemat-
ics, in the sense of being most remote from having applications. However, since
the 1970’s number theory has turned out to have intriguing and unexpected
applications, primarily in cryptography.
In this course we discuss only some basic number-theoretic topics that have
applications in Computer Science. In addition to the rich applications, a sec-
ond reason for discussing the basics of number theory in a course in Computer
Science is as a preparation for the chapter on algebra (Chapter 5).
which also includes topics like, for instance, algebraic extensions of the rational numbers.
integers a, b, and c such that a2 + b2 = c2 . The answer is yes. Examples are 4.2 Divisors and Division
32 + 42 = 52 and 122 + 52 = 132 . A straight-forward generalization of this
question is whether there exist positive integers a, b, c, and n ≥ 3 such that an + 4.2.1 Divisors
bn = cn . The answer (no such integers exist) is known as Fermat’s last theorem,
which remained one of the most famous open conjectures until Andrew Wiles Definition 4.1. For integers a and b we say that a divides b, denoted a | b, if there
settled the question some years ago, using highly sophisticated mathematics. exists an integer c such that b = ac. In this case, a is called a divisor2 of b, and b
is called a multiple3 of a. If a 6= 0 and a divisor c exists it is called the4 quotient
when b is divided by a, and we write c = ab or c = b/a. We write a6 | b if a does
Example 4.3. The recent proof of the Catalan conjecture by Preda Mihailescu, not divide b.
who worked at ETH Zürich, is another break-through in number theory. This
Note that every non-zero integer is a divisor of 0. Moreover, 1 and −1 are
theorem states that the equation am − bn = 1 has no other integer solutions but
divisors of every integer.
32 − 23 = 1 (for m, n ≥ 2).
Theorem 4.1 (Euclid). For all integers a and d 6= 0 there exist unique integers q and
In this course we are trying to present a rigorous mathematical treatment of the r satisfying
material. Consequently, in order to present number theory, it appears that we a = dq + r and 0 ≤ r < |d|.
would first have to define the integers, so we know what we are talking about,
in contrast to the intuitive understanding of numbers acquired since the early Here a is called the dividend, d is called the divisor, q is called the quotient, and
years at school. However, such a formal, axiomatic treatment of the integers is r is called the remainder. The remainder r is often denoted as Rd (a) or sometimes
beyond the scope of the course. as a mod d.
In this chapter we take the usual approach where we assume that we know
what numbers and operations on numbers are and that we also know the basic Proof. We carry out this proof in a detailed manner, also to serve as an example
facts about numbers (e.g. the commutative, associative and distributive laws, of a systematic proof.
etc.) which we can use to prove statements. But we should point out that in such We define S to be the set of possible nonnegative remainders:
an informal approach it is difficult (if not impossible) to draw the dividing line
def
between facts that are well-known and facts that require a proof. For example, S = {s| s ≥ 0 and a = dt + s for some t ∈ Z}.
why is there no integer between 0 and 1, why is −0 = 0, and why is a2 ≥ 0
for all a ∈ Z? What is the complete list of facts we consider known, and which We prove the following three claims by first proving 1), then proving that 1)
facts require a proof? The answer is not clear unless one states a list of axioms. implies 2), and then proving that 2) implies 3).
For example, we will show an interesting proof of the fact that every number
can be factored uniquely into primes. This is definitely a theorem that requires 1) S is not empty.
a proof, even though, after many years of mathematical education, the reader 2) S contains an r < |d|.
may consider it a well-known basic fact. 3) The r of claim 2) is unique.
The integers are a special case of a mathematical structure called a ring, 2 German: Teiler
which will be discussed in Chapter 5. In this chapter we mention in a few places 3 German: Vielfaches
that concepts like divisors, greatest common divisors, ideals, etc. can be defined 4 One can prove that it is unique.
for any ring, not just for the integers. 5 German: Rest
Proof of 1): We use case distinction and prove the statement for three cases (one of which is always satisfied):
Case 1: a ≥ 0. Then a = d · 0 + a and hence a ∈ S.
Case 2: a < 0 and d > 0. Then a = da + (1 − d)a and thus (1 − d)a ∈ S since (1 − d)a ≥ 0 because both (1 − d) and a are ≤ 0.
Case 3: a < 0 and d < 0. Then a = d(−a) + (1 + d)a and thus (1 + d)a ∈ S since (1 + d)a ≥ 0 because both (1 + d) and a are ≤ 0.

Proof that 1) implies 2): Because S is not empty, it has a smallest element (due to the well-ordering principle), which we denote by r. We now prove that r < |d|, by contradiction, i.e., assuming r ≥ |d|. By definition of S we have a = dq + r for some q. We make a case distinction: d > 0 and d < 0. If d > 0, then

a = d(q + 1) + (r − |d|),

hence r − |d| ≥ 0 and therefore r − |d| ∈ S, which means that r is not the smallest element of S, a contradiction. If d < 0, then a = d(q − 1) + (r − |d|), and the same argument as above shows that r − |d| ∈ S, a contradiction.

Proof that 2) implies 3): It remains to prove that r is unique. We give a proof only for d > 0; the case d < 0 is analogous and is left as an exercise. The proof is by contradiction. Suppose that there also exist r′ ≠ r with 0 ≤ r′ < |d| and such that a = dq′ + r′ for some q′. We distinguish the three cases q′ = q, q′ < q, and q′ > q. If q′ = q, then r′ = a − dq′ = a − dq = r, a contradiction since we assumed r′ ≠ r. If q′ < q, then q − q′ ≥ 1, so

r′ = a − dq′ = (a − dq) + d(q − q′) ≥ r + d.

Since r′ ≥ r + d ≥ d, the condition 0 ≤ r′ < |d| is violated, which is a contradiction. A symmetric argument shows that q′ > q also results in a contradiction.

4.2.3 Greatest Common Divisors

Definition 4.2. For integers a and b (not both 0), an integer d is called a greatest common divisor6 of a and b if d divides both a and b and if every common divisor of a and b divides d, i.e., if

d | a ∧ d | b ∧ ∀c ((c | a ∧ c | b) → c | d).

The concept of a greatest common divisor applies not only to Z, but to more general structures (e.g. polynomial rings). If d and d′ are both greatest common divisors of a and b, then d | d′ and d′ | d. For the integers Z, this means that d′ = ±d, i.e., there are two greatest common divisors. (But for more general structures there can be more than two greatest common divisors.)

6 Note that the term “greatest” does not refer to the order relation ≤ but to the divisibility relation.

Definition 4.3. For a, b ∈ Z (not both 0) one denotes the unique positive greatest common divisor by gcd(a, b) and usually calls it the greatest common divisor. If gcd(a, b) = 1, then a and b are called relatively prime7.

Lemma 4.2. For any integers m, n and q, we have

gcd(m, n − qm) = gcd(m, n).

Proof. It is easy to prove (as an exercise) that every common divisor of m and n − qm (and therefore also the greatest) is also a common divisor of m and n, and vice versa.

This lemma implies in particular that

gcd(m, Rm(n)) = gcd(m, n),

which is the basis for Euclid’s well-known gcd-algorithm: Start with m < n and repeatedly replace the pair (m, n) by the pair (Rm(n), m) until the remainder is 0, at which point the last non-zero number is equal to gcd(m, n).

Definition 4.4. For a, b ∈ Z, the ideal generated by a and b8, denoted (a, b), is the set

(a, b) := {ua + vb | u, v ∈ Z}.

Similarly, the ideal generated by a single integer a is

(a) := {ua | u ∈ Z}.

The following lemma implies that every ideal in Z can be generated by a single integer.

Lemma 4.3. For a, b ∈ Z there exists d ∈ Z such that (a, b) = (d).

Proof. If a = b = 0, then d = 0. Assume now that at least one of the numbers is non-zero. Then (a, b) contains some positive numbers, so (by the well-ordering principle) let d be the smallest positive element in (a, b). Clearly (d) ⊆ (a, b) since every multiple of d is also in (a, b). It remains to prove (a, b) ⊆ (d). For any c ∈ (a, b) there exist q and r with 0 ≤ r < d such that c = qd + r. Since both c and d are in (a, b), so is r = c − qd. Since 0 ≤ r < d and d is (by assumption) the smallest positive element in (a, b), we must have r = 0. Thus c = qd ∈ (d).

Lemma 4.4. Let a, b ∈ Z (not both 0). If (a, b) = (d), then d is a greatest common divisor of a and b.

7 German: teilerfremd
8 German: durch a und b erzeugtes Ideal
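Euclid’s gcd-algorithm described above takes only a few lines of Python; as a sketch we also include the extended variant, which additionally computes u and v with ua + vb = gcd(a, b) and is used in the examples on the following pages (the function names are ours):

```python
def euclid_gcd(m, n):
    """Euclid's algorithm: repeatedly replace (m, n) by (R_m(n), m) until the remainder is 0."""
    while m != 0:
        m, n = n % m, m
    return n

def extended_gcd(a, b):
    """Returns (g, u, v) with g = gcd(a, b) = u*a + v*b."""
    if b == 0:
        return (a, 1, 0)
    g, u, v = extended_gcd(b, a % b)
    return (g, v, u - (a // b) * v)

print(euclid_gcd(26, 18))     # 2
print(extended_gcd(26, 18))   # (2, -2, 3):  2 = (-2)*26 + 3*18
print(extended_gcd(17, 13))   # (1, -3, 4):  1 = (-3)*17 + 4*13
```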
Proof. d is a common divisor of a and b since a ∈ (d) and b ∈ (d). To show that d This notion of having only trivial divisors extends to other rings, for example
is a greatest common divisor, i.e., that every common divisor c of a and b divides the ring of polynomials over R. In such a general context, the property is called
d, note that c divides every integer of the form ua + vb, in particular d. irreducible rather than prime. The term prime is in general used for the property
that if p divides a product of elements, then it divides at least one of them (see
The following corollary follows from Lemmas 4.3 and 4.4. Lemma 4.7 below). For the integers, these two concepts are equivalent. The next
Corollary 4.5. For a, b ∈ Z (not both 0), there exist u, v ∈ Z such that lemma states one direction of this equivalence.
The following theorem is called the fundamental theorem of arithmetic.
gcd(a, b) = ua + vb.
Theorem 4.6. Every positive integer can be written uniquely (up to the order in which
Example 4.4. For a = 26 and b = 18 we have
factors are listed) as the product of primes.11
gcd(26, 18) = 2 = (−2) · 26 + 3 · 18.
Also, for a = 17 and b = 13 we have 4.3.2 Proof of the Fundamental Theorem of Arithmetic *
gcd(17, 13) = 1 = (−3) · 17 + 4 · 13. Lemma 4.7. If p is a prime which divides the product x1 x2 · · · xn of some integers x1 , . . . , xn ,
then p divides one of them, i.e., p | xi for some i ∈ {1, . . . , n}.
An extension of Euclid’s well-known gcd-algorithm allows to efficiently
compute not only gcd(a, b), but also u and v such that gcd(a, b) = ua + vb. Proof. The proof is by induction on n. The claim is trivially true for n = 1 (induction
basis). Suppose it is true for some general n (induction hypothesis). To prove the claim
for n + 1 (induction step), suppose that p | x1 · · · xn+1 . We let y := x1 · · · xn (and hence
4.2.4 Least Common Multiples p | yxn+1 ) and look at the two cases p | y and p6 | y separately. If p | y, then p | xi for
some 1 ≤ i ≤ n, due to the induction hypothesis, and we are done. If p6 | y, then, since p
The least common multiple is a dual concept of the greatest common divisor. has no positive divisor except 1 and p, we have gcd(p, y) = 1. By Corollary 4.5 there are
integers u and v such that up + vy = 1. Hence we have
Definition 4.5. The least common multiple l of two positive integers a and b, de-
noted l = lcm(a, b), is the common multiple of a and b which divides every xn+1 = (up + vy)xn+1 = (uxn+1 )p + v(yxn+1 ).
common multiple of a and b, i.e.,
Because p divides both terms (uxn+1 )p and v(yxn+1 ) in the sum on the right side, it
a | l ∧ b | l ∧ ∀m (a | m ∧ b | m) → l | m . follows that it also divides the sum, i.e., p | xn+1 , which concludes the proof.
bi > 0). Then for every i, pi | n and thus pi | q1b1 q2b2 · · · qsbs . Hence, by Lemma 4.7, Example 4.6. Consider now a slightly twisted version of the Gaussian integers:
pi | qj for some j and, because qj is a prime and pi > 1, we have pi = qj . Similarly for √ √
every j, qj = pi for some i. Thus the set of primes are identical, i.e., r = s and pi = qi Z[ −5] = {a + b 5i | a, b ∈ Z}.
for 1 ≤ i ≤ r. To show that the corresponding exponents ai and bi are also identical,
suppose that ai < bi for some i. We can divide both expressions by pai i , which results Like the Gaussian integers, this set is closed with respect to addition and multiplication
in two numbers that are equal, yet one is divisible by pi while the other is not. This is (of complex numbers). For example,
impossible since if two numbers are equal, then they have the same divisors. √ √ √
(a + b 5i)(c + d 5i) = ac − 5bd + (bc + ad) 5i.
√
The only units in Z[ −5] are 1 and −1. One can check√easily, by ruling √ out all possible
4.3.3 Expressing gcd and lcm divisors with smaller norm, that the elements 2, 3, 1 + 5i, and 1 − 5i are irreducible.
The element 6 can be factored in two different ways into irreducible elements:
The fundamental theorem of arithmetic assures that integers a and b can be writ-
√ √
ten as Y Y f 6 = 2 · 3 = (1 + 5i)(1 − 5i).
a= pei i and b= pi i .
i i
This product can be understood in two different ways. Either it is over all 4.3.5 Irrationality of Roots *
primes, where all but finitely many of the ei are 0, or it is over a fixed agreed set As a consequence of the unique prime factorization we can prove:
of primes. Either view is correct. Now we have √
Y min(e ,f ) Theorem 4.8. n is irrational unless n is a square (n = c2 for some c ∈ Z).
i i
gcd(a, b) = pi
√
i Proof. Suppose n = a/b for two integers a and b. Then a2 = nb2 . If n is not a square,
it contains at least one prime factor p with an odd power. Since the number of prime
and Y max(ei ,fi ) factors p in any square is even, we have a contradiction: a2 contains an even number of
lcm(a, b) = pi .
factors p while nb2 contains an odd number of factors p. This is impossible according to
i
Theorem 4.6.
It is easy to see that
gcd(a, b) · lcm(a, b) = ab Note that this proof is simpler and more general than the proof given in Example 2.28
because for all i we have because there we have not made use of the unique prime factorization of the integers.
min(ei , fi ) + max(ei , fi ) = ei + fi .
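A small sketch of these formulas (gcd from the minimum and lcm from the maximum of the prime exponents), together with a check of gcd(a, b) · lcm(a, b) = ab; the helper names are ours and the factorization uses naive trial division:

```python
def prime_factorization(n):
    """Prime factorization as a dictionary {p: exponent} (trial division; fine for small n)."""
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def gcd_lcm_from_factorizations(a, b):
    fa, fb = prime_factorization(a), prime_factorization(b)
    g, l = 1, 1
    for p in set(fa) | set(fb):
        g *= p ** min(fa.get(p, 0), fb.get(p, 0))
        l *= p ** max(fa.get(p, 0), fb.get(p, 0))
    return g, l

a, b = 60, 126                 # 60 = 2^2 * 3 * 5,  126 = 2 * 3^2 * 7
g, l = gcd_lcm_from_factorizations(a, b)
print(g, l, g * l == a * b)    # 6 1260 True
```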
4.3.6 A Digression to Music Theory *
4.3.4 Non-triviality of Unique Factorization * An octave in music corresponds to the doubling of the frequency of a tone. Similarly, a
fifth12 corresponds to a ratio 3 : 2, a musical fourth13 corresponds to a ratio 4 : 3, and a
It is worth-while pointing out that this theorem is not self-evident, as it may appear to major and minor third14 correspond to the ratios 5 : 4 and 6 : 5, respectively.
the reader completely familiar with it. There are in fact examples of rings in which the No multiple of fifths (or fourths) yields a multiple of octaves, since otherwise we
unique factorization into irreducible elements does not hold. We give two examples, one would have ( 23 )n = 2m for some n and m, which is equivalent to 3n = 2m+n . This
with unique factorization into irreducible elements, and one without. implies that one cannot tune a piano so that all intervals are correct since on the piano
√
Example 4.5. Let i = −1 denote the complex imaginary unit. The Gaussian integers (considered to be extended to many octaves), one hits a higher octave after a certain
√ number (namely 12) of fifths. It can be viewed as a number-theoretic miracle that tuning
Z[i] = Z[ −1] = {a + bi | a, b ∈ Z}
a piano is nevertheless possible with only very small inaccuracies. If one divides the
are the complex numbers whose real and imaginary parts are both integers. Since the octave
√ into 12 equal (half-tone) intervals, a half-tone corresponds to a frequency ratio of
12
norm (as complex numbers) is multiplied when two elements of Z[i] are multiplied, the 2 ≈ 1.05946. Three, four, five, and seven half-tones yield frequency ratios of
units (actually four) are the elements with norm 1, namely 1, i, −1, and −i. Units are
21/4 = 1.1892 ≈ 6/5,
elements that divide every other element. An irreducible element p in Z[i] is an element
whose only divisors are units and associates of p (i.e., elements up where u is a unit). 12 German: Quinte
By a generalization of the arguments given for Z one can show that factorization into 13 German: Quarte
irreducible elements is unique in Z[i]. (This is not obvious.) 14 German: Terz
21/3 = 1.2599 ≈ 5/4, The following theorem proved by Hadamard and de la Vallée Poussin in the 19th
25/12 = 1.33482 ≈ 4/3, and century states that the density of primes ≤ x is approximately 1/ ln(x). This shows that
if one tries to find a large (say 1024 bit) prime, for instance for cryptographic purposes,
27/12 = 1.49828 ≈ 3/2,
then a randomly selected odd integer has reasonable chances of being prime. Much more
approximating the minor third, major third, fourth, and fifth astonishingly well. precise estimates for π(x) are known today.
One can view these relations also as integer approximations. For example, we have π(x) ln(x)
Theorem 4.11. lim = 1.
531′ 441 = 312 ≈ 219 = 524′ 288, which implies that ( 32 )12 ≈ 27 , i.e., 12 fifths are approxi- x→∞ x
mately seven octaves.
√ Two of the main open conjectures on prime numbers are the following:
A piano for which every half-tone has the same frequency ratio, namely 12 2, is called
15
a well-tempered piano. The reason why music is a pleasure, despite its theoretically Conjecture 4.1. There exist infinitely many twin primes, i.e., primes p for which also
unavoidable inaccuracy, is that our ear is trained to “round tones up or down” as needed. p + 2 is prime.
Conjecture 4.2 (Goldbach). Every even number greater than 2 is the sum of two primes.
4.4 Some Basic Facts About Primes * 4.4.2 Remarks on Primality Testing
4.4.1 The Density of Primes How can we test whether a given integer n is a prime? We can test whether any smaller
prime is a divisor of n. The following lemma provides a well-known √ short-cut. In a
The following fact was known already to Euclid. practical implementation, one might not have a list of primes up to n and can instead
simply try all odd numbers as possible divisors.
Theorem 4.9. There are infinitely many primes. √
Lemma 4.12. Every composite integer n has a prime divisor ≤ n.
Proof. To arrive at a contradiction, suppose Q that the set of primes is finite, say P = Proof.√If n is composite, it has a divisor a with
{p1 , . . . , pm }. Then the number n = m p i + 1 is not divisible by any of the primes √ √ 1<√ a < n. Hence n = ab for b > 1. Either
i=1 a ≤ n√ or b ≤ n since otherwise ab > n · n = n. Hence n has a divisor √ c with
p1 , . . . , pm and hence n is either itself a prime, or divisible by a prime not in {p1 , . . . , pm }. 1 < c ≤ n. Either c is prime or, by Theorem 4.6, has a prime divisor d < c ≤ n.
In either case, this contradicts the assumption that p1 , . . . , pm are the only primes.
For large integers, trial-division up to the square root is hopelessly inefficient. Let us
This is a non-constructive existence proof. We now give a constructive existence briefly discuss the algorithmic problem of testing primality.
proof for another number-theoretic fact. Primes are of great importance in cryptography. In many applications, one needs to
generate very large primes, with 1024 or even 2048 bits. In many cases, the primes must
Theorem 4.10. Gaps between primes can be arbitrarily large, i.e., for every k ∈ N there exists remain secret and it must be infeasible to guess them. They should therefore be selected
n ∈ N such that the set {n, n + 1, · · · , n + k − 1} contains no prime. uniformly at random from the set of primes in a certain interval, possibly satisfying some
further restrictions for security or operational reasons.
Proof. Let n = (k + 1)! + 2. Then for any l with 2 ≤ l ≤ k + 1, l divides (k + 1)! = n − 2 Such primes can be generated in three different ways. The first approach is to select
and hence l also divides (k + 1)! + l = n − 2 + l, ruling out n, n + 1, . . . , n + k − 1 as a random (odd) integer from the given interval (e.g. [101023 , 101024 − 1]) and to apply
possible primes. a general primality test. Primality testing is a very active research area. The record
in general primality testing is around 8.000 decimal digits and required sophisticated
Example 4.7. The largest gap between two primes below 100 is 8. Which are these techniques and very massive computing power. As a celebrated theoretical breakthrough
primes? (with probably little practical relevance), it was proved in 2002 that primality testing is
in P, i.e., there is a worst-case polynomial-time algorithm for deciding primality of an
There exists a huge body of literature on the density and distribution of primes. We integer.16
only state the most important one of them. The second approach is like the first, but instead of a primality test one performs a
probabilistic compositeness test. Such a test has two outcomes, “composite” and “pos-
Definition 4.7. The prime counting function π : R → N is defined as follows: For any real
sibly prime”. In the first case, one is certain that the number is composite, while in the
x, π(x) is the number of primes ≤ x.
16 M. Agrawal, N. Kayal, and N. Saxena, PRIMES is in P, Annals of Mathematics vol. 160, pp. 781–
15 German: wohltemperiert 793.
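The trial-division test behind Lemma 4.12 can be sketched as follows (as remarked in the text, one simply tries 2 and all odd numbers up to √n instead of only primes; the function name is ours):

```python
from math import isqrt

def is_prime(n):
    """Trial division: by Lemma 4.12 it suffices to test divisors up to sqrt(n)."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    for d in range(3, isqrt(n) + 1, 2):
        if n % d == 0:
            return False
    return True

print(is_prime(91), is_prime(97))   # False (91 = 7 * 13), True
```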
other case one has good chances that the number is a prime, without being certain. More for all m, i.e., the relation ≡m is reflexive (a ≡m a for all a). It is easy to verify
precisely, one can fix a (very small) probability ǫ (e.g. ǫ = 10−100 ) and then perform that this relation is also symmetric and transitive, which was already stated in
a test such that for any composite integer, the probability that the test does not output Chapter 3:
“composite” is bounded by ǫ.
A third approach is to construct a prime together with a proof of primality. As we Lemma 4.13. For any m ≥ 1, ≡m is an equivalence relation on Z.
might see later, the primality of an integer n can be proved if one knows part of the
factorization of n − 1. The implication (4.1) can be turned around and can be used to prove the
inequality of two numbers a and b:
The following lemma shows that modular congruences are compatible with
4.5.1 Modular Congruences the arithmetic operations on Z.
We consider the following motivating example: Lemma 4.14. If a ≡m b and c ≡m d, then
Example 4.8. Fermat’s famous “last theorem”, proved recently by Wiles, states
a + c ≡m b + d and ac ≡m bd.
that the equation xn + y n = z n has no solution in positive integers x, y, z for
n ≥ 3. Here we consider a similar question. Does x3 + x2 = y 4 + y + 1 have a
Proof. We only prove the first statement and leave the other proof as an exercise.
solution in integers x and y?
We have m | (a − b) and m | (c − d). Hence m also divides (a − b) + (c − d) =
The answer is “no”, and the proof is surprisingly simple: Obviously, x must be (a + c) − (b + d), which is the definition of a + c ≡m b + d.
either even or odd. In both cases, x3 + x2 is even. On the other hand, y 4 + y + 1
is odd no matter whether y is even or odd. But an even number cannot be equal Corollary 4.15. Let f (x1 , . . . , xk ) be a multi-variate polynomial in k variables with
to an odd number. integer coefficients, and let m ≥ 1. If ai ≡m bi for 1 ≤ i ≤ k, then
Here is another example whose solution requires a generalization of the f (a1 , . . . , ak ) ≡m f (b1 , . . . , bk ).
above trick.
Proof. Evaluating a polynomial can be achieved by a sequence of additions and
Example 4.9. Prove that x3 − x = y 2 + 1 has no integer solutions. multiplications. In each such step the congruence modulo m is maintained,
according to Lemma 4.14.
Definition 4.8. For a, b, m ∈ Z with m ≥ 1, we say that a is congruent to b modulo
m if m divides a − b. We write a ≡ b (mod m) or simply a ≡m b, i.e., 4.5.2 Modular Arithmetic
def
a ≡m b ⇐⇒ m | (a − b). There are m equivalence classes of the equivalence relation ≡m , namely
[0], [1], . . . , [m − 1]. Each equivalence class [a] has a natural representative
Example 4.10. We have 23 ≡7 44 and 54321 ≡10 1. Note that a ≡2 b means that Rm (a) ∈ [a] in the set
a and b are either both even or both odd. Zm := {0, . . . , m − 1}
Example 4.11. If a ≡2 b and a ≡3 b, then a ≡6 b. The general principle underly- of remainders modulo m.17
ing this example will be discussed later. In the following we are often interested only in the remainder of an integer
(e.g. the result of a computation) modulo some modulus m. Addition and mul-
The above examples 4.8 and 4.9 make use of the fact that if an equality holds tiplication modulo m can be considered as operations on the set Zm . We will be
over the integers, then it must also hold modulo 2 or, more generally, modulo interested in this structure in Chapter 5 where we will see that it is an important
any modulus m. In other words, for any a and b, example of a so-called ring.
a = b =⇒ a ≡m b (4.1) 17 Recall that Rm (a) denotes the remainder when a is divided by m.
Example 4.12. Is n = 84877 · 79683 − 28674 · 43879 even or odd? The answer Example 4.15. A similar test can be performed for m = 11. R11 (n) can be com-
is trivial and does not require the computation of n. The product of two odd puted by adding the decimal digits of n with alternating sign modulo 11. This
numbers is odd, the product of an even and an odd numbers is even, and the test, unlike that for m = 9, detects the swapping of digits.
difference of an odd and an even number is odd. Thus n is odd.
Example 4.16. The larger m, the more likely it is that a calculation error is de-
The following lemma establishes the simple connection between congruence tected. How could one implement a similar test for m = 99, how for m = 101?
modulo m and remainders modulo m. The proof is easy and left as an exercise.
(i) a ≡m Rm (a). Consider the problem of finding the solutions x for the congruence equation
(ii) a ≡m b ⇐⇒ Rm (a) = Rm (b). ax ≡m b.
The above lemma together with Lemma 4.14 implies that if in a computation Obviously, if x is a solution, then so is x + km for any k ∈ Z. Hence we can
involving addition and multiplication one is interested only in the remainder of restrict the consideration to solutions in Zm . Of special interest is the case where
the result modulo m, then one can compute remainders modulo m at any in- gcd(a, m) = 1 and b = 1.
termediate step (thus keeping the numbers small), without changing the result.
This is referred to as modular arithmetic. Lemma 4.18. The congruence equation
The multiplicative inverse of a modulo m can efficiently be computed using the so-called extended Euclidean algorithm. Note that if gcd(a, m) ≠ 1, then a has no multiplicative inverse modulo m.

4.5.4 The Chinese Remainder Theorem

We now consider a system of congruences for an integer x.

Example 4.18. Find an integer x for which x ≡3 1, x ≡4 2, and x ≡5 4. A solution is x = 34 as one can easily verify. This is the only solution in Z60, but by adding multiples of 60 to x one obtains further solutions.

The following theorem, known as the Chinese Remainder Theorem (CRT), states this for general systems of congruences. The proof of the theorem is constructive: it shows how a solution x can be constructed efficiently.

Theorem 4.19. Let m1, m2, . . . , mr be pairwise relatively prime integers and let M = ∏_{i=1}^r mi. For every list a1, . . . , ar with 0 ≤ ai < mi for 1 ≤ i ≤ r, the system of congruence equations

x ≡m1 a1
x ≡m2 a2
...
x ≡mr ar

for x has a unique solution x satisfying 0 ≤ x < M.

Proof. Let Mi = M/mi. Hence gcd(Mi, mi) = 1 because every factor mk (where k ≠ i) of Mi is relatively prime to mi, and thus so is Mi. Thus there exists an Ni satisfying

Mi Ni ≡mi 1.

Note that for all k ≠ i we have Mi ≡mk 0 and thus

Mi Ni ≡mk 0.

Therefore

∑_{i=1}^r ai Mi Ni ≡mk ak

for all k. Hence the integer x defined by

x = RM(∑_{i=1}^r ai Mi Ni)

satisfies all the congruences. In order to prove uniqueness, observe that for two solutions x′ and x′′, x′ − x′′ ≡mi 0 for all i, i.e., x′ − x′′ is a multiple of all the mi and hence of lcm(m1, . . . , mr) = M. Thus x′ ≡M x′′.

The Chinese Remainder Theorem has several applications. When one is interested in a computation modulo M, then the moduli mi can be viewed as a coordinate system. One can project the numbers of interest modulo the mi, and perform the computation in the r projections (which may be more efficient than computing directly modulo M). If needed at the end, one can reconstruct the result from the projections.

Example 4.19. Compute R35(2^1000). We can do this computation modulo 5 and modulo 7 separately. Since 2^4 ≡5 1 we have 2^1000 ≡5 1. Since 2^3 ≡7 1 we have 2^1000 ≡7 2. This yields 2^1000 ≡35 16 since 16 is the (unique) integer x ∈ [0, 34] with x ≡5 1 and x ≡7 2.

4.6 Application: Diffie-Hellman Key-Agreement

Until the 1970’s, number theory was considered one of the purest of all mathematical disciplines in the sense of being furthest away from any useful applications. However, this has changed dramatically in the 1970’s when crucial applications of number theory in cryptography were discovered.

In a seminal 1976 paper19, Diffie and Hellman proposed the revolutionary concept of public-key cryptography. Most security protocols, and essentially all those used on the Internet, are based on public-key cryptography. Without this amazing and paradoxical invention, security on the Internet would be unthinkable.

Consider the key distribution problem. In order to encrypt the communication between two parties, say Alice and Bob, they need a secret key known only to them. How can they obtain such a key in a setting, like the Internet, where they initially share no secret information and where they are connected only by an insecure communication channel to which a potential adversary has access? We describe the famous Diffie-Hellman protocol which allows one to solve this seemingly paradoxical problem.

The Diffie-Hellman protocol (see Figure 4.2), as originally proposed20, makes use of exponentiation modulo a large prime p, for instance with 2048 bits. While y = Rp(g^x) can be computed efficiently (how?), even if p, g and x are numbers of several hundred or thousands of digits, computing x when given p, g and y is generally (believed to be) computationally infeasible. This problem is known

19 W. Diffie and M.E. Hellman, New directions in cryptography, IEEE Transactions on Information Theory, vol. 22, no. 6, pp. 644–654, 1976.
20 Since then, other versions, for instance based on elliptic curves, have been proposed.
as (a version of) the discrete logarithm problem. The security of the Diffie-Hellman protocol is based on this asymmetry in computational difficulty. Such a function, like x ↦ Rp(g^x), is called a one-way function: it is easy to compute in one direction but computationally very hard to invert.21

The prime p and the basis g (e.g. g = 2) are public parameters, possibly generated once and for all for all users of the system. The protocol is symmetric, i.e., Alice and Bob perform the same operations. The exchange of the so-called public keys yA and yB must be authenticated, but not secret.22 It is easy to see that Alice and Bob end up with the same value kAB = kBA which they can use as a secret key for encrypting subsequent communication.23 In order to compute kAB from yA and yB, an adversary would have to compute either xA or xB, which is believed to be infeasible.

[Figure: the Diffie-Hellman key-agreement protocol.]
Alice: select xA at random from {0, . . . , p−2}; compute yA := Rp(g^xA).
Bob: select xB at random from {0, . . . , p−2}; compute yB := Rp(g^xB).
Alice sends yA to Bob, and Bob sends yB to Alice, over the insecure channel.
Alice computes kAB := Rp(yB^xA); Bob computes kBA := Rp(yA^xB). Indeed,

kAB ≡p yB^xA ≡p (g^xB)^xA ≡p g^(xA·xB) ≡p kBA.

… to KAB.

Figure 4.2: Mechanical analog of the Diffie-Hellman protocol.

interlocked. For the adversary, this is impossible without breaking open one of the locks.

Another famous (and more widely used) public-key cryptosystem, the so-called RSA-system invented in 1977 and named after Rivest, Shamir and Adleman25, will be discussed later. Its security is based on the (conjectured) computational difficulty of factoring large integers.

24 A padlock with a key corresponds to a so-called trapdoor one-way function which is not consid-
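A runnable sketch of the exchange shown above, using Python’s built-in modular exponentiation. The parameters below are toy values chosen for illustration (the text recommends a prime of about 2048 bits), and g = 3 is an arbitrary basis, not necessarily a generator.

```python
import secrets

# Toy parameters only: 2^127 - 1 is a (Mersenne) prime; g = 3 is an arbitrary choice here.
p, g = 2**127 - 1, 3

x_A = secrets.randbelow(p - 1)   # Alice's secret exponent, from {0, ..., p-2}
x_B = secrets.randbelow(p - 1)   # Bob's secret exponent

y_A = pow(g, x_A, p)             # y_A = R_p(g^x_A), sent to Bob
y_B = pow(g, x_B, p)             # y_B = R_p(g^x_B), sent to Alice

k_AB = pow(y_B, x_A, p)          # Alice computes R_p(y_B^x_A)
k_BA = pow(y_A, x_B, p)          # Bob computes R_p(y_A^x_B)

assert k_AB == k_BA              # both equal R_p(g^(x_A * x_B))
```

The constructive proof of the Chinese Remainder Theorem (Theorem 4.19) on the preceding pages can be sketched in the same style; the function name is ours, and the modular inverse Ni of Mi is obtained with Python’s pow(·, −1, mi) (Python 3.8+).

```python
def crt(residues, moduli):
    """Theorem 4.19: x = R_M(sum_i a_i * M_i * N_i) with M_i = M/m_i and M_i*N_i = 1 (mod m_i).
    The moduli must be pairwise relatively prime."""
    M = 1
    for m in moduli:
        M *= m
    x = 0
    for a_i, m_i in zip(residues, moduli):
        M_i = M // m_i
        N_i = pow(M_i, -1, m_i)
        x += a_i * M_i * N_i
    return x % M

print(crt([1, 2, 4], [3, 4, 5]))   # 34, as in Example 4.18
print(crt([1, 2], [5, 7]))         # 16, as in Example 4.19
print(pow(2, 1000, 35))            # 16, direct check of R_35(2^1000)
```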
Operations with arity 1 and 2 are called unary and binary operations, respec-
tively. An operation with arity 0 is called a constant; it is a fixed element from
the set S, for instance the special element 1 in Z. In many cases, only binary
operations are actually listed explicitly,
5.2.1 Neutral Elements

Definition 5.3. A left [right] neutral element (or identity element) of an algebra
hS; ∗i is an element e ∈ S such that e ∗ a = a [a ∗ e = a] for all a ∈ S. If
e ∗ a = a ∗ e = a for all a ∈ S, then e is simply called neutral element.

If the operation is called addition, then e is usually denoted as 0, and if it is
called multiplication, then e is usually denoted as 1.

Lemma 5.1. If hS; ∗i has both a left and a right neutral element, then they are equal.
In particular hS; ∗i can have at most one neutral element.

The operation in the previous example, sequence concatenation, has a very use-
ful property: When all sequences are written one after the other, with spaces
between sequences, it does not matter in which order one performs the concate-
nation operations. In short, sequence concatenation is associative.

We summarize a few facts we encountered already earlier for the special case
of the integers Z. The group is the right level of abstraction for describing these
facts. The proofs are left as exercises.

Lemma 5.3. For a group hG; ∗, ˆ, ei, we have for all a, b, c ∈ G:

(i) (â)ˆ = a.
(ii) (a ∗ b)ˆ = b̂ ∗ â.
(iii) Left cancellation law: a ∗ b = a ∗ c =⇒ b = c.
(iv) Right cancellation law: b ∗ a = c ∗ a =⇒ b = c.
(v) The equation a ∗ x = b has a unique solution x for any a and b.
    So does the equation x ∗ a = b.

5.2.4 (Non-)minimality of the Group Axioms

In mathematics, one generally wants the axioms of a theory to be minimal. One
can show that the group axioms as stated are not minimal. One can simplify
axiom G2 by only requesting that a ∗ e = a; call this new axiom G2’. The equa-
tion e ∗ a = a is then implied (by all axioms). The proof of this fact is left as an
exercise. Similarly, one can simplify G3 by only requesting that a ∗ â = e; call
this new axiom G3’. The equation â ∗ a = e is then implied. The proof for this is
as follows:

    â ∗ a = (â ∗ a) ∗ e                  (G2’)
          = (â ∗ a) ∗ (â ∗ (â)ˆ)         (G3’, i.e., def. of right inverse of â)

6 Named after the Norwegian mathematician Niels Henrik Abel.

Note that up to now, and also in the next section, we do not yet pay attention
to the fact that some operations are commutative. In a sense, commutativity is
less important than associativity.

Some standard examples of associative operations are addition and multipli-
cation in various structures: Z, N, Q, R, and Zm.

Definition 5.5. A monoid is an algebra hM ; ∗, ei where ∗ is associative and e is
the neutral element.

Some standard examples of monoids are hZ; +, 0i, hZ; ·, 1i, hQ; +, 0i, hQ; ·, 1i,
hR; +, 0i, hR; ·, 1i, hZm ; ⊕, 0i, and hZm ; ⊙, 1i.

Definition 5.6. A left [right] inverse element5 of an element a in an algebra hS; ∗, ei
with neutral element e is an element b ∈ S such that b ∗ a = e [a ∗ b = e]. If
b ∗ a = a ∗ b = e, then b is simply called an inverse of a.

Example 5.10. Recall that the sequences with concatenation and the empty se-
quence as the neutral element form a non-commutative monoid. This is not a
group because inverses cannot be defined (except for the empty sequence).

Example 5.11. For a given structure R supporting addition and multiplication
(to be called a ring later), let R[x] denote the set of polynomials with coefficients
in R. hZ[x]; +i, hQ[x]; +i, and hR[x]; +i are abelian groups, where + denotes
polynomial addition. hZ[x]; ·i, hQ[x]; ·i, and hR[x]; ·i are commutative monoids,
where · denotes polynomial multiplication. The neutral element is the polyno-
mial 1. Like hZ; ·i, hR[x]; ·i is not a group, for any R.

Example 5.12. Let Sn be the set of n! permutations of n elements, i.e., the set
of bijections {1, . . . , n} → {1, . . . , n}. A bijection f has an inverse f −1 . Sn is a
subset of the set of functions {1, . . . , n} → {1, . . . , n}. It follows from the asso-
ciativity of function composition that the composition of permutations is also
associative. The group hSn ; ◦, −1 , idi is called the symmetric group on n elements.
Sn is non-abelian for n ≥ 3.

Another important source of (usually non-abelian) groups are symmetries
and rotations of a geometric figure, mapping the figure to itself (but permut-
ing the vertices). The neutral element is the identity mapping and the inverse
element is, for an axial or point symmetry, the element itself. For a rotation,
the inverse element is the inverse rotation. To form a group, the closure under
composition must be considered. For example, the composition of two axial
symmetries corresponds to a rotation by twice the angle between the axes. Such
a group is a subset (a subgroup) of the set of permutations of the vertices.

Example 5.13. Consider a square in the plane, with nodes labeled A, B, C, D.
Now consider operations which map the square to itself, but with the vertices
permuted. Consider the four reflections with respect to one of the two mid- 5.3.2 Group Homomorphisms
dle parallels or one of the two diagonals. The closure under composition of
these four elements also includes the four rotations by 0◦ (the neutral element), Homomorphisms are a central concept in mathematics and also in Computer
by 90◦ , 180◦, and by 270◦ . These 8 elements (reflections and rotations) form a Science. A homomorphism is a structure-preserving function from an algebraic
group, which we denote by S✷ . It is called the symmetry group of the square. If structure into another algebraic structure. Here we only introduce homomor-
the vertices of the square are labeled A, B, C, D, then these eight geometric op- phisms of groups.
erations each corresponds to a permutation of the set {A, B, C, D}. For example,
the rotation by 90◦ corresponds to the permutation (A, B, C, D) → (B, C, D, A). Definition 5.10. For two groups hG; ∗,b, ei and hH; ⋆,e, e′ i, a function ψ : G → H
Note that the set of four rotations also forms a group, actually a subgroup of the
above described group and also a subgroup of the group of permutations on
{A, B, C, D}.7 ψ(a ∗ b) = ψ(a) ⋆ ψ(b).
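The closure computation in Example 5.13 can be replayed mechanically. The
following Python sketch (not from the script) represents each symmetry as a
permutation of the vertex set {A, B, C, D}; the chosen vertex labeling and the
particular reflection are illustrative assumptions.

    def compose(f, g):                        # (f o g)(v) = f(g(v))
        return {v: f[g[v]] for v in g}

    identity = {'A': 'A', 'B': 'B', 'C': 'C', 'D': 'D'}
    rot90    = {'A': 'B', 'B': 'C', 'C': 'D', 'D': 'A'}   # rotation by 90 degrees
    mirror   = {'A': 'B', 'B': 'A', 'C': 'D', 'D': 'C'}   # reflection at a middle parallel

    # close {rot90, mirror} under composition
    elems, frontier = [identity], [rot90, mirror]
    while frontier:
        h = frontier.pop()
        if h not in elems:
            elems.append(h)
            frontier += [compose(h, e) for e in elems] + [compose(e, h) for e in elems]

    print(len(elems))   # 8: the four rotations and the four reflections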
Example 5.14. It is left as an exercise to figure out the symmetry group of the If ψ is a bijection from G to H, then it is called an isomorphism, and we say that
three-dimensional cube. G and H are isomorphic and write G ≃ H.
We use the symbol e for the inverse operation in the group H. The proof of
the following lemma is left as an exercise:
5.3 The Structure of Groups Lemma 5.5. A group homomorphism ψ from hG; ∗,b, ei to hH; ⋆,e, e′ i satisfies
Definition 5.9. The direct product of n groups hG1 ; ∗1 i , . . . , hGn ; ∗n i is the alge- The concept of an isomorphism is more general than for algebraic systems in
bra the strict sense, and it applies to more general algebraic structures, for instance
hG1 × · · · × Gn ; ⋆i, also to relations and hence to graphs.
where the operation ⋆ is component-wise:
Example 5.16. The group hZ6 ; ⊕i × hZ10 ; ⊕i is isomorphic to hZ2 ; ⊕i × hZ30 ; ⊕i.
(a1 , . . . , an ) ⋆ (b1 , . . . , bn ) = (a1 ∗1 b1 , . . . , an ∗n bn ). The isomorphism ψ : Z6 × Z10 → Z2 × Z30 is easily checked to be given by
ψ((a, b)) = (a′ , b′ ) where a′ ≡2 a (i.e., a′ = R2 (a)) and b′ is given by b′ ≡3 a and
b′ ≡10 b (where the Chinese remainder theorem can be applied).
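The claim of Example 5.16 can be verified by brute force. The following Python
sketch (not part of the script) checks that the stated map ψ is a bijective homo-
morphism; the helper name psi is ad hoc.

    def psi(a, b):                    # a in Z_6, b in Z_10
        b_prime = next(t for t in range(30) if t % 3 == a % 3 and t % 10 == b % 10)
        return (a % 2, b_prime)       # the CRT determines b' uniquely in Z_30

    pairs = [(a, b) for a in range(6) for b in range(10)]
    assert len({psi(a, b) for (a, b) in pairs}) == 60        # bijective
    for (a1, b1) in pairs:
        for (a2, b2) in pairs:                               # homomorphism property
            u, v = psi(a1, b1), psi(a2, b2)
            assert psi((a1 + a2) % 6, (b1 + b2) % 10) == ((u[0] + v[0]) % 2, (u[1] + v[1]) % 30)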
Lemma 5.4. hG1 × · · ·× Gn ; ⋆i is a group, where the neutral element and the inversion
operation are component-wise in the respective groups. Example 5.17. The logarithm function is a group homomorphism from hR>0 , ·i
to hR, +i since log(a · b) = log a + log b.
Proof. Left as an exercise. We give two familiar examples of relevant homomorphisms that are not iso-
morphisms.
Example 5.15. Consider the group hZ5 ; ⊕i × hZ7 ; ⊕i. The carrier of the group is Example 5.18. If one considers the three-dimensional space R3 with vector ad-
Z5 × Z7 . The neutral element is (0, 0). If we denote the group operation by ⋆, dition, then any projection on a plane through the origin or a line through the
then we have (2, 6) ⋆ (4, 3) = (1, 2). Also, the inverse of (2, 6) is (3, 1). It follows from the origin are homomorphic images of R3 . A special case is the projection onto an
Chinese remainder theorem that hZ5 ; ⊕i × hZ7 ; ⊕i is isomorphic to hZ35 ; ⊕i, a axis of the coordinate system, which abstracts away all but one coordinate.
concept introduced in the following subsection.
Example 5.19. Consider the set of real-valued n × n matrices. The determinant
7 We is a homomorphism (with respect to multiplication) from the set of matrices to
point out that one can consider the described operations also as bijections of the real plane,
i.e., as functions R2 → R2 . R. We have det(AB) = det(A) det(B).
5.3.3 Subgroups Definition 5.12. Let G be a group and let a be an element of G. The order8 of
a, denoted ord(a), is the least m ≥ 1 such that a^m = e, if such an m exists, and
For a given algebra, for example a group or a ring (see Section 5.5), a subalgebra ord(a) is said to be infinite otherwise, written ord(a) = ∞.
is a subset that is by itself an algebra of the same type, i.e., a subalgebra is a sub-
set of an algebra closed under all operations. For groups we have specifically: By definition, ord(e) = 1. If ord(a) = 2 for some a, then a−1 = a; such an a is
called self-inverse.
Definition 5.11. A subset H ⊆ G of a group hG; ∗,b, ei is called a subgroup of G
if hH; ∗,b, ei is a group, i.e., if H is closed with respect to all operations: Example 5.23. The order of 6 in hZ20 ; ⊕, ⊖, 0i is 10. This can be seen easily since
60 = 10 · 6 is the least common multiple of 6 and 20. The order of 10 is 2, and we
(1) a ∗ b ∈ H for all a, b ∈ H,
note that 10 is self-inverse.
(2) e ∈ H, and
Example 5.24. The order of any axial symmetry in the group S✷ (see exam-
(3) â ∈ H for all a ∈ H.
ple 5.13) is 2, while the order of the 90◦ -rotation (and also of the 270◦-rotation)
is 4.
Example 5.20. For any group hG; ∗,b, ei, there exist two trivial subgroups: the
subset {e} and G itself. Example 5.25. The order of any integer a 6= 0 in hZ; +i is ∞.
Example 5.21. Consider the group Z12 (more precisely hZ12 ; ⊕, ⊖, 0i). The Example 5.26. Consider the group S5 of permutations on the set {1, 2, 3, 4, 5}.
following subsets are all the subgroups: {0}, {0, 6}, {0, 4, 8}, {0, 3, 6, 9}, What is the order of the permutations described by (1, 2, 3, 4, 5) → (3, 1, 2, 4, 5),
{0, 2, 4, 6, 8, 10}, and Z12 . by (1, 2, 3, 4, 5) → (1, 2, 3, 5, 4), and by (1, 2, 3, 4, 5) → (2, 3, 1, 5, 4)?
Example 5.22. The set of symmetries and rotations discussed in example 5.13, The following lemma implies that the sequence of powers of an element of a
denoted S✷ , constitutes a subgroup (with 8 elements) of the set of 24 permuta- finite group is periodic. It does not hold in every monoid (why?).
tions on 4 elements.
Lemma 5.6. In a finite group G, every element has a finite order.
Proof. Since G is finite, we must have a^r = a^s = b for some r and s with r < s
(and some b). Then a^(s−r) = a^s · a^(−r) = b · b^(−1) = e.

5.3.4 The Order of Group Elements and of a Group
In the remainder of Section 5.3 we will use a multiplicative notation for groups,
i.e., we denote the group operation as “·” (which can also be omitted) and use Definition 5.13. For a finite group G, |G| is called the order of G.9
the corresponding multiplicative notation. But it is important to point out that
this is only a notational convention that entails no loss of generality of the kind
of group operation. In many cases (but not always) we denote the neutral ele-
ment of a multiplicatively written group as 1. The inverse of a is denoted a−1 5.3.5 Cyclic Groups
or 1/a, and a/b stands for ab−1 . Furthermore, we use the common notation for
powers of elements: For n ∈ Z, a^n is defined recursively:

• a^0 = e,
• a^n = a · a^(n−1) for n ≥ 1, and
• a^n = (a^(−1))^|n| for n ≤ −1.

It is easy to see that for all m, n ∈ Z

    a^m · a^n = a^(m+n) and (a^m)^n = a^(mn).

8 German: Ordnung

If G is a group and a ∈ G has finite order, then for any m ∈ Z we have
a^m = a^(R_ord(a)(m)).

Definition 5.14. For a group G and a ∈ G, the group generated by a, denoted hai,
is defined as

    hai def= {a^n | n ∈ Z}.

9 Note that the term “order” has two different (but related) meanings.
It is easy to see that hai is a group, actually the smallest subgroup of a group Theorem 5.8 (Lagrange). Let G be a finite group and let H be a subgroup of G. Then
G containing the element a ∈ G. For finite groups we have the order of H divides the order of G, i.e., |H| divides |G|.
    hai def= {e, a, a^2, . . . , a^(ord(a)−1)}.

The following corollaries are direct applications of Lagrange’s theorem.
Corollary 5.9. For a finite group G, the order of every element divides the group order,
Definition 5.15. A group G = hgi generated by an element g ∈ G is called cyclic, i.e., ord(a) divides |G| for every a ∈ G.
and g is called a generator of G.
Proof. hai is a subgroup of G of order ord(a), which according to Theorem 5.8
Being cyclic is a special property of a group. Not all groups are cyclic! A must divide |G|.
cyclic group can have many generators. In particular, if g is a generator, then so
is g −1 .
Example 5.27. The group hZn ; ⊕i is cyclic for every n, where 1 is a generator. Corollary 5.10. Let G be a finite group. Then a^|G| = e for every a ∈ G.
The generators of hZn ; ⊕i are all g ∈ Zn for which gcd(g, n) = 1, as the reader
can prove as an exercise.
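The characterization of the generators of hZn ; ⊕i stated in Example 5.27 can at
least be checked numerically. A small Python sketch (not a proof, and not part
of the script):

    from math import gcd

    def generators_of_Zn(n):
        return [g for g in range(n) if len({(k * g) % n for k in range(n)}) == n]

    for n in (5, 6, 12, 19):
        assert generators_of_Zn(n) == [g for g in range(n) if gcd(g, n) == 1]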
Example 5.28. The additive group of the integers, hZ; +, −, 0i, is an infinite
cyclic group generated by 1. The only other generator is −1.

Proof. We have |G| = k · ord(a) for some k. Hence

    a^|G| = a^(k·ord(a)) = (a^(ord(a)))^k = e^k = e.
Theorem 5.7. A cyclic group of order n is isomorphic to hZn ; ⊕i (and hence abelian).
Corollary 5.11. Every group of prime order10 is cyclic, and in such a group every
element except the neutral element is a generator.
In fact, we use hZn ; ⊕i as our standard notation of a cyclic group of order n.
Proof. Let G = hgi be a cyclic group of order n (with neutral element e). The Proof. Let |G| = p with p prime. For any a, the order of the subgroup hai divides
bijection Zn → G : i 7→ g i is a group isomorphism since i ⊕ j 7→ g i+j = p. Thus either ord(a) = 1 or ord(a) = p. In the first case, a = e and in the latter
gi ∗ gj . case G = hai.
Definition 5.17. The Euler function ϕ : Z+ → Z+ is defined as the cardinality of
Z∗m :

    ϕ(m) = |Z∗m |.

Example 5.29. Z∗18 = {1, 5, 7, 11, 13, 17}. Hence ϕ(18) = 6.

If p is a prime, then Z∗p = {1, . . . , p − 1} = Zp \ {0}, and ϕ(p) = p − 1.

Lemma 5.12. If the prime factorization of m is m = p1^e1 · · · pr^er , then11

    ϕ(m) = ∏_{i=1}^{r} (pi − 1) pi^(ei−1).

Proof. For a prime p and e ≥ 1 we have

    ϕ(p^e) = p^(e−1) (p − 1)

since exactly every pth integer in Z_{p^e} contains a factor p and hence ϕ(p^e) =
p^(e−1)(p − 1) elements contain no factor p. For a ∈ Zm we have gcd(a, m) = 1 if
and only if gcd(a, pi^ei) = 1 for i = 1, . . . , r. Since the numbers pi^ei are pairwise
relatively prime, the Chinese remainder theorem implies that there is a one-
to-one correspondence between elements of Zm and lists (a1 , . . . , ar ) with ai ∈
Z_{pi^ei}. Hence, using the above, there is also a one-to-one correspondence between
elements of Z∗m and lists (a1 , . . . , ar ) with ai ∈ Z∗_{pi^ei}. There are ∏_{i=1}^{r} (pi − 1)pi^(ei−1)
such lists.

Example 5.31. In Z∗11 = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10} we have 4 ⊙ 6 = 2 and 7^(−1) = 8
since 7 ⊙ 8 = 1.

Now we obtain the following simple but powerful corollary to Theorem 5.8.

Corollary 5.14 (Fermat, Euler). For all m ≥ 2 and all a with gcd(a, m) = 1,

    a^ϕ(m) ≡m 1.

In particular, for every prime p and every a not divisible by p,

    a^(p−1) ≡p 1.

Proof. This follows from Corollary 5.10 for the group Z∗m of order ϕ(m).

The special case for primes was known already to Fermat.12 The general
case was proved by Euler, actually before the concept of a group was explicitly
introduced in mathematics.

We state the following theorem about the structure of Z∗m without proof. Of
particular importance and interest is the fact that Z∗p is cyclic for every prime p.

Theorem 5.15. The group Z∗m is cyclic if and only if m = 2, m = 4, m = p^e , or
m = 2p^e , where p is an odd prime and e ≥ 1.

Example 5.32. The group Z∗19 is cyclic, and 2 is a generator. The powers of 2
are 2, 4, 8, 16, 13, 7, 14, 9, 18, 17, 15, 11, 3, 6, 12, 5, 10, 1. The other generators are
3, 10, 13, 14, and 15.
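The statements above are easy to check numerically for small moduli. The fol-
lowing Python sketch (not part of the script) counts Z∗m directly to obtain ϕ(m),
verifies the Fermat-Euler theorem for a range of small cases, and confirms that 2
generates Z∗19 as claimed in Example 5.32.

    from math import gcd

    def phi(m):                                   # naive: |Z*_m| by counting
        return sum(1 for a in range(1, m + 1) if gcd(a, m) == 1)

    assert phi(18) == 6 and phi(19) == 18         # Example 5.29, and phi(p) = p - 1

    for m in range(2, 50):
        for a in range(1, 100):
            if gcd(a, m) == 1:
                assert pow(a, phi(m), m) == 1     # a^phi(m) = 1 (mod m)

    powers_of_2 = [pow(2, k, 19) for k in range(1, 19)]
    assert sorted(powers_of_2) == list(range(1, 19))   # 2 is a generator of Z*_19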
Z. Moreover, 1 is a neutral element and inverses exist (see Section 4.5.3). Thus
hZ∗m ; ⊙, −1 , 1i is a group.

Example 5.30. In Z∗18 = {1, 5, 7, 11, 13, 17} we have 5 ⊙ 13 = 11 and 11^(−1) = 5
since 11 ⊙ 5 = 1 (i.e., R18 (11 · 5) = 1).

11 Alternatively, ϕ(m) could be defined as ϕ(m) = m · ∏_{p|m, p prime} (1 − 1/p).
propriate. To test whether a number n is prime one chooses a base a and checks whether a^(n−1) ≡n 1.
If the condition is violated, then n is clearly composite, otherwise n could be a prime. Unfortunately,
it is not guaranteed to be a prime. In fact, there are composite integers n for which a^(n−1) ≡n 1 for
all a with gcd(a, n) = 1. (Can you find such an n?) For more sophisticated versions of such a
probabilistic test one can prove that for every composite n, the fraction of test values for which the
corresponding condition is satisfied is at most 1/4. Thus, by repeating the test sufficiently many
times, the confidence that n is prime can be increased beyond any doubt. This is useful in practice,
but it does not provide a proof that n is prime.
13 R. L. Rivest, A. Shamir, and L. Adleman, A method for obtaining digital signatures and public-
key cryptosystems, Communications of the ACM, Vol. 21, No. 2, pp. 120–126, 1978.
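The probabilistic primality test sketched in the footnote can be implemented in
a few lines. The following Python sketch is illustrative only; it reports a compos-
ite n as such with high probability, but a positive answer is never a proof of
primality.

    import random

    def fermat_test(n, repetitions=20):           # assumes n >= 4
        for _ in range(repetitions):
            a = random.randrange(2, n - 1)
            if pow(a, n - 1, n) != 1:
                return 'composite'                # the condition a^(n-1) = 1 (mod n) is violated
        return 'probably prime'                   # not a proof!

    print(fermat_test(101))    # probably prime
    print(fermat_test(221))    # composite (221 = 13 * 17), detected with high probability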
they can authenticate each other’s public keys (respectively the Diffie-Hellman
values). Moreover, the RSA system can be used as a digital signature scheme
(see below). RSA was the first cryptographic system offering this important
functionality.

5.4.1 e-th Roots in a Group

To understand the RSA system, all we need is the following simple theorem
which is a consequence of Lagrange’s theorem (Theorem 5.8).

Theorem 5.16. Let G be some finite group (multiplicatively written), and let e ∈ Z be
relatively prime to |G| (i.e. gcd(e, |G|) = 1). The function x 7→ x^e is a bijection and
the (unique) e-th root of y ∈ G, namely x ∈ G satisfying x^e = y, is

    x = y^d, where ed ≡|G| 1.

When |G| is known, then d can be computed from ed ≡|G| 1 by using the
extended Euclidean algorithm. No general method is known for computing e-th
roots in a group G without knowing its order. This can be exploited to define a
public-key cryptosystem.

    Alice                  insecure channel                  Bob

    Generate
    primes p and q
    n = p · q
    f = (p−1)(q−1)
    select e
    d ≡f e^(−1)
                        −−−−− n, e −−−−→
                                                 plaintext m ∈ {1, . . . , n − 1}
                                                 ciphertext y = Rn (m^e)
                        ←−−−−−− y −−−−−
    m = Rn (y^d)

y 7→ m = Rn (y^d), where d can be computed according to

    ed ≡(p−1)(q−1) 1.
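A toy numerical instance of the RSA system in the diagram above can be written
directly in Python. The primes below are far too small for any real use and serve
only to illustrate the algebra; real moduli have thousands of bits, and real en-
cryption additionally randomizes the message (see Section 5.4.3).

    p, q = 61, 53
    n = p * q                      # public modulus
    f = (p - 1) * (q - 1)
    e = 17                         # public exponent with gcd(e, f) = 1
    d = pow(e, -1, f)              # d = e^(-1) (mod f), via the extended Euclidean algorithm

    m = 1234                       # plaintext in {1, ..., n-1}
    y = pow(m, e, n)               # ciphertext y = R_n(m^e)
    assert pow(y, d, n) == m       # decryption recovers m = R_n(y^d)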
m is an encryption key (typically a short-term session key) for a conventional 5.5 Rings and Fields
cryptosystem which is used to encrypt the actual application data (e.g. of a TLS
session). We now consider algebraic systems with two binary operations, usually called
addition and multiplication.
5.4.3 On the Security of RSA *
Let us have a brief look at the security of the RSA public-key system.17 It is not known 5.5.1 Definition of a Ring
whether computing e-th roots modulo n (when gcd(e, ϕ(n)) = 1) is easier than factoring
n, but it is widely believed that the two problems are computationally equivalent.18 Fac-
toring large integers is believed to be computationally infeasible. If no significant break- Definition 5.18. A ring hR; +, −, 0, ·, 1i is an algebra for which
through in factoring is achieved and if processor speeds keep improving at the same rate (i) hR; +, −, 0i is a commutative group.
as they are now (using the so-called Moore’s law), a modulus with 2048 bits appears to (ii) hR; ·, 1i is a monoid.
be secure for the next 15 years, and larger moduli (e.g. 8192 bits) are secure for a very (iii) a(b + c) = (ab) + (ac) and (b + c)a = (ba) + (ca) for all a, b, c ∈ R
long time.
(left and right distributive laws).
Obviously, the system is insecure unless Bob can make sure he obtains the correct
A ring is called commutative if multiplication is commutative (ab = ba).20
public key from Alice rather than a public key generated by an adversary and posted in
the name of Alice. In other words, the public key must be sent from Alice to Bob via an
authenticated channel. This is usually achieved (indirectly) using a so-called public-key Example 5.33. Z, Q, R, and C are (commutative) rings.
certificate signed by a trusted certification authority. One also uses the term public-key
infrastructure (PKI). Explaining these concepts is beyond the scope of this course. Example 5.34. hZm ; ⊕, ⊖, 0, ⊙, 1i is a commutative ring. Since hZm ; ⊕, ⊖, 0i is
It is important to point out that for a public-key system to be secure, the message an abelian group and hZm ; ⊙, 1i is a monoid, it only remains to check the dis-
must be randomized in an appropriate manner. Otherwise, when given an encrypted tributive law, which is inherited from the distributive law for the integers.
message, an adversary can check plaintext messages by encrypting them and comparing
them with the given encrypted message. If the message space is small (e.g. a bit), then We list some simple facts about rings.
this would allow to efficiently break the system.
Lemma 5.17. For any ring hR; +, −, 0, ·, 1i, and for all a, b ∈ R,
where || denotes concatenation and h is a suitable function introducing redun- 0 = −(a0) + a0 (0 = −b + b for all b ∈ R, e.g. for b = a0)
dancy into the message and the string z is naturally understood as an element = −(a0) + a(0 + 0) (0 + 0 = 0)
of Zn .19 A signature can be verified by raising it to the e-th power modulo n and
= −(a0) + (a0 + a0) (distributive law)
checking that it is of the correct form m||h(m). The message is recovered from
the signature. = (−(a0) + a0) + a0 (associativity of +)
17 But = 0 + a0 (−b + b = 0 for all b ∈ R )
note that a cryptographic security analysis is much more involved.
18 In fact, for a generic model of computation, this equivalence was proved in: D. Aggarwal and U. = a0 (0 + b = b for all b ∈ R)
Maurer, Breaking RSA generically is equivalent to factoring, IEEE Transactions on Information Theory,
vol. 62, pp. 6251–6259, 2016. 20 One can show (as an exercise) that ring addition must be commutative, i.e., commutativity of
19 This can be a so-called cryptographic hash function. Without such additional redundancy, every addition follows from the remaining ring axioms. The stated ring axioms are hence not minimal.
s ∈ Z∗n would be a legitimate signature, and hence forging a signature would be trivial. The word “commutative” in (i) could be dropped.
Example 5.36. The units of Z are −1 and 1: Z∗ = {−1, 1}. As mentioned in Section 4.2.3, the concept of a greatest common divisor not
only applies to integers, but to any commutative ring:
Example 5.37. The units of R are all non-zero elements of R: R∗ = R \ {0}.
Definition 5.22. For ring elements a and b (not both 0), a ring element d is called
Example 5.38. The ring of Gaussian integers (see Example 4.5) contains four a greatest common divisor of a and b if d divides both a and b and if every common
units: 1, i, −1, and −i. For example, the inverse of i is −i. divisor of a and b divides d, i.e., if
Example 5.39. The set of units of Zm is Z∗m (Definition 5.16).24
d | a ∧ d | b ∧ ∀c (c | a ∧ c | b) → c | d .
Lemma 5.18. For a ring R, R∗ is a multiplicative group (the group of units of R).
5.5.4 Zerodivisors and Integral Domains
Proof. We need to show that R∗ is closed under multiplication, i.e., that for Definition 5.23. An element a 6= 0 of a commutative ring R is called a zerodivi-
u ∈ R∗ and v ∈ R∗ , we also have uv ∈ R∗ , which means that uv has an in- sor28 if ab = 0 for some b 6= 0 in R.
verse. The inverse of uv is v −1 u−1 since (uv)(v −1 u−1 ) = uvv −1 u−1 = uu−1 = 1.
R∗ also contains the neutral element 1 (since 1 has an inverse). Moreover, the
associativity of multiplication in R∗ is inherited from the associativity of multi- Definition 5.24. An integral domain29 is a (nontrivial30
) commutative ring with-
plication in R (since elements of R∗ are also elements of R and the multiplication out zerodivisors: ∀a ∀b ab = 0 → a = 0 ∨ b = 0 .
operation is the same).
25 Recall that this definition was already stated as Definition 4.1 for the special case of integers.
21 This 26 German: Teiler
also follows from commutativity of multiplication, but the proof shows that the statement
holds even for non-commutative rings. 27 German: Vielfaches
22 German: Einheit 28 German: Nullteiler
23 The inverse, if it exists, is unique. 29 German: Integritätsbereich
24 In fact, we now see the justification for the notation Z∗m already introduced in Definition 5.16. 30 i.e., 1 ≠ 0
Example 5.40. Z, Q, R, and C are integral domains.

Example 5.41. Zm is not an integral domain if m is not a prime. Any element of
Zm not relatively prime to m is a zerodivisor.

Lemma 5.20. In an integral domain, if a | b, then c with b = ac is unique (and is
denoted by c = b/a and called quotient).31

Proof. Suppose that b = ac = ac′ for some c and c′ . Then

    0 = ac + (−(ac′ )) = a(c + (−c′ ))

and thus, because a ≠ 0 and there are no zero-divisors, we must have c + (−c′ ) =
0 and hence c = c′ .

where here and in the following coefficients with index greater than the degree
are understood to be 0. The product of a(x) and b(x) is defined as33

    a(x)b(x) = ∑_{i=0}^{d+d′} ( ∑_{k=0}^{i} a_k b_{i−k} ) x^i = ∑_{i=0}^{d+d′} ( ∑_{u+v=i} a_u b_v ) x^i
             = a_d b_{d′} x^(d+d′) + · · · + (a_0 b_2 + a_1 b_1 + a_2 b_0 ) x^2 + (a_0 b_1 + a_1 b_0 ) x + a_0 b_0 .

The i-th coefficient of a(x)b(x) is ∑_{k=0}^{i} a_k b_{i−k} = ∑_{u+v=i} a_u b_v , where the sum
is over all pairs (u, v) for which u + v = i as well as u ≥ 0 and v ≥ 0.

The degree of the product of polynomials over a ring R is, by definition, at
most the sum of the degrees. It is equal to the sum if R is an integral domain,
which implies that the highest coefficient is non-zero: a_d b_{d′} ≠ 0 if a_d ≠ 0 and
b_{d′} ≠ 0.
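The convolution formula above translates directly into code. The following
Python sketch (not from the script) represents a polynomial over Zm as the list
[a0, a1, . . . , ad] of its coefficients and checks the computation of the subsequent
Example 5.42.

    def poly_add(a, b, m):
        n = max(len(a), len(b))
        a, b = a + [0] * (n - len(a)), b + [0] * (n - len(b))
        return [(x + y) % m for x, y in zip(a, b)]

    def poly_mul(a, b, m):
        c = [0] * (len(a) + len(b) - 1)
        for i, ai in enumerate(a):
            for j, bj in enumerate(b):
                c[i + j] = (c[i + j] + ai * bj) % m    # contributes to the coefficient of x^(i+j)
        return c

    a, b = [1, 3, 2], [6, 5]                  # 2x^2 + 3x + 1 and 5x + 6 over Z_7
    assert poly_add(a, b, 7) == [0, 1, 2]     # 2x^2 + x
    assert poly_mul(a, b, 7) == [6, 2, 6, 3]  # 3x^3 + 6x^2 + 2x + 6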
Example 5.42. Consider the ring Z7 and let a(x) = 2x2 +3x+1 and b(x) = 5x+6.
5.5.5 Polynomial Rings Then
a(x) + b(x) = 2x2 + (3 + 5)x + (1 + 6) = 2x2 + x
Definition 5.25. A polynomial a(x) over a commutative ring R in the indetermi- and
nate x is a formal expression of the form a(x)b(x) = (2 · 5)x3 + (3 · 5 + 2 · 6)x2 + (1 · 5 + 3 · 6)x + 1 · 6 = 3x3 + 6x2 + 2x + 6.
d
X
a(x) = ad xd + ad−1 xd−1 + · · · + a1 x + a0 = ai xi . Theorem 5.21. For any commutative ring R, R[x] is a commutative ring.
i=0
for some non-negative integer d, with ai ∈ R. The degree of a(x), denoted Proof. We need to prove that the conditions of Definition 5.18 (i.e., the ring ax-
deg(a(x)), is the greatest i for which ai 6= 0. The special polynomial 0 (i.e., ioms) are satisfied for R[x], assuming they are satisfied for R. We first observe
all the ai are 0) is defined to have degree “minus infinity”.32 Let R[x] denote the that since multiplication in R is commutative, so is multiplication in R[x].
set of polynomials (in x) over R.
Condition (i) requires that R[x] is an abelian group with respect to (poly-
Actually, it is mathematically better (but less common) to think of a poly- nomial) addition. This is obvious from the definition of addition. Associativ-
nomial simply as a finite list (a0 , a1 , . . . , ad−1 , ad ) of elements of R. There is no ity and commutativity of (polynomial) addition are inherited from associativ-
need to think of a variable x which suggests that it can be understood as a func- ity and commutativity of addition in R (because R is a ring). The neutral el-
tion R → R. A polynomial can, but need not, be considered as such a function ement is the polynomial 0, and the inverse in the group (i.e., the negative) of
P P
(see Section 5.7). a(x) = di=0 ai xi is −a(x) = di=0 (−ai )xi .
Addition and multiplication in R[x] are defined as usual. Consider polyno- Condition (ii) requires that R[x] is a monoid with respect to (polynomial)
Pd Pd′ multiplication. The polynomial 1 is the neutral element, which is easy to see.
mials a(x) = i=0 ai xi of degree d and b(x) = i=0 bi xi of degree d′ . The sum
of a(x) and b(x) is a polynomial of degree at most max(d, d′ ) and is defined as That multiplication is associative can be seen as follows. Let a(x) and b(x) as
P ′′
above, and c(x) = di=0 ci xi . Using the above definition of a(x)b(x), we have
max(d,d′ )
X ′
+d′′
a(x) + b(x) = (ai + bi ) xi , d+d
X Xi X
a(x)b(x) c(x) = au bv ci−j xi
i=0
i=0 j=0 u+v=j
31 Note that the terms ab (or b/a) are defined only if a | b.
32 The interpretation of “minus infinity” is that it is a quantity which remains unchanged when an
Pd+d′
33 Note that, for example, the highest coefficient
k=0 ak bi−k is in the formula defined as a sum
arbitrary integer is added to it. of d + d′ + 1 terms, but all but one of them (namely for k = d) are zero.
d+d ′
+d′′
!
X X Fields are of crucial importance because in a field one can not only add,
= (au bv )cw xi . subtract, and multiply, but one can also divide by any nonzero element. This
i=0 u+v+w=i is the abstraction underlying many algorithms like those for solving systems of
linear equations (e.g. by Gaussian elimination) or for polynomial interpolation.
If one computes a(x) b(x)c(x) , one arrives at the same expression, by mak-
Also, a vector space, a crucial concept in mathematics, is defined over a field,
ing use of associativity of multiplication in R, i.e., the fact that (au bv )cw = the so-called base field. Vector spaces over R are just a special case.
au (bv cw ) = au bv cw .
Condition (iii), the distributive law, can also be shown to follow from the Example 5.45. Solve the following system of linear equations over Z11 :
distributive law holding for R. 5x ⊕ 2y = 4
Lemma 5.22. 2x ⊕ 7y = 9
(i) If D is an integral domain, then so is D[x]. Solution: Eliminate x by adding 2 times the first and ⊖5 = 6 times the second
∗ ∗ equation, resulting in
(ii) The units of D[x] are the constant polynomials that are units of D: D[x] = D .
(2 ⊙ 5 ⊕ 6 ⊙ 2) x + (2 ⊙ 2 ⊕ 6 ⊙ 7) y = 2 ⊙ 4 ⊕ 6 ⊙ 9,
Proof. Left as an exercise. | {z } | {z } | {z }
=0 =2 =7
Example 5.43. Lemma 5.22 implies that for an integral domain D, the set D[x][y] which is equivalent to 2y = 7. Thus y = 2 −1
⊙ 7 = 6 ⊙ 7 = 9. This yields
of polynomials in y with coefficients in D[x], is also an integral domain. One can
also view the elements of D[x][y] as polynomials in two indeterminates, denoted x = 2−1 ⊙ (9 ⊖ 7 ⊙ y) = 6 ⊙ (9 ⊕ 4 ⊙ 9) = 6 ⊙ 1 = 6.
D[x, y].
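The elimination in Example 5.45 can be replayed numerically; in Python,
pow(a, -1, p) computes the multiplicative inverse in Zp. This is only a sketch of
the two-equation case, not a general solver.

    p = 11                                    # 5x + 2y = 4 and 2x + 7y = 9 over Z_11
    y = (pow(2, -1, p) * 7) % p               # after eliminating x: 2y = 7
    x = (pow(5, -1, p) * (4 - 2 * y)) % p     # back-substitution into 5x + 2y = 4
    assert (5 * x + 2 * y) % p == 4 and (2 * x + 7 * y) % p == 9
    print(x, y)                               # 6 9, as computed in the text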
Later we will describe a few applications of finite fields. Here we give an-
other example of a finite field.
5.5.6 Fields
Example 5.46. We describe a field with 4 elements, F = {0, 1, A, B}, by giving
the function tables of addition and multiplication:
Definition 5.26. A field34 is a nontrivial commutative ring F in which every
nonzero element is a unit, i.e., F ∗ = F \ {0}.
+ 0 1 A B · 0 1 A B
In other words, a ring F is a field if and only if hF \ {0}; ·,−1 , 1i is an abelian 0 0 1 A B 0 0 0 0 0
group. 1 1 0 B A 1 0 1 A B
A A B 0 1 A 0 A B 1
Example 5.44. Q, R, and C are fields, but Z and R[x] (for any ring R) are not B B A 1 0 B 0 B 1 A
fields.
This field is not isomorphic to the ring Z4 , which is not a field. We explain the
Theorem 5.23. Zp is a field if and only if p is prime. construction of this 4-element field in Section 5.8.2.
Theorem 5.24. A field is an integral domain.
Proof. This follows from our earlier analysis of Z∗p , namely that Zp \ {0} is a Proof. In a field, every non-zero element is a unit, and in an integral domain,
multiplicative group if and only if p is prime. every non-zero element must not be a zero-divisor. It hence suffices to show
that in any commutative ring, a unit u ∈ R is not a zerodivisor. To arrive at a
In the following we denote the field with p elements by GF(p) rather than contradiction, assume that uv = 0 for some v. Then we must have v = 0 since
Zp . As explained later, “GF” stands for Galois field. Galois discovered finite
fields around 1830. v = 1v = u−1 uv = u−1 0 = 0,
34 German: Körper and hence u is not a zerodivisor.
5.6 Polynomials over a Field It follows immediately from the definition (and from the fact that the degrees
are added when polynomials are multiplied) that every polynomial of degree 1
Recall Definition 5.25 of a polynomial over a ring R. As mentioned several is irreducible. Moreover, a polynomial of degree 2 is either irreducible or the
times, the set R[x] of polynomials over R is a ring with respect to the standard product of two polynomials of degree 1. A polynomial of degree 3 is either
polynomial addition and multiplication. The neutral elements of addition and irreducible or it has at least one factor of degree 1. Similarly, a polynomial of
multiplication are the polynomials 0 and 1, respectively. degree 4 is either irreducible, has a factor of degree 1, or has an irreducible
factor of degree 2.
Polynomials over a field F are of special interest, for reasons to become clear.
Namely, they have properties in common with the integers, Z. Irreducibility of a polynomial of degree d can be checked by testing all irre-
ducible polynomials of degree ≤ d/2 as possible divisors. Actually, it suffices to
test only the monic polynomials because one could always multiply a divisor by
5.6.1 Factorization and Irreducible Polynomials a constant, for example the inverse of the highest coefficient. This irreducibil-
ity test is very similar to the primality test which checks all divisors up to the
For a, b ∈ Z, if b divides a, then also −b divides a. The analogy for polynomials square root of the number to be tested.
is as follows. If b(x) divides a(x), then so does v · b(x) for any nonzero v ∈ F be-
cause if a(x) = b(x) · c(x), then a(x) = (v · b(x)) · (v^(−1) · c(x)). Among the polynomials
vb(x) (for v ∈ F ), which are in a certain sense associated to each other, there is does not divide x2 + 2, for all α ∈ GF(5).
a distinguished one, namely that with leading coefficient 1. This is analogous
to b and −b being associated in Z (see Section 5.6.3) and the positive one being
Example 5.50. In GF(7)[x], x2 + 4x + 6 is irreducible since x + α does not divide
distinguished.
x2 + 4x + 6, for all α ∈ GF(7).
Definition 5.27. A polynomial a(x) ∈ F [x] is called monic35 if the leading coef-
ficient is 1. Example 5.51. In GF(2)[x], x2 + x + 1 is irreducible because neither x nor x + 1
divides x2 + x + 1. It is easy to see that this is the only irreducible polynomial
Example 5.47. In GF(5)[x], x + 2 divides x2 + 1 since x2 + 1 = (x + 2)(x + 3). of degree 2 over GF(2). There are two irreducible polynomials of degree 3 over
Also, 2x + 4 divides x2 + 1 since x2 + 1 = (2x + 4)(3x + 4). More generally, GF(2), namely x3 + x + 1 and x3 + x2 + 1. Moreover, there are three irreducible
v · (x + 2) divides x2 + 1 for any v ∈ GF(5) because x2 + 1 = v(x + 2) · v −1 (x + 3). polynomials of degree 4 over GF(2), which the reader can find as an exercise.
Let F be a field. The ring F [x] has strong similarities with the integers Z (see (x4 + x3 + x2 + 1) : (x2 + 1) = x2 + x with remainder x + 1.
Section 5.6.3). Both these integral domains have the special property that one
can divide one element a by another element b 6= 0, resulting in a quotient q Note that in GF(2), −1 = 1. For example, −x = x and −x2 = x2
and a remainder r which are unique when r is required to be “smaller” than
the divisor. In case of the integers, the “size” of b ∈ Z is given by the absolute 5.6.3 Analogies Between Z and F [x], Euclidean Domains *
value |b|, and the “size” of a polynomial b(x) ∈ F [x] can be defined as its degree
deg(b(x)). In this section we describe the abstraction underlying both Z and F [x].
The units in Z are 1 and −1 and the units in F [x] are the non-zero constant polyno-
Proof sketch. We first prove the existence of q(x) and r(x) and then the unique-
mials (of degree 0). In Z, a and −a are associates.
ness. If deg(b(x)) > deg(a(x)), then q(x) = 0 and r(x) = a(x). We thus assume
that deg(b(x)) ≤ deg(a(x)). Let a(x) = am xm + · · · and b(x) = bn xn + · · · with Example 5.56. In Z, 6 and −6 are associates. In GF(5)[x], x2 + 2x + 3, 2x2 + 4x + 1,
n ≤ m, where “· · ·” stands for lower order terms. The first step of polynomial 3x2 + x + 4, and 4x2 + 3x + 2 are associates.
division consists of subtracting a_m b_n^(−1) b(x) x^(m−n) from a(x), resulting in a poly-
nomial of degree at most m − 1.36 Continuing polynomial division finally yields For a ∈ D one can define one associate to be distinguished. For Z the distinguished
q(x) and r(x), where deg(r(x)) < deg(b(x)) since otherwise one could still sub- associate of a is |a|, and for a(x) ∈ F [x] the distinguished associate of a(x) is the monic
polynomial associated with a(x). If we consider only the distinguished associates of
tract a multiple of b(x).
irreducible elements, then for Z we arrive at the usual notion of prime numbers.39
To prove the uniqueness, suppose that We point out that the association relation is closely related to divisibility. The proof
′ ′ of the following lemma is left as an exercise.
a(x) = b(x)q(x) + r(x) = b(x)q (x) + r (x),
Lemma 5.26. a ∼ b ⇐⇒ a | b ∧ b | a.
where deg(r(x)) < deg(b(x)) and deg(r′ (x)) < deg(b(x)). Then
There is one more crucial property shared by both integral domains Z and F [x] (for
b(x)[q(x) − q ′ (x)] = r′ (x) − r(x). any field F ), described in the following abstract definition.
Since deg(r′ (x) − r(x)) < deg(b(x)), this is possible only if q(x) − q ′ (x) = 0, i.e., Definition 5.32. A Euclidean domain is an integral domain D together with a so-called
q(x) = q ′ (x), which also implies r′ (x) = r(x).37 degree function d : D \ {0} → N such that
(i) For every a and b 6= 0 in D there exist q and r such that a = bq + r and d(r) < d(b)
In analogy to the notation Rm (a), we will denote the remainder r(x) of the
or r = 0.
above theorem by Rb(x) (a(x)).
(ii) For all nonzero a and b in D, d(a) ≤ d(ab).
Example 5.54. Let F be the field GF(7) and let a(x) = x3 + 2x2 + 5x + 4 and √
b(x) = 2x2 + x + 1. Then q(x) = 4x + 6 and r(x) = 2x + 5 since Example 5.57. The Gaussian√ integers Z[ −1] discussed earlier are a Euclidean domain
where the degree of a + bi is a2 + b2 , i.e., the absolute value (of complex numbers).
(x3 + 2x2 + 5x + 4) = (2x2 + x + 1) · (4x + 6) + (2x + 5),
One can prove that in a Euclidean domain, the greatest (according to the degree func-
36 Note that here it is important that F is a field since otherwise the existence of b−1
n is not guar- tion) common divisor is well-defined, up to taking associates, i.e., up to multiplication
anteed.
37 Note that here we have made use of the fact that deg(u(x)v(x)) = deg(u(x)) + deg(v(x)). We 38 In
other words, p is divisible only by units and associates of p.
point out that this only holds in an integral domain. (Why?) Recall that a field is an integral domain 39 There is a notion of a prime element of a ring,
which is different from the notion of an irreducible
(Theorem 5.24). element, but for the integers Z the two concepts coincide.
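Division with remainder in F [x], as used in Example 5.54 above, can be sketched
as follows (Python, not from the script; polynomials over GF(p) are coefficient
lists [a0, . . . , ad]).

    def poly_divmod(a, b, p):
        a, q = a[:], [0] * max(len(a) - len(b) + 1, 1)
        inv = pow(b[-1], -1, p)                    # inverse of the leading coefficient of b
        while len(a) >= len(b):
            if a[-1] == 0:
                a.pop(); continue
            c, shift = (a[-1] * inv) % p, len(a) - len(b)
            q[shift] = c                           # next quotient coefficient
            for i in range(len(b)):                # subtract c * b(x) * x^shift
                a[shift + i] = (a[shift + i] - c * b[i]) % p
            a.pop()
        return q, a                                # quotient and remainder

    a = [4, 5, 2, 1]                  # x^3 + 2x^2 + 5x + 4 over GF(7)
    b = [1, 1, 2]                     # 2x^2 + x + 1
    q, r = poly_divmod(a, b, 7)
    assert q == [6, 4] and r == [5, 2]     # q(x) = 4x + 6, r(x) = 2x + 5, as in Example 5.54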
by a unit. The condition d(r) < d(b) guarantees that the gcd can be computed in the Proof. (=⇒) Assume that α is a root, i.e., a(α) = 0. Then, according to Theo-
well-known manner by continuous division. This procedure terminates because d(r) rem 5.25, we can write a(x) as
decreases monotonically in each division step.
The following theorem can be proved in a manner analogous to the proof of the a(x) = (x − α)q(x) + r(x),
unique factorization theorem for Z. One step is to show that a Euclidean domain is a
principal ideal domain.
Theorem 5.27. In a Euclidean domain every element can be factored uniquely (up to taking r = a(x) − (x − α)q(x).
associates) into irreducible elements.
Setting x = α in the above equation gives
• If c(x) = a(x) · b(x), then c(α) = a(α) · b(α) for any α. Theorem 5.31. For a field F , a nonzero42 polynomial a(x) ∈ F [x] of degree d has at
most d roots.
5.7.2 Roots Proof. To arrive at a contradiction, suppose a(x) has degree d but e > d
Qthat
e
roots, say α1 , . . . , αe . Then the polynomial i=1 (x − αi ) divides a(x). Since this
Definition 5.33. Let a(x) ∈ R[x]. An element α ∈ R for which a(α) = 0 is called is a polynomial of degree e, a(x) has degree at least e, and hence more than d,
a root40 of a(x). which is a contradiction.
Example 5.59. The polynomial x3 − 7x + 6 in R[x] has 3 roots: −3, 1, and 2. The
polynomial x2 + 1 in R[x] has no root. The polynomial (x3 + 2x2 + 5x + 2) in 5.7.3 Polynomial Interpolation
GF(7)[x] has 2 as its only root. The polynomial (x4 + x3 + x + 1) in GF(2)[x] has
It is well-known that a polynomial of degree d over R can be interpolated from
the root 1.
any d + 1 values. Since the proof requires only the properties of a field (rather
than the special properties of R), this interpolation property holds for polyno-
Lemma 5.29. For a field F , α ∈ F is a root of a(x) if and only if x − α divides a(x). mials over any field F . This fact is of crucial importance in many applications.
41 Note that this statement is not true for polynomials of degree ≥ 4.
40 German: Nullstelle oder Wurzel 42 Note that every α ∈ F is a root of the polynomial 0.
Lemma 5.32. A polynomial a(x) ∈ F [x] of degree at most d is uniquely determined by The proof of the following lemma is analogous to the proof that congruence
any d+1 values of a(x), i.e., by a(α1 ), . . . , a(αd+1 ) for any distinct α1 , . . . , αd+1 ∈ F . modulo m is an equivalence relation on Z.
Lemma 5.33. Congruence modulo m(x) is an equivalence relation on F [x], and each
equivalence class has a unique representative of degree less than deg(m(x)).

Example 5.60. Consider R[x] or Q[x]. We have, for example,

    5x^3 − 2x + 1 ≡_{3x^2+2} 8x^3 + 1 ≡_{3x^2+2} −(16/3)x + 1

as one can easily check. Actually, the remainder when 5x^3 − 2x + 1 is divided
by 3x^2 + 2 is −(16/3)x + 1.

Example 5.61. Consider GF(2)[x]. Example 5.55 can be rephrased as R_{x^2+1}(x^4 +
x^3 + x^2 + 1) = x + 1.

Proof. Let βi = a(αi ) for i = 1, . . . , d + 1. Then a(x) is given by Lagrange’s
interpolation formula:

    a(x) = ∑_{i=1}^{d+1} βi ui (x),

where the polynomial ui (x) is given by

    ui (x) = [(x − α1 ) · · · (x − αi−1 )(x − αi+1 ) · · · (x − αd+1 )] / [(αi − α1 ) · · · (αi − αi−1 )(αi − αi+1 ) · · · (αi − αd+1 )].
Note that for ui (x) to be well-defined, all constant terms αi − αj in the denomi-
nator must be invertible. This is guaranteed if F is a field since αi − αj 6= 0 for Definition 5.34. Let m(x) be a polynomial of degree d over F . Then
i 6= j. Note also that the denominator is simply a constant and hence ui (x) is in- def
deed a polynomial of degree d. It is easy to verify that ui (αi ) = 1 and ui (αj ) = 0 F [x]m(x) = a(x) ∈ F [x] deg(a(x)) < d .
P
for j 6= i. Thus the polynomials a(x) and d+1 i=1 βi ui (x) agree when evaluated at We state a simple fact about the cardinality of F [x]m(x) when F is finite.
any αi , for all i. We note that a(x) has degree at most d.43
It remains to prove the uniqueness. Suppose there is another polynomial Lemma 5.34. Let F be a finite field with q elements and let m(x) be a polynomial of
a′ (x) of degree at most d such that βi = a′ (αi ) for i = 1, . . . , d + 1. Then a(x) − degree d over F . Then |F [x]m(x) | = q d .
a′ (x) is also a polynomial of degree at most d, which (according to Theorem 5.31)
Proof. We have
can have at most d roots, unless it is 0. But a(x) − a′ (x) has indeed the d + 1 roots
α1 , . . . , αd+1 . Thus it must be the 0-polynomial, which implies a(x) = a′ (x). F [x]m(x) = ad−1 xd−1 + · · · + a1 x + a0 a0 , . . . , ad−1 ∈ F .
There are q d choices for a0 , . . . , ad−1 .
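Lagrange interpolation over a prime field, following the formula in the proof of
Lemma 5.32, can be sketched as follows (Python, illustrative only; the polyno-
mial is returned as a coefficient list).

    def interpolate(points, p):                    # points: list of pairs (alpha_i, beta_i)
        def mul(a, b):                             # polynomial product mod p
            c = [0] * (len(a) + len(b) - 1)
            for i, ai in enumerate(a):
                for j, bj in enumerate(b):
                    c[i + j] = (c[i + j] + ai * bj) % p
            return c
        result = [0] * len(points)
        for i, (alpha_i, beta_i) in enumerate(points):
            u, denom = [1], 1                      # build u_i(x) and its denominator
            for j, (alpha_j, _) in enumerate(points):
                if j != i:
                    u = mul(u, [(-alpha_j) % p, 1])            # factor (x - alpha_j)
                    denom = (denom * (alpha_i - alpha_j)) % p
            coef = (beta_i * pow(denom, -1, p)) % p
            result = [(r + coef * c) % p for r, c in zip(result, u + [0] * len(points))]
        return result

    pts = [(x, (2 * x * x + 3 * x + 1) % 7) for x in (0, 1, 3)]
    assert interpolate(pts, 7) == [1, 3, 2]        # recovers 2x^2 + 3x + 1 over GF(7)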
5.8 Finite Fields F [x]m(x) is derived from F [x] in close analogy to how the ring Zm is derived
from the ring Z.
So far we have seen the finite field GF (p), where p is prime. In this section we
discuss all other finite fields. Lemma 5.35. F [x]m(x) is a ring with respect to addition and multiplication mod-
ulo m(x).44
5.8.1 The Ring F [x]m(x)
Proof. F [x]m(x) is a group with respect to polynomial addition.45 The neutral
We continue to explore the analogies between the rings Z and F [x]. In the same element is the polynomial 0 and the negative of a(x) ∈ F [x]m(x) is −a(x). Asso-
way as we can compute in the integers Z modulo an integer m, yielding the ring ciativity is inherited from F [x].
hZm ; ⊕, ⊖, 0, ⊙, 1i, we can also compute in F [x] modulo a polynomial m(x). Let 44 It is important to point out that we are considering three algebraic systems, namely F , F [x],
Rm(x) (a(x)) denote the (unique) remainder when a(x) is divided by m(x). The and F [x]m(x) . Each system has an addition and a multiplication operation, and we use the same
concept of congruence modulo m(x) is defined like congruence modulo m. For symbols “+” and “·” in each case, letting the context decide which one we mean. This should cause
a(x), b(x) ∈ F [x], no confusion. The alternative would have been to always use different symbols, but this would
be too cumbersome. Note that, as mentioned above, addition (but not multiplication) in F [x] and
def F [x]m(x) are identical.
a(x) ≡m(x) b(x) ⇐⇒ m(x) | a(x) − b(x) . 45 Note that the sum of two polynomials is never reduced modulo m(x) because the degree of the
sum is at most the maximum of the two degrees. In other words, a(x)+ b(x) in F [x] and a(x)+ b(x)
43 The degree can be smaller than d. in F [x]m(x) are the same operation when restricted to polynomials of degree less than deg(m(x)).
(1811–1832).48 In his honor, finite fields are called Galois fields. A field with q elements is
+ 0 1 x x+1 x2 x2 + 1 x2 + x x2 + x + 1 usually denoted by GF(q) (independently of how it is constructed).
0 0 1 x x+1 x2 x2 + 1 x2 + x x2 + x + 1
1 0 x+1 x 2
x +1 x2 2
x +x+1 x2 + x Theorem 5.38. For every prime p and every d ≥ 1 there exists an irreducible polynomial of
x 0 1 x2 + x x2 + x + 1 x2 x2 + 1 degree d in GF(p)[x]. In particular, there exists a finite field with pd elements.
x+1 0 x2 + x + 1 x2 + x 2
x +1 x2
The following theorem states that the finite fields we have seen so far, Zp for prime
x2 0 1 x x+1
2 p and GF(p)[x]m(x) for an irreducible m(x), are all finite fields. There are no other finite
x +1 0 x+1 x fields. Moreover, one obtains no new finite fields by taking an irreducible polynomial,
x2 + x 0 1 ′
say of degree d′ , over some extension field GF(pd ), resulting in the field GF(pdd ). Such
x2 + x + 1 0
a field can always be constructed directly using an irreducible polynomial of degree dd′
over GF(p).
Figure 5.2: The addition table for GF(8) constructed with the irreducible poly-
nomial x3 + x + 1. Theorem 5.39. There exists a finite field with q elements if and only if q is a power of a prime.
Moreover, any two finite fields of the same size q are isomorphic.
The last claim justifies the use of the notation GF(q) without making explicit how the
· 0 1 x x+1 x2 x2 + 1 x2 + x x2 + x + 1
field is constructed. Different constructions yield different representations (naming) of
0 0 0 0 0 0 0 0 0 the field elements, but not different fields. However, it is possible that some representa-
1 1 x x+1 x2 x2 + 1 x2 + x x2 + x + 1 tions are better suited than others for the efficient hardware or software implementation
x x2 x2 + x x+1 1 x2 + x + 1 x2 + 1 of the field arithmetic.
x+1 x2 + 1 x2 + x + 1 x2 1 x
x2 x2 + x x x2 + 1 1 Theorem 5.40. The multiplicative group of every finite field GF(q) is cyclic.
x2 + 1 x2 + x + 1 x+1 x2 + x
Note that the multiplicative group of GF(q) has order q − 1 and has ϕ(q − 1) genera-
x2 + x x x2
tors.
x2 + x + 1 x+1
Example 5.65. One can check that the fields GF(22 ) and GF(23 ) have multiplicative
groups of orders 3 and 7, which are both prime. Therefore all elements except 1 (and
Figure 5.3: The multiplication table for GF(8) constructed with the irreducible
0 of course) are generators of the multiplicative group.
polynomial x3 + x + 1.
Example 5.64. The polynomial x3 + x + 1 over GF(2) is irreducible since it has 5.9 Application: Error-Correcting Codes
no roots (it evaluates to 1 for both 0 and 1). The field GF(8) (also written GF(23 ))
consists of the 8 polynomials of degree ≤ 2 over GF(2). The tables for addition 5.9.1 Definition of Error-Correcting Codes
and multiplication are shown in Figures 5.2 and 5.3. In this field we have, for
example, Finite fields are of great importance in Computer Science and have many appli-
cations, one of which is discussed in this section.
(x + 1)/(x2 + 1) = (x + 1)(x2 + 1)−1 = (x + 1)x = x2 + x. Error-correcting codes are used in many communication protocols and other
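Arithmetic in GF(8), constructed with the irreducible polynomial x^3 + x + 1 as
above, can be sketched compactly by storing a field element b2 x^2 + b1 x + b0 as
the 3-bit integer b2 b1 b0 (so x is 0b010). The following Python sketch is illustra-
tive only.

    def gf8_add(a, b):
        return a ^ b                               # addition of GF(2)-polynomials is XOR

    def gf8_mul(a, b):
        result = 0
        for i in range(3):                         # schoolbook multiplication ...
            if (b >> i) & 1:
                result ^= a << i
        for i in (4, 3):                           # ... then reduction modulo x^3 + x + 1
            if (result >> i) & 1:
                result ^= 0b1011 << (i - 3)        # 0b1011 encodes x^3 + x + 1
        return result

    X, ONE = 0b010, 0b001
    assert gf8_mul(X ^ ONE, X) == 0b110            # (x + 1) * x = x^2 + x, as computed above
    assert gf8_mul(0b101, X) == ONE                # (x^2 + 1) * x = 1, so (x^2 + 1)^(-1) = x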
applications. For example, the digital information on a CD is stored in such a
manner that even if some of the information is lost (e.g. because of a scratch or
5.8.3 Some Facts About Finite Fields * dirt on the disc), the entire information can still be reconstructed without quality
degradation, as long as sufficiently much of the data is still available.
Theorem 5.37 gives us a method for constructing a new field from an existing field F ,
48 His interest was in proving that polynomial equations over R of fifth or higher degree have in
provided we can find an irreducible polynomial m(x) in F [x]. When F is a finite field,
then so is F [x]m(x) . The proofs of most facts stated in this section are beyond the scope general no closed form solution in radicals (while equations of up to fourth degree do). His major
contributions to mathematics were recognized by the leading mathematicians only many years after
of this course. his death. He died in a duel at the age of 21. The story goes that he wrote down major parts of his
The theory of finite fields was founded by the French mathematician Evariste Galois theory during the last night before the duel.
There are two types of problems that can occur in data transmission or when The idea is that a good decoding function takes an arbitrary list
reading data from a storage medium. First, data can be erased, meaning that (r0 , . . . , rn−1 ) ∈ An of symbols49 and decodes it to the most plausible (in some
when reading (or receiving) it one realizes that it is missing. Second, data can sense) information vector (a0 , . . . , ak−1 ). Moreover, a good decoding function
contain errors. The second type of problem is more severe because it is not even should be efficiently computable.
known where in a data stream the errors occurred. A good error-correcting The error-correcting capability of a code C can be characterized in terms of
scheme can handle both problems. the number t of errors that can be corrected. More precisely:
Definition 5.35. A (n, k)-encoding function E for some alphabet A is an injective Definition 5.40. A decoding function D is t-error correcting for encoding func-
function that maps a list (a0 , . . . , ak−1 ) ∈ Ak of k (information) symbols to a list tion E if for any (a0 , . . . , ak−1 )
(c0 , . . . , cn−1 ) ∈ An of n > k (encoded) symbols in A, called codeword:
D (r0 , . . . , rn−1 ) = (a0 , . . . , ak−1 )
E : Ak → An : (a0 , . . . , ak−1 ) 7→ E((a0 , . . . , ak−1 )) = (c0 , . . . , cn−1 ).
for any (r0 , . . . , rn−1 ) with Hamming distance at most t from E (a0 , . . . , ak−1 ) .
For an encoding function E one often considers the set
t-error correcting.
C = Im(E) = {E((a0 , . . . , ak−1 )) | a0 , . . . , ak−1 ∈ A}
of codewords, which is called an error-correcting code. Theorem 5.41. A code C with minimum distance d is t-error correcting if and only if
d ≥ 2t + 1.
Definition 5.36. An (n, k)-error-correcting code over the alphabet A with |A| = q
is a subset of An of cardinality q k .
It is natural to use as the alphabet A = {0, 1}, i.e., to take bits as the basic unit Proof. (⇐=) If any two codewords have Hamming distance at least 2t + 1 (i.e.,
of information. However, for several reasons (one being the efficiency of encod- differ in at least 2t+1 positions), then it is impossible that a word (r0 , . . . , rn−1 ) ∈
ing and in particular decoding), one often considers larger units of information, An could result from two different codewords by changing t positions. Thus if
for example bytes (i.e., A = {0, 1}8). (r0 , . . . , rn−1 ) has distance at most t from a codeword (c0 , . . . , cn−1 ), then this
codeword is uniquely determined. The decoding function D can be defined to
Definition 5.37. The Hamming distance between two strings of equal length over decode to (one of) the nearest codeword(s) (more precisely, to the information
a finite alphabet A is the number of positions at which the two strings differ. resulting (by E) in that codeword).
(=⇒) If there are two codewords that differ in at most 2t positions, then there
exists a word (r0 , . . . , rn−1 ) which differs from both codewords in at most t po-
Definition 5.38. The minimum distance of an error-correcting code C, denoted
sitions; hence it is possible that t errors can not be corrected.
dmin (C), is the minimum of the Hamming distance between any two codewords.
Example 5.66. The following code is a (5, 2)-code over the alphabet {0, 1}: Example 5.67. A code with minimum distance d = 5 can correct t = 2 errors.
The code in Example 5.66 can correct a single error (t = 1).
{(0, 0, 0, 0, 0), (1, 1, 1, 0, 0), (0, 0, 1, 1, 1), (1, 1, 0, 1, 1)}.
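The minimum distance of the code in Example 5.66 is easily computed. A short
Python sketch (not from the script):

    from itertools import combinations

    def hamming(u, v):
        return sum(1 for a, b in zip(u, v) if a != b)

    C = [(0,0,0,0,0), (1,1,1,0,0), (0,0,1,1,1), (1,1,0,1,1)]
    d_min = min(hamming(u, v) for u, v in combinations(C, 2))
    print(d_min)     # 3, so by Theorem 5.41 the code corrects t = 1 error (Example 5.67)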
Theorem 5.42. Let A = GF(q) and let α0 , . . . , αn−1 be arbitrary distinct elements of
GF(q). Consider the encoding function

    E((a0 , . . . , ak−1 )) = (a(α0 ), . . . , a(αn−1 )),

where a(x) = a_{k−1} x^(k−1) + · · · + a1 x + a0 . This code has minimum distance n − k + 1.
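Before turning to the proof, the encoding and its distance can be illustrated nu-
merically. The following Python sketch (not from the notes) builds the complete
code for the small, arbitrarily chosen parameters q = 7, k = 2, n = 5 and checks
that its minimum distance is n − k + 1.

    from itertools import product, combinations

    p, k, n = 7, 2, 5
    alphas = list(range(n))                        # n distinct elements of GF(7)

    def encode(msg):                               # msg = (a_0, ..., a_{k-1})
        return tuple(sum(a * pow(x, i, p) for i, a in enumerate(msg)) % p for x in alphas)

    code = [encode(msg) for msg in product(range(p), repeat=k)]

    def hamming(u, v):
        return sum(1 for a, b in zip(u, v) if a != b)

    assert min(hamming(u, v) for u, v in combinations(code, 2)) == n - k + 1   # = 4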
Proof. The polynomial a(x) of degree k−1 can be interpolated from any k values,
i.e., from any k codeword symbols. If two polynomials agree for k arguments
(or, equivalently, if two codewords agree at k positions), then they are equal.
This means that two different codewords cannot agree at k positions. Hence any
two codewords disagree in at least n − k + 1 positions.

An (n, k)-code over the field GF(2^d) can be interpreted as a binary (dn, dk)-
code (over GF(2)). The minimum distance of this binary code is at least that of
the original code because two different GF(2^d)-symbols must differ in at least
one bit (but can of course differ in more than one bit).

Example 5.68. Polynomial codes as described are used for storing information
on Compact Discs. In fact, the coding scheme of CD’s makes use of two different
such codes, but explaining the complete scheme is beyond the scope of this
course on discrete mathematics. The field is GF(2^8) defined by an irreducible
polynomial of degree 8 over GF(2) and the two codes are a (32, 28)-code over
GF(2^8) and a (28, 24)-code over GF(2^8), both with minimum distance 5.

Logic

6.1 Introduction

In Chapter 2 we have introduced some basic concepts of logic, but the treat-
ment was quite informal. In this chapter we discuss the foundations of logic
in a mathematically rigorous manner. In particular, we clearly distinguish be-
tween the syntax and the semantics of a logic and between syntactic derivations
of formulas and logical consequences they imply. We also introduce the con-
cept of a logical calculus and define soundness and completeness of a calculus.
Moreover, we discuss in detail a concrete calculus for propositional logic, the
so-called resolution calculus.

At a very general level, the goal of logic is to provide a framework for ex-
pressing mathematical statements and for expressing and verifying proofs for
such statements. A more ambitious, secondary goal can be to provide tools for
automatically or semi-automatically generating a proof for a given statement.

A treatment of logic usually begins with a chapter on propositional logic1
(see Section 6.5), followed by a chapter on predicate (or first-order) logic2 (see
Section 6.6), which can be seen as an extension of propositional logic. There are
several other logics which are useful in Computer Science and in mathematics,
including temporal logic, modal logic, intuitionistic logic, and logics for rea-
soning about knowledge and about uncertainty. Most if not all relevant logics
contain the logical operators from propositional logic, i.e., ∧, ∨, ¬ (and the de-
rived operators → and ↔), as well as the quantifiers (∀ and ∃) from predicate
logic.

Our goal is to present the general concepts that apply to all types of log-
ics in a unified manner, and then to discuss the specific instantiations of these

1 German: Aussagenlogik
2 German: Prädikatenlogik
concepts for each logic individually. Therefore we begin with such a general treatment (see Sections 6.2, 6.3, and 6.4) before discussing propositional and predicate logic. From a didactic viewpoint, however, it will be useful to switch back and forth between the generic concepts of Sections 6.2, 6.3, and 6.4 and the concrete instantiations of Sections 6.5 and 6.6.
We give a general warning: Different treatments of logic often use slightly or sometimes substantially different notation.3 Even at the conceptual level there are significant differences. One needs to be prepared to adopt a particular notation used in a particular application context. However, the general principles explained here are essentially standard.
We also refer to the book by Kreuzer and Kühling and that by Schöning mentioned in the preface of these lecture notes.

6.2 Proof Systems

6.2.1 Definition

In a formal treatment of mathematics, all objects of study must be described in a well-defined syntax. Typically, syntactic objects are finite strings over some alphabet Σ, for example the symbols allowed by the syntax of a logic or simply the alphabet {0, 1}, in which case syntactic objects are bit-strings. Recall that Σ∗ denotes the set of finite strings of symbols from Σ.
In this section, the two types of mathematical objects we study are

• mathematical statements of a certain type and
• proofs for this type of statements.

By a statement type we mean for example the class of statements of the form that a given number n is prime, or the class of statements of the form that a given graph G has a Hamiltonian cycle (see below), or the class of statements of the form that a given formula F in propositional logic is satisfiable.
Consider a fixed type of statements. Let S ⊆ Σ∗ be the set of (syntactic representations of) mathematical statements of this type, and let P ⊆ Σ∗ be the set of (syntactic representations of) proof strings.4
Every statement s ∈ S is either true or false. The truth function

τ : S → {0, 1}

assigns to each s ∈ S its truth value τ(s). This function τ defines the meaning, called the semantics, of objects in S.5
An element p ∈ P is either a (valid) proof for a statement s ∈ S, or it is not. This can be defined via a verification function

φ : S × P → {0, 1},

where φ(s, p) = 1 means that p is a valid proof for statement s.
Without significant loss of generality we can in this section consider

S = P = {0, 1}∗,

with the understanding that any string in {0, 1}∗ can be interpreted as a statement by defining syntactically wrong statements as being false statements.

Definition 6.1. A proof system6 is a quadruple Π = (S, P, τ, φ), as above.

We now discuss the two fundamental requirements for proof systems.

Definition 6.2. A proof system Π = (S, P, τ, φ) is sound7 if no false statement has a proof, i.e., if for all s ∈ S for which there exists p ∈ P with φ(s, p) = 1, we have τ(s) = 1.

Definition 6.3. A proof system Π = (S, P, τ, φ) is complete8 if every true statement has a proof, i.e., if for all s ∈ S with τ(s) = 1, there exists p ∈ P with φ(s, p) = 1.

In addition to soundness and completeness, one requires that the function φ be efficiently computable (for some notion of efficiency).9 We will not make this formal, but it is obvious that a proof system is useless if proof verification is computationally infeasible. Since the verification generally has to examine the entire proof, the length of the proof cannot be infeasibly long.10

3 For example, in some treatments the symbol ⇒ is used for →, which can be confusing.
4 Membership in S and also in P is assumed to be efficiently checkable (for some notion of efficiency).
5 In the context of logic discussed from the next section onwards, the term semantics is used in a specific restricted manner that is compatible with its use here.
6 The term proof system is also used in different ways in the mathematical literature.
7 German: korrekt
8 German: vollständig
9 The usual efficiency notion in Computer Science is so-called polynomial-time computable, which we do not discuss further.
10 An interesting notion introduced in 1998 by Arora et al. is that of a probabilistically checkable proof (PCP). The idea is that the proof can be very long (i.e., exponentially long), but that the verification only examines a very small random selection of the bits of the proof and nevertheless can decide correctness, except with very small error probability.
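To make Definitions 6.1 to 6.3 concrete, the following small Python sketch (not part of the notes; the toy statement semantics, the encoding, and all names are our own choices) builds a proof system over short bit strings and checks its soundness and completeness by brute force.

from itertools import product

def bitstrings(max_len):
    """All bit strings of length 0..max_len (as Python strings)."""
    for k in range(max_len + 1):
        for bits in product("01", repeat=k):
            yield "".join(bits)

# Toy statement semantics tau: "s is true iff s contains at least one 1".
def tau(s):
    return 1 if "1" in s else 0

# Toy verification function phi: p proves s iff p points to a position of a 1 in s.
def phi(s, p):
    i = int(p, 2) if p else 0          # interpret the proof as a binary number
    return 1 if i < len(s) and s[i] == "1" else 0

def sound_and_complete(max_len):
    S = list(bitstrings(max_len))
    P = list(bitstrings(max_len))
    sound = all(tau(s) == 1 for s in S for p in P if phi(s, p) == 1)
    complete = all(any(phi(s, p) == 1 for p in P) for s in S if tau(s) == 1)
    return sound, complete

print(sound_and_complete(4))   # (True, True) for this toy proof system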
6.2.2 Examples

Example 6.1. An undirected graph consists of a set V of nodes and a set E of edges between nodes. Suppose that V = {0, . . . , n − 1}. A graph can then be described by the so-called adjacency matrix, an n × n-matrix M with {0, 1}-entries, where Mi,j = 1 if and only if there is an edge between nodes i and j. A graph with n nodes can hence be represented by a bit-string of length n2, by reading out the entries of the matrix row by row.
We are now interested in proving that a given graph has a so-called Hamiltonian cycle, i.e., that there is a closed path from node 1 back to node 1, following edges between nodes, and visiting every node exactly once. We are also interested in the problem of proving the negation of this statement, i.e., that a given graph has no Hamiltonian cycle. Deciding whether or not a given graph has a Hamiltonian cycle is considered a computationally very hard decision problem (for large graphs).11
To prove that a graph has a Hamiltonian cycle, one can simply provide the sequence of nodes visited by the cycle. A value in V = {0, . . . , n − 1} can be represented by a bit-string of length ⌈log2 n⌉, and a sequence of n such numbers can hence be represented by a bit-string of length n⌈log2 n⌉. We can hence define S = P = {0, 1}∗.
Now we can let τ be the function defined by τ(s) = 1 if and only if |s| = n2 for some n and the n2 bits of s encode the adjacency matrix of a graph containing a Hamiltonian cycle. If |s| is not a square or if s encodes a graph without a Hamiltonian cycle, then τ(s) = 0.12 Moreover, we can let φ be the function defined by φ(s, p) = 1 if and only if, when s is interpreted as an n × n-matrix M and when p is interpreted as a sequence of n different numbers (a1, . . . , an) with ai ∈ {0, . . . , n − 1} (each encoded by a bit-string of length ⌈log2 n⌉), then the following is true:

Mai,ai+1 = 1 for i = 1, . . . , n − 1, and Man,a1 = 1.

This function φ is efficiently computable. The proof system is sound because a graph without a Hamiltonian cycle has no proof, and it is complete because every graph with a Hamiltonian cycle has a proof. Note that each s with τ(s) = 1 has at least n different proofs because the starting point in the cycle is arbitrary.

Example 6.2. Let us now consider the opposite problem of proving the inexistence of a Hamiltonian cycle in a given graph. In other words, in the above example we define τ(s) = 1 if and only if |s| = n2 for some n and the n2 bits of s encode the adjacency matrix of a graph not containing a Hamiltonian cycle. In this case, no sound and complete proof system (with reasonably short and efficiently verifiable proofs) is known. It is believed that no such proof system exists.

Example 6.3. Let again S = P = {0, 1}∗, and for s ∈ {0, 1}∗ let n(s) denote the natural number whose (standard) binary representation is s, with the convention that leading 0's are ignored. (For example, n(101011) = 43 and n(00101) = 5.) Now, let τ be the function defined as follows: τ(s) = 1 if and only if n(s) is not a prime number. Moreover, let φ be the function defined by φ(s, p) = 1 if and only if n(s) = 0, or if n(s) = 1, or if n(p) divides n(s) and 1 < n(p) < n(s). This function φ is efficiently computable. This is a proof system for the non-primality (i.e., compositeness) of natural numbers. It is sound because every s corresponding to a prime number n(s) has no proof since n(s) ≠ 0 and n(s) ≠ 1 and n(s) has no divisor d satisfying 1 < d < n(s). The proof system is complete because every natural number n greater than 1 is either prime or has a prime factor q satisfying 1 < q < n (whose binary representation can serve as a proof).

Example 6.4. Let us consider the opposite problem, i.e., proving primality of a number n(s) represented by s. In other words, in the previous example we replace “not a prime” by “a prime”. It is far from clear how one can define a verification function φ such that the proof system is sound and complete. However, such an efficiently computable function φ indeed exists. Very briefly, the proof that a number n(s) (henceforth we simply write n) is prime consists of (adequate representations of):

1) the list p1, . . . , pk of distinct prime factors of n − 1,
2) a (recursive) proof of primality for each of p1, . . . , pk,13
3) a generator g of the group Z∗n.

The exact representation of these three parts of the proof would have to be made precise, but we omit this here as it is obvious how this could be done.
The verification of a proof (i.e., the computation of the function φ) works as follows.

• If n = 2 or n = 3, then the verification stops and returns 1.14
• It is tested whether p1, . . . , pk all divide n − 1 and whether n − 1 can be written as a product of powers of p1, . . . , pk (i.e., whether n − 1 contains no other prime factor).

11 The best known algorithm has running time exponential in n. The problem is actually NP-complete, a concept that will be discussed in a later course on theoretical Computer Science.
12 Note that τ defines the meaning of the strings in S, namely that they are meant to encode graphs and that we are interested in whether a given graph has a Hamiltonian cycle.
13 Recursive means that the same principle is applied to prove the primality of every pi, and again for every prime factor of pi − 1, etc.
14 One could also consider a longer list of small primes for which no recursive primality proof is required.
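The verification function φ of Example 6.3 is simple enough to be written out in code. The following Python sketch (an illustration only, not part of the notes; the function names and the treatment of the empty string are our own choices) implements φ for the compositeness proof system.

def n_of(s):
    """n(s): the natural number whose binary representation is s (0 for the empty string)."""
    return int(s, 2) if s else 0

def phi(s, p):
    """phi(s, p) = 1 iff p is a valid proof that n(s) is not prime."""
    n, d = n_of(s), n_of(p)
    if n == 0 or n == 1:
        return 1                     # 0 and 1 are not prime; any p is accepted
    return 1 if 1 < d < n and n % d == 0 else 0

# n(110010) = 50 is composite; a proof p encoding the divisor 5 is accepted.
print(phi("110010", "101"))          # 1
# n(1101) = 13 is prime; no candidate proof is accepted (soundness).
print(any(phi("1101", bin(d)[2:]) for d in range(2, 13)))   # False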
6.3 Elementary General Concepts in Logic

6.3.1 The General Goal of Logic

A goal of logic is to provide a specific proof system Π = (S, P, τ, φ) for which a very large class of mathematical statements can be expressed as an element of S.
However, such a proof system Π = (S, P, τ, φ) can never capture all possible mathematical statements. For example, it usually does not allow to capture (self-referential) statements about Π, such as “Π is complete”, as an element of S. The use of common language is therefore unavoidable in mathematics and logic (see also Section 6.7).
In logic, an element s ∈ S consists of one or more formulas (e.g. a formula, or a set of formulas, or a set of formulas and a formula), and a proof consists of applying a certain sequence of syntactic steps, called a derivation or a deduction. Each step consists of applying one of a set of allowed syntactic rules, and the set of allowed rules is called a calculus. A rule generally has some place-holders that must be instantiated by concrete values.
In standard treatments of logic, the syntax of S and the semantics (the function τ) are carefully defined. In contrast, the function φ, which consists of verifying the correctness of each rule application step, is not completely explicitly defined. One only defines rules, but for example one generally does not define a syntax for expressing how the place-holders of the rules are instantiated.17

6.3.2 Syntax, Semantics, Interpretation, Model

A logic is defined by the syntax and the semantics. The basic concept in any logic is that of a formula18.

Definition 6.4. The syntax of a logic defines an alphabet Λ (of allowed symbols) and specifies which strings in Λ∗ are formulas (i.e., are syntactically correct).

Some of the symbols in Λ (e.g. the symbols A and B in propositional logic or the symbols P and Q in predicate logic) are understood as variables, each of which can take on a value in a certain domain associated to the symbol.

Definition 6.5. The semantics of a logic defines (among other things, see below) a function free which assigns to each formula F = (f1, f2, . . . , fk) ∈ Λ∗ a subset free(F) ⊆ {1, . . . , k} of the indices. If i ∈ free(F), then the symbol fi is said to occur free in F.20

The same symbol β ∈ Λ can occur free in one place of F (say f3 = β where 3 ∈ free(F)) and not free in another place (say f5 = β where 5 ∉ free(F)).
The free symbols of a formula denote a kind of variable; they need to be assigned fixed values in their respective associated domains before the formula has a truth value. This assignment of values is called an interpretation:

Definition 6.6. An interpretation consists of a set Z ⊆ Λ of symbols of Λ, a domain (a set of possible values) for each symbol in Z, and a function that assigns to each symbol in Z a value in its associated domain.21

Often (but not in propositional logic), the domains are defined in terms of a so-called universe U, and the domain for a symbol in Λ can for example be U, or a function U^k → U (for some k), or a function U^k → {0, 1} (for some k).

Definition 6.7. An interpretation is suitable22 for a formula F if it assigns a value to all symbols β ∈ Λ occurring free in F.23

Definition 6.8. The semantics of a logic also defines a function24 σ assigning to each formula F, and each interpretation A suitable for F, a truth value σ(F, A) in {0, 1}.25 In treatments of logic one often writes A(F) instead of σ(F, A) and calls A(F) the truth value of F under interpretation A.26

20 … logic, but as we see later it coincides for the notion of free variables in predicate logic.
21 There may be restrictions for what is an allowed interpretation.
22 German: passend
23 A suitable interpretation can also assign values to symbols β ∈ Λ not occurring free in F.
24 We assume that the set of formulas and the set of interpretations are well-defined.
25 Note that different free occurrences of a symbol β ∈ Λ in F are assigned the same value, namely the value assigned to β by the interpretation.
Definition 6.9. A (suitable) interpretation A for which a formula F is true (i.e., A(F) = 1) is called a model for F, and one also writes

A |= F.

More generally, for a set M of formulas, a (suitable) interpretation for which all formulas in M are true is called a model for M, denoted as

A |= M.

If A is not a model for M one writes A ⊭ M.

6.3.3 Connection to Proof Systems *

We now explain how the semantics of a logic (the function σ in Definition 6.8) is connected to the semantics of a proof system (the function τ in Definition 6.1).
First we should remark that one can treat logic in a similarly informal manner as one treats other fields of mathematics. There can be variations on how much is formalized in the sense of proof systems. Concretely, there are the following two options for formalizing a logic:

• In addition to formulas, also interpretations are considered to be formal objects, i.e., there is a syntax for writing (at least certain types of) interpretations. In this case, statements can correspond to pairs (F, A), and the function σ corresponds to the function τ (in the sense of proof systems).
• Only formulas are formal objects and interpretations are treated informally, e.g. in words or some other informal notation. This is the typical approach in treatments of logic (also in this course). This makes perfect sense if the formal statements one wants to prove only refer to formulas, and not to specific interpretations. Indeed, many statements about formulas are of this form, for example the statement that a formula F is a tautology, the statement that F is satisfiable (or unsatisfiable), or the statement that a formula G is a logical consequence of a formula F, i.e., F |= G. Note that to prove such statements it is not necessary to formalize interpretations.

Definition 6.11. A formula F is called a tautology30 or valid31 if it is true for every suitable interpretation. The symbol ⊤ is used for a tautology.

The symbol ⊥ is sometimes called falsum, and ⊤ is sometimes called verum.

Definition 6.12. A formula G is a logical consequence32 of a formula F (or a set M of formulas), denoted

F |= G   (or M |= G),

if every interpretation suitable for both F (or M) and G, which is a model for F (for M), is also a model for G.33

Definition 6.13. Two formulas F and G are equivalent, denoted F ≡ G, if every interpretation suitable for both F and G yields the same truth value for F and G, i.e., if each one is a logical consequence of the other:

F ≡ G   ⇐⇒   F |= G and G |= F.

A set M of formulas can be interpreted as the conjunction (AND) of all formulas in M since an interpretation is a model for M if and only if it is a model for all formulas in M.34 If M is the empty set, then, by definition, every interpretation is a model for M, i.e., the empty set of formulas corresponds to a tautology.

Definition 6.14. If F is a tautology, one also writes |= F.

6.3.5 The Logical Operators ∧, ∨, and ¬

Essentially all logics contain the following recursive definitions as part of the syntax definition.

… in M is satisfiable.
29 The symbol ⊥ is not a formula itself, i.e., it is not part of the syntax of a logic, but if used in expressions like F ≡ ⊥ it is to be understood as standing for an arbitrary unsatisfiable formula. For example, F ≡ ⊥ means that F is unsatisfiable.
33 … also with an interpretation on the left side. This makes sense because one can consider a set M of formulas as defining a set of interpretations, namely the set of models for M.
34 More formally, let G be any formula (one of the many equivalent ones) that corresponds to the conjunction of all formulas in M. Then M |= F if and only if G |= F.
We introduce some notational conventions for the use of parentheses. The outermost parentheses of a formula can be dropped, i.e., we can write F ∧ G instead of (F ∧ G). Moreover, parentheses not needed because of the associativity of ∧ or ∨ (which is actually a consequence of the semantics defined below) can also be dropped.
The implication introduced in Section 2.3 can be understood simply as a notational convention: F → G stands for ¬F ∨ G.35 Similarly, the symbol F ↔ G stands for (F ∧ G) ∨ (¬F ∧ ¬G).
The semantics of the logical operators ∧, ∨, and ¬ is defined as follows (in any logic where these operators exist):

Definition 6.16. A((F ∧ G)) = 1 if and only if A(F) = 1 and A(G) = 1.
A((F ∨ G)) = 1 if and only if A(F) = 1 or A(G) = 1.
A(¬F) = 1 if and only if A(F) = 0.

either A(F) = 0 or A(G) = 0, i.e., if and only if either A(¬F) = 1 or A(¬G) = 1, and hence if and only if A(¬F ∨ ¬G) = 1.

6.3.6 Logical Consequence vs. Unsatisfiability

We state the following facts without proofs, which are rather obvious. These lemmas are needed for example to make use of the resolution calculus (see Section 6.5.6), which allows one to prove the unsatisfiability of a set of formulas, to also be able to prove that a formula F is a tautology, or to prove that a formula G is a logical consequence of a given set {F1, F2, . . . , Fk} of formulas.
Consider two theories T and T′, where T′ contains all the axioms of T plus one or more additional axioms. Then every theorem in T is also a theorem in T′ (but not vice versa). In the special case where T = ∅, a theorem in T = ∅ is a tautology in the logic. Tautologies are useful because they are theorems in any theory, i.e., for any set of axioms.

Example 6.5. The formula ¬∃x ∀y (P(y, x) ↔ ¬P(y, y)) is a tautology in predicate logic, as proved in Section 6.6.9.

6.4 Logical Calculi

As mentioned in Section 6.3.1, the goal of logic is to provide a framework for expressing and verifying proofs of mathematical statements. A proof of a theorem should be a purely syntactic derivation consisting of simple and easily verifiable steps. In each step, a new syntactic object (typically a formula, but it can also be a more general object involving formulas) is derived by application of a derivation rule or inference rule, and at the end of the derivation, the desired theorem appears. The syntactic verification of a proof does not require any intelligence or “reasoning between the lines”, and it can in particular be performed by a computer.
Checking a proof hence simply means to execute a program. Like in computer programming, where the execution of a program is a dumb process while the design of a program is generally an intelligent, sometimes ingenious process, the verification of a proof should be a dumb process while devising a proof is an intelligent, creative, and sometimes ingenious process.
A well-defined set of rules for manipulating formulas (the syntactic objects) is called a calculus. Many such calculi have been proposed, and they differ in various ways, for example in the syntax, the semantics, the expressiveness, how easy or involved it is to write a proof, and how long a proof will be.
When defining a calculus, there is a trade-off between simplicity (e.g. a small number of rules) and versatility. For a small set of rules, proving even simple logical steps (like the substitution of a sub-formula by an equivalent sub-formula) can take a very large number of steps in the calculus.
It is beyond the scope of this course to provide an extensive treatment of various logical calculi.

6.4.2 Hilbert-Style Calculi

As mentioned, there are different types of logical calculi. For the perhaps most intuitive type of calculus, the syntactic objects that are manipulated are formulas. This is sometimes called a Hilbert-style calculus. There is also another type of calculi, often called sequent calculi (which we will not discuss in this course), where the syntactic objects are more complex objects than just formulas. The following refers to Hilbert-style calculi.

Definition 6.17. A derivation rule or inference rule36 is a rule for deriving a formula from a set of formulas (called the precondition or premises). We write

{F1, . . . , Fk} ⊢R G

if G can be derived from the set {F1, . . . , Fk} by rule R.37 Such a rule is often also written in the form

F1   F2   · · ·   Fk
--------------------  (R),
         G

where spaces separate the formulas above the bar.

Derivation is a purely syntactic concept. Derivation rules apply to syntactically correct (sets of) formulas. Some derivation rules (e.g. resolution, see Section 6.5.6) require the formulas to be in a specific format.
Typically such a derivation rule is defined as a rule that involves place-holders for formulas (such as F and G), which can be instantiated with any concrete formulas. In order to apply such a rule one must instantiate each place-holder with a concrete formula.

Definition 6.18. (Informal.) The application of a derivation rule R to a set M of formulas means

1. Select a subset N of M.
2. For the place-holders in R: specify formulas that appear in N such that N ⊢R G for a formula G.
3. Add G to the set M (i.e., replace M by M ∪ {G}).

Example 6.6. Two derivation rules for propositional and predicate logic are

{F ∧ G} ⊢ F   and   {F, G} ⊢ F ∧ G.

The left rule states that if one has already derived a formula of the form F ∧ G, where F and G are arbitrary formulas, then one can derive F.

36 German: Schlussregel
37 Formally, a derivation rule is a relation from the power set of the set of formulas to the set of formulas.
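The informal Definition 6.18 can be illustrated in code. The following Python sketch uses our own encoding (formulas are nested tuples such as ("and", "A", "B"), atomic formulas are strings) and implements the two rules of Example 6.6; applying a rule means adding the derived formula to the set M.

def rule_and_elim_left(premises):
    """{F AND G} |- F  (the left rule of Example 6.6)."""
    (f,) = premises
    assert f[0] == "and"
    return f[1]

def rule_and_intro(premises):
    """{F, G} |- F AND G  (the right rule of Example 6.6)."""
    f, g = premises
    return ("and", f, g)

def apply_rule(M, rule, premises):
    """One application of a derivation rule (Definition 6.18): M := M ∪ {G}."""
    assert set(premises) <= M, "the premises must already be in M"
    return M | {rule(premises)}

M = {("or", "A", "B"), ("or", "C", "D")}
M = apply_rule(M, rule_and_intro, (("or", "A", "B"), ("or", "C", "D")))
print(("and", ("or", "A", "B"), ("or", "C", "D")) in M)   # True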
The second rule states that for any two formulas F and G that have been derived, one can also derive the formula F ∧ G. For example, an application of the right rule yields

{A ∨ B, C ∨ D} ⊢ (A ∨ B) ∧ (C ∨ D),

where F is instantiated as A ∨ B and G is instantiated as C ∨ D. More rules are discussed in Section 6.5.5.

6.4.5 Connection to Proof Systems *

Let us briefly explain the connection between logical calculi and the general concept of proof systems (Definition 6.2).
In a proof system allowing one to prove that a formula F is a tautology, one can let the set S of statements be the set of formulas. One further needs a precise syntax for expressing derivations. Such a syntax would, for example, have to include a way to express how place-holders in rules are instantiated. This aspect of a calculus is usually not made precise and therefore a logical calculus (alone) does not completely constitute a proof system in the sense of Definition 6.2. However, in a computerized system this needs to be made precise, in a language specific to that system, and such a computerized system is hence a proof system in the strict sense of Section 6.2.

6.5 Propositional Logic

We also refer to Section 2.3 where some basics of propositional logic were introduced informally and many examples were already given. This section concentrates on the formal aspects and the connection to Section 6.3.

6.5.1 Syntax

Definition 6.23. (Syntax.) An atomic formula is a symbol of the form Ai with i ∈ N.43 A formula is defined as follows, where the second point is a restatement (for convenience) of Definition 6.15:

• An atomic formula is a formula.
• If F and G are formulas, then also ¬F, (F ∧ G), and (F ∨ G) are formulas.

A formula built according to this inductive definition corresponds naturally to a tree where the leaves correspond to atomic formulas and the inner nodes correspond to the logical operators.

6.5.2 Semantics

Recall Definitions 6.5 and 6.6. In propositional logic, the free symbols of a formula are all the atomic formulas. For example, the truth value of the formula A ∧ B is determined only after we specify the truth values of A and B. In propositional logic, an interpretation is called a truth assignment (see below).

Definition 6.24. (Semantics.) For a set Z of atomic formulas, an interpretation A, called truth assignment44, is a function A : Z → {0, 1}. A truth assignment A is suitable for a formula F if Z contains all atomic formulas appearing in F (see Definition 6.7). The semantics (i.e., the truth value A(F) of a formula F under interpretation A) is defined by A(F) = A(Ai) for any atomic formula F = Ai, and by Definition 6.16 (restated here for convenience):

A((F ∧ G)) = 1 if and only if A(F) = 1 and A(G) = 1.
A((F ∨ G)) = 1 if and only if A(F) = 1 or A(G) = 1.
A(¬F) = 1 if and only if A(F) = 0.

Example 6.8. Consider the formula F already discussed in Section 2.3. The truth assignment A : M → {0, 1} for M = {A, B} that assigns A(A) = 0 and A(B) = 1 is not suitable for F because no truth value is assigned to C, and the truth assignment A : M → {0, 1} for M = {A, B, C, D} that assigns A(A) = 0, A(B) = 1, A(C) = 0, and A(D) = 1 is suitable and also a model for F. F is satisfiable but not a tautology.

6.5.3 Brief Discussion of General Logic Concepts

We briefly discuss the basic concepts from Section 6.3.4 in the context of propositional logic.
Specializing Definition 6.13 to the case of propositional logic, we confirm Definition 2.6: Two formulas F and G are equivalent if, when both formulas are considered as functions M → {0, 1}, where M is the union of the atomic formulas of F and G, then the two functions are identical (i.e., have the same function table).
Specializing Definition 6.12 to the case of propositional logic, we see that G is a logical consequence of F, i.e., F |= G, if the function table of G contains a 1 for at least all arguments for which the function table of F contains a 1.45

Example 6.9. F = (A ∧ ¬B) ∨ (B ∧ ¬C) is a logical consequence of A and ¬C, i.e., {A, ¬C} |= F. In contrast, F is not a logical consequence of A and B, i.e., {A, B} ⊭ F.

The basic equivalences of Lemma 6.1 apply in particular to propositional logic.

43 A0 is usually not used. This definition guarantees an unbounded supply of atomic formulas, but as a notational convention we can also write A, B, C, . . . instead of A1, A2, A3, . . ..
44 German: (Wahrheits-)Belegung
45 If the truth values 0 and 1 were interpreted as numbers, then F |= G means that G is greater or equal to F for all arguments. This also explains why F |= G and G |= H together imply F |= H.
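Definition 6.24 and the notion of logical consequence can be made concrete with a small evaluator. The following Python sketch (our own encoding: atomic formulas are strings, compound formulas are nested tuples) computes A(F) recursively and checks M |= G by brute force over all truth assignments, reproducing Example 6.9.

from itertools import product

def atoms(F):
    """The set of atomic formulas occurring in F."""
    if isinstance(F, str):
        return {F}
    return set().union(*(atoms(G) for G in F[1:]))

def evaluate(F, A):
    """Truth value A(F) under the truth assignment A (a dict atom -> 0/1)."""
    if isinstance(F, str):
        return A[F]
    op = F[0]
    if op == "not":
        return 1 - evaluate(F[1], A)
    if op == "and":
        return min(evaluate(F[1], A), evaluate(F[2], A))
    if op == "or":
        return max(evaluate(F[1], A), evaluate(F[2], A))
    raise ValueError("unknown operator " + op)

def models(M, A):
    """A is a model for the set M of formulas."""
    return all(evaluate(G, A) == 1 for G in M)

def is_consequence(M, F):
    """M |= F, checked over all truth assignments (brute force)."""
    V = sorted(set().union(*(atoms(G) for G in M | {F})))
    return all(evaluate(F, dict(zip(V, bits))) == 1
               for bits in product((0, 1), repeat=len(V))
               if models(M, dict(zip(V, bits))))

# Example 6.9: F = (A AND NOT B) OR (B AND NOT C)
F = ("or", ("and", "A", ("not", "B")), ("and", "B", ("not", "C")))
print(evaluate(F, {"A": 1, "B": 0, "C": 0}))    # 1
print(is_consequence({"A", ("not", "C")}, F))   # True:  {A, ¬C} |= F
print(is_consequence({"A", "B"}, F))            # False: {A, B} ⊭ F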
6.5.4 Normal Forms

Definition 6.25. A literal is an atomic formula or the negation of an atomic formula.

Definition 6.26. A formula F is in conjunctive normal form (CNF) if it is a conjunction of disjunctions of literals, i.e., if it is of the form

F = (L11 ∨ · · · ∨ L1m1) ∧ · · · ∧ (Ln1 ∨ · · · ∨ Lnmn)

for some literals Lij.

Example 6.10. The formula (A ∨ ¬B) ∧ (¬A ∨ B ∨ ¬D) ∧ ¬C is in CNF.

Definition 6.27. A formula F is in disjunctive normal form (DNF) if it is a disjunction of conjunctions of literals, i.e., if it is of the form

F = (L11 ∧ · · · ∧ L1m1) ∨ · · · ∨ (Ln1 ∧ · · · ∧ Lnmn)

for some literals Lij.

Example 6.11. The formula (B ∧ C) ∨ (¬A ∧ B ∧ ¬C) is in DNF.

disjunctions. F is false if and only if all the disjunctions are false, i.e., the truth table of this formula in CNF is identical to that of F.

Example 6.12. Consider the formula F = (A ∧ ¬B) ∨ (B ∧ ¬C) from above. The function table is

A B C | (A ∧ ¬B) ∨ (B ∧ ¬C)
0 0 0 | 0
0 0 1 | 0
0 1 0 | 1
0 1 1 | 0
1 0 0 | 1
1 0 1 | 1
1 1 0 | 1
1 1 1 | 0

We therefore obtain the following DNF

F ≡ (¬A ∧ B ∧ ¬C) ∨ (A ∧ ¬B ∧ ¬C) ∨ (A ∧ ¬B ∧ C) ∨ (A ∧ B ∧ ¬C)

as the disjunction of 4 conjunctions. And we obtain the following CNF

F ≡ (A ∨ B ∨ C) ∧ (A ∨ B ∨ ¬C) ∧ (A ∨ ¬B ∨ ¬C) ∧ (¬A ∨ ¬B ∨ ¬C)

as the conjunction of 4 disjunctions.
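The passage from the function table to a DNF and a CNF, as in Example 6.12, is mechanical. The following Python sketch (the formula of Example 6.12 is hard-coded as a Python function; the string output format is our own) reads the DNF off the rows with value 1 and the CNF off the rows with value 0.

from itertools import product

def f(a, b, c):                      # (A AND NOT B) OR (B AND NOT C)
    return (a and not b) or (b and not c)

names = ("A", "B", "C")
dnf_terms, cnf_clauses = [], []
for row in product((0, 1), repeat=3):
    value = int(f(*row))
    lits_true = [n if v else "¬" + n for n, v in zip(names, row)]
    lits_false = ["¬" + n if v else n for n, v in zip(names, row)]
    if value == 1:                   # a 1-row contributes a conjunction to the DNF
        dnf_terms.append("(" + " ∧ ".join(lits_true) + ")")
    else:                            # a 0-row contributes a disjunction to the CNF
        cnf_clauses.append("(" + " ∨ ".join(lits_false) + ")")

print(" ∨ ".join(dnf_terms))         # the DNF with 4 conjunctions of Example 6.12
print(" ∧ ".join(cnf_clauses))       # the CNF with 4 disjunctions of Example 6.12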
6.5.5 Some Derivation Rules

In this section we discuss a few derivation rules for propositional logic and any logic which contains propositional logic. We do not provide a complete and compactly defined calculus, just a few rules. For singleton sets of formulas we omit the brackets “{” and “}”.
All equivalences, including the basic equivalences of Lemma 6.1, can be used as derivation rules. For example, the following derivation rules are correct:

¬¬F ⊢ F      F ∧ G ⊢ G ∧ F      ¬(F ∨ G) ⊢ ¬F ∧ ¬G

Other natural and correct rules, which capture logical consequences, not equivalences, are:

F ∧ G ⊢ F      F ∧ G ⊢ G      {F, G} ⊢ F ∧ G
F ⊢ F ∨ G      F ⊢ G ∨ F
{F, F → G} ⊢ G      {F ∨ G, F → H, G → H} ⊢ H.

Such rules are not necessarily independent. For example, the rule F ∧ G ⊢ G ∧ F could be derived from the above three rules as follows: F can be derived from F ∧ G and G can also be derived from F ∧ G, resulting in the set {F ∧ G, F, G}. {G, F} is a subset of {F ∧ G, F, G} and hence one of the above rules yields {G, F} ⊢ G ∧ F.
The last rule discussed above captures case distinction (two cases). It states that if one knows that F or G is true and that each of them implies H, then we can conclude H. Such a proof step is in a sense non-constructive because it may not be known which of F or G is true.
To begin a derivation from the empty set of formulas, one can use any rule of the form ⊢ F, where F is a tautology. The best-known such rule is

⊢ F ∨ ¬F,

called “tertium non datur (TND)” (in English: “there is no third [alternative]”), which captures the fact that a formula F can only be true or false (in which case ¬F is true); there is no option in between.46 Another rule for deriving a tautology is

⊢ ¬(F ↔ ¬F).

Example 6.14. The following rule can be understood as capturing the principle of proof by contradiction. (Why?)

{F ∨ G, ¬G} ⊢ F.

The reader can prove the correctness as an exercise.

Which set of rules constitutes an adequate calculus is generally not clear, but some calculi have received special attention. One could argue both for a small set of rules (which are considered the fundamental ones from which everything else is derived) or for a large library of rules (so there is a large degree of freedom in finding a short derivation).

6.5.6 The Resolution Calculus for Propositional Logic

Resolution is an important logical calculus that is used in certain computer algorithms for automated reasoning. The calculus is very simple in that it consists of a single derivation rule. The purpose of a derivation is to prove that a given set M of formulas (or, equivalently, their conjunction) is unsatisfiable.
As mentioned earlier (see Lemma 6.2), this also allows one to prove that a formula F is a tautology, which is the case if and only if ¬F is unsatisfiable. It also allows one to prove that a formula F is a logical consequence of a set M of formulas (i.e., M |= F), as this is the case if and only if the set M ∪ {¬F} is unsatisfiable (see Lemma 6.3).
The resolution calculus assumes that all formulas of M are given in conjunctive normal form (CNF, see Definition 6.26). This is usually not the case, and therefore the formulas of M must first be transformed into equivalent formulas in CNF, as explained earlier. Moreover, instead of working with CNF-formulas (as the syntactic objects), one works with an equivalent object, namely sets of clauses.
Recall (Definition 6.25) that a literal is an atomic formula or the negation of an atomic formula. For example A and ¬B are literals.

Definition 6.28. A clause is a set of literals.

Example 6.15. {A, ¬B, ¬D} and {B, C, ¬C, ¬D, E} are clauses, and the empty set ∅ is also a clause.

Definition 6.29. The set of clauses associated to a formula

F = (L11 ∨ · · · ∨ L1m1) ∧ · · · ∧ (Ln1 ∨ · · · ∨ Lnmn)

in CNF, denoted as K(F), is the set

K(F) = { {L11, . . . , L1m1}, . . . , {Ln1, . . . , Lnmn} }.

The set of clauses associated with a set M = {F1, . . . , Fk} of formulas is the union of their clause sets:

K(M) = K(F1) ∪ · · · ∪ K(Fk).

46 However, in so-called constructive or intuitionistic logic, this rule is not considered correct because its application does not require explicit knowledge of whether F or ¬F is true.
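Definition 6.29 translates directly into code. In the following Python sketch (our own encoding: a CNF formula is a list of lists of literal strings, and a clause is a frozenset), K(F) and K(M) are computed as sets of clauses.

def clauses_of_cnf(cnf):
    """K(F) for a CNF formula F given as a list of disjunctions of literals."""
    return {frozenset(disjunction) for disjunction in cnf}

def clauses_of_set(formulas):
    """K(M): the union of the clause sets of the formulas in M."""
    return set().union(*(clauses_of_cnf(F) for F in formulas))

# F = (A ∨ ¬B) ∧ (¬A ∨ B ∨ ¬D) ∧ ¬C  (the CNF of Example 6.10)
F = [["A", "¬B"], ["¬A", "B", "¬D"], ["¬C"]]
print(clauses_of_set([F]))
# {frozenset({'A', '¬B'}), frozenset({'¬A', 'B', '¬D'}), frozenset({'¬C'})}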
The idea behind this definition is that a clause is satisfied by a truth assignment if and only if it contains some literal that evaluates to true. In other words, a clause stands for the disjunction (OR) of its literals. Likewise, a set K(M) of clauses is satisfied by a truth assignment if every clause in K(M) is satisfied by it. In other words, a set of clauses stands for the conjunction (AND) of the clauses. The set M = {F1, . . . , Fk} is satisfied if and only if F1 ∧ · · · ∧ Fk is satisfied, i.e., if and only if all clauses in K(M) are satisfied. Note that the empty clause corresponds to an unsatisfiable formula and the empty set of clauses corresponds to a tautology.
Note that for a given formula (not necessarily in CNF) there are many equivalent formulas in CNF and hence many equivalent sets of clauses. Conversely, to a given set 𝒦 of clauses one can associate many formulas which are, however, all equivalent. Therefore, one can naturally think of a set of clauses as a (canonical) formula, and the notions of satisfiability, equivalence, and logical consequence carry over immediately from formulas to clause sets.

Definition 6.30. A clause K is a resolvent of clauses K1 and K2 if there is a literal L such that L ∈ K1, ¬L ∈ K2, and47

K = (K1 \ {L}) ∪ (K2 \ {¬L}).     (6.1)

Example 6.16. The clauses {A, ¬B, ¬C} and {¬A, C, D, ¬E} have two resolvents: If A is eliminated, we obtain the clause {¬B, ¬C, C, D, ¬E}, and if C is eliminated, we obtain the clause {A, ¬B, ¬A, D, ¬E}. Note that clauses are sets and we can write the elements in arbitrary order. In particular, we could write the latter clause as {A, ¬A, ¬B, D, ¬E}.

It is important to point out that resolution steps must be carried out one by one; one cannot perform two steps at once. For instance, in the above example, {¬B, D, ¬E} is not a resolvent and can also not be obtained by two resolution steps, even though {¬B, D, ¬E} would result from {A, ¬B, ¬C} and {¬A, C, D, ¬E} by eliminating A and ¬C from the first clause and ¬A and C from the second clause.48
Given a set 𝒦 of clauses, a resolution step takes two clauses K1 ∈ 𝒦 and K2 ∈ 𝒦, computes a resolvent K, and adds K to 𝒦. To be consistent with Section 6.4.2, one can write the resolution rule (6.1) as follows:49

{K1, K2} ⊢res K,

where equation (6.1) must be satisfied. The resolution calculus, denoted Res, consists of a single rule:

Res = {res}.

Recall that we write 𝒦 ⊢Res K if K can be derived from 𝒦 using a finite number of resolution steps.50

Lemma 6.6. The resolution calculus is sound, i.e., if 𝒦 ⊢Res K then 𝒦 |= K.51

Proof. We only need to show that the resolution rule is correct, i.e., that if K is a resolvent of clauses K1, K2 ∈ 𝒦, then K is a logical consequence of {K1, K2}, i.e.,

{K1, K2} ⊢res K  =⇒  {K1, K2} |= K.

Let A be an arbitrary truth assignment suitable for {K1, K2} (and hence also for K). Recall that A is a model for {K1, K2} if and only if A makes at least one literal in K1 true and also makes at least one literal in K2 true.
We refer to Definition 6.30 and distinguish two cases. If A(L) = 1, then A makes at least one literal in K2 \ {¬L} true (since ¬L is false). Similarly, if A(L) = 0, then A makes at least one literal in K1 \ {L} true (since L is false). Because one of the two cases occurs, A makes at least one literal in K = (K1 \ {L}) ∪ (K2 \ {¬L}) true, which means that A is a model for K.

The goal of a derivation in the resolution calculus is to derive the empty clause ∅ by an appropriate sequence of resolution steps. The following theorem states that the resolution calculus is complete with respect to the task of proving unsatisfiability.

Theorem 6.7. A set M of formulas is unsatisfiable if and only if K(M) ⊢Res ∅.

Proof. The “if” part (soundness) follows from Lemma 6.6: If K(M) ⊢Res ∅, then K(M) |= ∅, i.e., every model for K(M) is a model for ∅. Since ∅ has no model, K(M) also does not have a model. This means that K(M) is unsatisfiable.
It remains to prove the “only if” part (completeness with respect to unsatisfiability). We need to show that if a clause set 𝒦 is unsatisfiable, then ∅ can be derived by some sequence of resolution steps. The proof is by induction over the number n of atomic formulas appearing in 𝒦. The induction basis (for n = 1) is as follows. A clause set 𝒦 involving only the literals A1 and ¬A1 is unsatisfiable if and only if it contains the clauses {A1} and {¬A1}. One can derive ∅ exactly if this is the case.
For the induction step, suppose that for every clause set 𝒦′ with n atomic formulas, 𝒦′ is unsatisfiable if and only if 𝒦′ ⊢Res ∅.

47 For a literal L, ¬L is the negation of L; for example if L = ¬A, then ¬L = A.
48 A simpler example illustrating this is that {{A, B}, {¬A, ¬B}} is satisfiable, but a “double” resolution step would falsely yield ∅, indicating that {{A, B}, {¬A, ¬B}} is unsatisfiable.
49 In the literature, one usually does not use the symbol ⊢ in the context of resolution.
50 In the lecture we introduce a natural graphical notation for writing a sequence of resolution steps.
51 For convenience, the clause K is understood to mean the singleton clause set {K}. In other words, the truth value of a clause K is understood to be the same as the truth value of {K}.
Given an arbitrary clause set 𝒦 for the atomic formulas A1, . . . , An+1, define the two clause sets 𝒦0 and 𝒦1 as follows. 𝒦0 is the clause set for the atomic formulas A1, . . . , An obtained from 𝒦 by setting An+1 = 0, i.e.,

• by eliminating all clauses from 𝒦 containing ¬An+1 (which are satisfied since ¬An+1 = 1), and
• by eliminating from each remaining clause the literal An+1 if it appears in it (since having An+1 in it cannot contribute to the clause being satisfied).

𝒦 is satisfiable under the constraint An+1 = 0 if and only if 𝒦0 is satisfiable.
Analogously, 𝒦1 is obtained from 𝒦 by eliminating all clauses containing An+1 and by eliminating from each remaining clause the literal ¬An+1 if it appears in it. 𝒦 is satisfiable under the constraint An+1 = 1 if and only if 𝒦1 is satisfiable.
If 𝒦 is unsatisfiable, it is unsatisfiable both for An+1 = 0 and for An+1 = 1, i.e., both 𝒦0 and 𝒦1 are unsatisfiable. Therefore, by the induction hypothesis, we have 𝒦0 ⊢Res ∅ and 𝒦1 ⊢Res ∅. Now imagine that the same resolution steps leading from 𝒦0 to ∅ are carried out on 𝒦, i.e., with An+1. This derivation may or may not involve clauses (of 𝒦) that contain An+1. In the latter case (i.e., An+1 not contained), the derivation of ∅ from 𝒦0 is also a derivation of ∅ from 𝒦, and in the other case it corresponds to a derivation of {An+1} from 𝒦.
Analogously, the derivation of ∅ from 𝒦1 corresponds to a derivation of ∅ from 𝒦 or to a derivation of {¬An+1} from 𝒦.
If in any of the two cases we have a derivation of ∅ from 𝒦, we are done (since ∅ can be derived from 𝒦, i.e., 𝒦 ⊢Res ∅). If this is not the case, then we have a derivation of {An+1} from 𝒦, i.e., 𝒦 ⊢Res {An+1}, as well as a derivation of {¬An+1} from 𝒦, i.e., 𝒦 ⊢Res {¬An+1}. From these two clauses one can derive ∅ by a final resolution step. This completes the proof.

6.6 Predicate Logic (First-order Logic)

We also refer to Section 2.4 where some basics of predicate logic were introduced informally. Predicate logic is an extension of propositional logic, i.e., propositional logic is embedded in predicate logic as a special case.

6.6.1 Syntax

Definition 6.31. (Syntax of predicate logic.)

• A variable symbol is of the form xi with i ∈ N.52
• A function symbol is of the form fi^(k) with i, k ∈ N, where k denotes the number of arguments of the function. Function symbols for k = 0 are called constants.
• A predicate symbol is of the form Pi^(k) with i, k ∈ N, where k denotes the number of arguments of the predicate.
• A term is defined inductively: A variable is a term, and if t1, . . . , tk are terms, then fi^(k)(t1, . . . , tk) is a term. For k = 0 one writes no parentheses.
• A formula is defined inductively:
  – For any i and k, if t1, . . . , tk are terms, then Pi^(k)(t1, . . . , tk) is a formula, called an atomic formula.
  – If F and G are formulas, then ¬F, (F ∧ G), and (F ∨ G) are formulas.
  – If F is a formula, then, for any i, ∀xi F and ∃xi F are formulas.

∀ is called the universal quantifier, and ∃ is called the existential quantifier.

A formula constructed according to this inductive definition corresponds naturally to a tree where the leaves correspond to terms and the inner nodes correspond to the logical operators and the quantifiers.
To simplify notation, one usually uses function symbols f, g, h, where the number of arguments is implicit, and for constants one uses the symbols a, b, c. Similarly, one uses predicate symbols P, Q, R, where the number of arguments is implicit. Moreover, one uses variable names x, y, z instead of xi, and sometimes also u, v, w or k, m, n. To avoid confusion one can also use (∀x F) and (∃x F) instead of ∀x F and ∃x F.
Note that the same variable can occur bound and free in a formula. One can draw the construction tree (see lecture) of a formula showing how a formula is constructed according to the rules of Definition 6.31. Within the subtree corresponding to ∀x or ∃x, all occurrences of x are bound.

Example 6.17. In the formula

F = (Q(x) ∨ ∀y P(f(x, y))) ∧ ∃x R(x, y),

the first two occurrences of x are free, the other occurrences are bound. The last occurrence of y is free, the other occurrences are bound.

Definition 6.33. For a formula F, a variable x and a term t, F[x/t] denotes the formula obtained from F by substituting every free occurrence of x by t.

Example 6.18. For the formula F of Example 6.17 we have

F[x/g(a, z)] = (Q(g(a, z)) ∨ ∀y P(f(g(a, z), y))) ∧ ∃x R(x, y).

6.6.3 Semantics

Recall Definitions 6.5 and 6.6. In predicate logic, the free symbols of a formula are all predicate symbols, all function symbols, and all occurrences of free variables. An interpretation, called structure in the context of predicate logic, must hence define a universe and the meaning of all these free symbols.

Definition 6.34. An interpretation or structure is a tuple A = (U, φ, ψ, ξ) where

• U is a non-empty universe,
• φ is a function assigning to each function symbol (in a certain subset of all function symbols) a function, where for a k-ary function symbol f, φ(f) is a function U^k → U,
• ψ is a function assigning to each predicate symbol (in a certain subset of all predicate symbols) a function, where for a k-ary predicate symbol P, ψ(P) is a function U^k → {0, 1}, and where
• ξ is a function assigning to each variable symbol (in a certain subset of all variable symbols) a value in U.

For notational convenience, for a structure A = (U, φ, ψ, ξ) and a function symbol f one usually writes f^A instead of φ(f). Similarly, one writes P^A instead of ψ(P) and x^A instead of ξ(x). One also writes U^A rather than U to make A explicit.

We instantiate Definition 6.7 for predicate logic:

Definition 6.35. An interpretation (structure) A is suitable for a formula F if it defines all function symbols, predicate symbols, and freely occurring variables of F.

Example 6.19. For the formula

F = ∀x (P(x) ∨ P(f(x, a))),

a suitable structure A is given by U^A = N, by a^A = 3 and f^A(x, y) = x + y, and by letting P^A be the “evenness” predicate (i.e., P^A(x) = 1 if and only if x is even). For obvious reasons, we will say (see below) that the formula evaluates to true for this structure.
Another suitable structure A for F is defined by U^A = R, a^A = 2, f^A(x, y) = xy and by P^A(x) = 1 if and only if x ≥ 0 (i.e., P^A is the “positiveness” predicate). For this structure, F evaluates to false (since, for example, x = −2 makes P(x) and P(f(x, a)) = P(2x) false).

The semantics of a formula is now defined in the natural way as already implicitly discussed in Section 2.4.

Definition 6.36. (Semantics.) For an interpretation (structure) A = (U, φ, ψ, ξ), we define the value (in U) of terms and the truth value of formulas under that structure.

• The value A(t) of a term t is defined recursively as follows:
  – If t is a variable, i.e., t = xi, then A(t) = ξ(xi).
  – If t is of the form f(t1, . . . , tk) for terms t1, . . . , tk and a k-ary function symbol f, then A(t) = φ(f)(A(t1), . . . , A(tk)).
• The truth value of a formula F is defined recursively by Definition 6.16 and
  – If F is of the form F = P(t1, . . . , tk) for terms t1, . . . , tk and a k-ary predicate symbol P, then A(F) = ψ(P)(A(t1), . . . , A(tk)).
  – If F is of the form ∀x G or ∃x G, then let A[x→u] for u ∈ U be the same structure as A except that ξ(x) is overwritten by u (i.e., ξ(x) = u). Then

    A(∀x G) = 1 if A[x→u](G) = 1 for all u ∈ U, and A(∀x G) = 0 otherwise;
    A(∃x G) = 1 if A[x→u](G) = 1 for some u ∈ U, and A(∃x G) = 0 otherwise.

This definition defines the function σ(F, A) of Definition 6.8.

54 German: geschlossen
Note that the definition is recursive not only on formulas (see the second bullet of the definition), but also on structures. Namely, A(∀x G) and A(∃x G) are defined in terms of all structures A[x→u](G) for u ∈ U. To evaluate the truth value of a formula F = ∀x G one needs to apply Definition 6.36 recursively, for the formula G and all structures A[x→u].
The basic concepts discussed in Section 6.3 such as satisfiable, tautology, model, logical consequence, and equivalence are now immediately instantiated for predicate logic.
Note that the syntax of predicate logic does not require nested quantified variables in a formula to be distinct, but we will avoid such overloading of variable names to avoid any confusion. For example, the formula ∀x (P(x) ∨ ∃y Q(y)) is equivalent to ∀x (P(x) ∨ ∃x Q(x)).

6.6.4 Predicate Logic with Equality

Reexamining the syntax of predicate logic, it may surprise that the equality symbol “=” is not allowed. For example, ∃x f(x) = g(x) is not a formula. However, one can extend the syntax and the semantics of predicate logic to include the equality symbol “=” with its usual meaning. This is left as an exercise.

Proof. We only prove statement 7). The other proofs are analogous.
We have to show that every structure A that is a model for (∀x F) ∧ H is also a model for ∀x (F ∧ H), and vice versa.
Recall that the definition of the semantics of a formula ∀x G for a structure A is that, for all u ∈ U, A[x→u](G) = 1.
To prove the first direction, i.e., (∀x F) ∧ H |= ∀x (F ∧ H), suppose that A((∀x F) ∧ H) = 1 and hence55 that (i) A(∀x F) = 1 and that (ii) A(H) = 1. Recall that (i) means that A[x→u](F) = 1 for all u ∈ U, and (ii) means that A[x→u](H) = 1 for all u ∈ U (since x does not occur free in H and hence A[x→u](H) = A(H)). Therefore A[x→u](F ∧ H) = 1 for all u ∈ U, which means that A(∀x (F ∧ H)) = 1, which was to be proved.
To prove the other direction, i.e. ∀x (F ∧ H) |= (∀x F) ∧ H, suppose that A(∀x (F ∧ H)) = 1, i.e., for all u ∈ U, A[x→u](F ∧ H) = 1, which means that (i) A[x→u](F) = 1 for all u ∈ U and (ii) A[x→u](H) = 1 for all u ∈ U. By definition, (i) means that A(∀x F) = 1. Moreover, because x does not occur free in H, by (ii) we have A[x→u](H) = A(H) = 1 for all u, which by definition means A |= H. Hence A |= (∀x F) ∧ H.

The following natural lemma is stated without proof.

Lemma 6.9. If one replaces a sub-formula G of a formula F by an equivalent (to G) formula H, then the resulting formula is equivalent to F.

Therefore ∀x G is true for exactly the same structures for which ∀y G[x/y] is true.

55 according to the semantics of ∧, see Definition 6.36
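The equivalence (∀x F) ∧ H ≡ ∀x (F ∧ H) proved above (for x not occurring free in H) can also be sanity-checked by brute force over a small finite universe. The following Python sketch (our own; F is taken to be P(x) and H is represented by a fixed truth value q) performs this exhaustive check. It is of course not a proof.

from itertools import product

U = range(3)                                       # a small universe {0, 1, 2}
ok = True
for p_values in product((0, 1), repeat=len(U)):    # all predicates P : U -> {0, 1}
    P = dict(zip(U, p_values))
    for q in (0, 1):                               # the truth value of H
        left = min(min(P[u] for u in U), q)        # (forall x P(x)) and H
        right = min(min(P[u], q) for u in U)       # forall x (P(x) and H)
        ok = ok and (left == right)
print(ok)                                          # True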
Example 6.21. The formula ∀x ∃y (P(x, f(y)) ∨ Q(g(x), a)) is equivalent to the formula ∀u ∃v (P(u, f(v)) ∨ Q(g(u), a)) obtained by substituting x by u and y by v.

Definition 6.37. A formula in which no variable occurs both as a bound and as a free variable and in which all variables appearing after the quantifiers are distinct is said to be in rectified56 form.

By appropriately renaming quantified variables one can transform any formula into an equivalent formula in rectified form.

6.6.7 Universal Instantiation

The rule of universal instantiation, also called universal elimination, states that for any formula F and any term t, one can derive from the formula ∀x F the formula F[x/t], thus eliminating the quantifier ∀.57 This rule is justified by the following lemma.

Lemma 6.11. For any formula F and any term t we have

∀x F |= F[x/t].

There is also a rule called existential instantiation, but it is more complicated to explain.

6.6.8 Normal Forms

It is often useful to transform a formula into an equivalent formula of a specific form, called a normal form. This is analogous to the conjunctive and disjunctive normal forms for formulas in propositional logic.

Definition 6.38. A formula of the form

Q1 x1 Q2 x2 · · · Qn xn G,

where the Qi are arbitrary quantifiers (∀ or ∃) and G is a formula free of quantifiers, is said to be in prenex form58.

Theorem 6.12. For every formula there is an equivalent formula in prenex form.

Proof. One first transforms the formula into an equivalent formula in rectified form and then applies the equivalences of Lemma 6.8 to move up all quantifiers in the formula tree, resulting in a prenex form of the formula.

Example 6.22.

¬(∀x P(x, y) ∧ ∃y Q(x, y, z)) ≡ ¬(∀u P(u, y) ∧ ∃v Q(x, v, z))
  ≡ ¬∀u P(u, y) ∨ ¬∃v Q(x, v, z)
  ≡ ∃u ¬P(u, y) ∨ ¬∃v Q(x, v, z)        (1)
  ≡ ∃u ¬P(u, y) ∨ ∀v ¬Q(x, v, z)        (2)
  ≡ ∃u (¬P(u, y) ∨ ∀v ¬Q(x, v, z))      (10)
  ≡ ∃u (∀v ¬Q(x, v, z) ∨ ¬P(u, y))
  ≡ ∃u (∀v (¬Q(x, v, z) ∨ ¬P(u, y)))    (8)
  ≡ ∃u ∀v (¬Q(x, v, z) ∨ ¬P(u, y))
  ≡ ∃u ∀v (¬P(u, y) ∨ ¬Q(x, v, z)).

In the first step we have renamed the bound variables, in the second step we made use of the equivalence ¬(F ∧ G) ≡ ¬F ∨ ¬G (Lemma 6.1 8)), and then we have applied the rules of Lemma 6.8, as indicated. We have also made explicit the use of the commutative law for ∨ (Lemma 6.1 2)). In the second last step, the removal of parentheses is made explicit. The last step, again making use of Lemma 6.1 2), is included (only) to arrive at a form with the same order of occurrence of P and Q.

One can also transform every formula F into a formula G in prenex form that only contains universal quantifiers (∀). However, such a formula is in general not equivalent to F, but only equivalent with respect to satisfiability. In other words, F is satisfiable if and only if G is satisfiable. Such a normal form is called Skolem normal form. This topic is beyond the scope of this course.

6.6.9 An Example Theorem and its Interpretations

The following apparently innocent theorem is a powerful statement from which several important corollaries follow as special cases. The example illustrates that one can prove a general theorem in predicate logic and, because it is a tautology, it can then be instantiated for different structures (i.e., interpretations), for each of which it is true.

Theorem 6.13. ¬∃x ∀y (P(y, x) ↔ ¬P(y, y)).

56 German: bereinigt
57 Note that if x does not occur free in F, the statement still holds but in this case is trivial.
58 German: Pränexform
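As a sanity check (again not a proof, and not part of the notes), Theorem 6.13 can be verified by brute force for every binary predicate P on a small finite universe: no element x satisfies ∀y (P(y, x) ↔ ¬P(y, y)), since y = x already yields a contradiction. A Python sketch:

from itertools import product

U = range(3)
pairs = [(y, x) for y in U for x in U]

counterexample_found = False
for values in product((0, 1), repeat=len(pairs)):          # all predicates P : U x U -> {0, 1}
    P = dict(zip(pairs, values))
    for x in U:
        if all(P[(y, x)] == (1 - P[(y, y)]) for y in U):   # P(y, x) <-> not P(y, y)
            counterexample_found = True
print(counterexample_found)                                # False, as the theorem predicts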
… with the chapter on set theory where sets were denoted by capital letters and Russell's proposed set was called R. Here we have deviated from the convention to use only small letters for variables.
… a given input, would require a more general theorem than Theorem 6.13, but it could be explained in the same spirit.
62 This function of course depends on the concrete programming language which determines the