Sets, Logic, and Computation
Fall 2017
The Open Logic Project
Instigator
Richard Zach, University of Calgary
Editorial Board
Aldo Antonelli,† University of California, Davis
Andrew Arana, Université Paris I Panthéon-Sorbonne
Jeremy Avigad, Carnegie Mellon University
Walter Dean, University of Warwick
Gillian Russell, University of North Carolina
Nicole Wyatt, University of Calgary
Audrey Yap, University of Victoria
Contributors
Samara Burns, University of Calgary
Dana Hägg, University of Calgary
An Open Logic Text
Winter 2017
The Open Logic Project would like to acknowledge the generous
support of the Faculty of Arts and the Taylor Institute of Teaching
and Learning of the University of Calgary.
1 Sets
  1.1 Basics
  1.2 Some Important Sets
  1.3 Subsets
  1.4 Unions and Intersections
  1.5 Pairs, Tuples, Cartesian Products
  1.6 Russell's Paradox
  Summary
  Problems
2 Relations
  2.1 Relations as Sets
  2.2 Special Properties of Relations
  2.3 Orders
  2.4 Graphs
  2.5 Operations on Relations
  Summary
  Problems
3 Functions
  3.1 Basics
  3.2 Kinds of Functions
  3.3 Inverses of Functions
  3.4 Composition of Functions
  3.5 Isomorphism
  3.6 Partial Functions
  3.7 Functions and Relations
  Summary
  Problems
II First-order Logic
11 Undecidability
  11.1 Introduction
  11.2 Enumerating Turing Machines
  11.3 The Halting Problem
A Proofs
  A.1 Introduction
  A.2 Starting a Proof
  A.3 Using Definitions
  A.4 Inference Patterns
  A.5 An Example
  A.6 Another Example
  A.7 Indirect Proof
  A.8 Reading Proofs
  A.9 I can't do it!
  A.10 Other Resources
  Problems
B Induction
  B.1 Introduction
  B.2 Induction on N
  B.3 Strong Induction
  B.4 Inductive Definitions
  B.5 Structural Induction
C Biographies
  C.1 Georg Cantor
  C.2 Alonzo Church
  C.3 Gerhard Gentzen
  C.4 Kurt Gödel
  C.5 Emmy Noether
  C.6 Bertrand Russell
  C.7 Alfred Tarski
  C.8 Alan Turing
Glossary
Bibliography
Preface
² The difference between the latter four is not terribly important, but roughly: a theorem is an important result. A proposition is a result worth recording, but perhaps not as important as a theorem. A lemma is a result we mainly record only because we want to break up a proof into smaller, easier-to-manage chunks. A corollary is a result that follows easily from a theorem or proposition, such as an interesting special case.
PART I
Sets, Relations, Functions
CHAPTER 1
Sets
1.1 Basics
Sets are the most fundamental building blocks of mathematical
objects. In fact, almost every mathematical object can be seen as
a set of some kind. In logic, as in other parts of mathematics,
sets and set-theoretical talk are ubiquitous. So it will be important
to discuss what sets are, and introduce the notations necessary
to talk about sets and operations on sets in a standard way.
When we say that sets are independent of the way they are
specified, we mean that the elements of a set are all that matters.
For instance, it so happens that

{Nicole, Jacob},
{x : x is a niece or nephew of Richard}, and
{x : x is a child of Ruth}

are three ways of specifying the same set. In other words, all that matters is which elements a set has. The elements of a set are not ordered and each element occurs only once. When we specify or describe a set, elements may occur multiple times and in different orders, but any descriptions that only differ in the order of elements or in how many times elements are listed describe the same set.
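Python's built-in sets happen to behave exactly this way, which makes for a quick illustration (an aside for readers who like to experiment; the snippet is ours, not part of the formal development):

```python
# Three specifications, one set: order and repetition don't matter.
a = {1, 2, 3}
b = {3, 2, 1}        # same elements, different order
c = {1, 1, 2, 2, 3}  # repeated elements collapse

print(a == b == c)  # True
```

Equality of Python sets, like equality of sets in mathematics, is extensional: it compares only which elements are present.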
1.3 Subsets
Sets are made up of their elements, and every element of a set is a
part of that set. But there is also a sense that some of the elements
of a set taken together are a “part of” that set. For instance, the
number 2 is part of the set of integers, but the set of even numbers
is also a part of the set of integers. It’s important to keep those
two senses of being part of a set separate.
Note that a set may contain other sets, not just as subsets but
as elements! In particular, a set may happen to both be an el-
ement and a subset of another, e.g., {0} ∈ {0, {0}} and also
{0} ⊆ {0, {0}}.
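The distinction between element and subset can be seen concretely using Python's frozensets (which, unlike ordinary sets, may themselves be elements of sets); again an illustrative aside:

```python
# {0} is both an element and a subset of {0, {0}}.
inner = frozenset({0})
outer = frozenset({0, inner})

print(inner in outer)  # True: inner occurs as an element of outer
print(inner <= outer)  # True: every element of inner (namely 0) is in outer
```

The two tests use entirely different operations: `in` checks membership, `<=` checks the subset relation.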
Extensionality gives a criterion of identity for sets: X = Y iff
every element of X is also an element of Y and vice versa. The
definition of “subset” defines X ⊆ Y precisely as the first half of
this criterion: every element of X is also an element of Y . Of
course the definition also applies if we switch X and Y : Y ⊆ X
iff every element of Y is also an element of X . And that, in turn,
is exactly the “vice versa” part of extensionality. In other words,
extensionality amounts to: X = Y iff X ⊆ Y and Y ⊆ X .
℘(X) = {Y : Y ⊆ X}

X ∪ Y = {x : x ∈ X ∨ x ∈ Y}

Figure 1.1: The union X ∪ Y of two sets is the set of elements of X together with those of Y.

Figure 1.2: The intersection X ∩ Y of two sets is the set of elements they have in common.

X ∩ Y = {x : x ∈ X ∧ x ∈ Y}

Figure 1.3: The difference X \ Y of two sets is the set of those elements of X which are not also elements of Y.

X \ Y = {x : x ∈ X and x ∉ Y}.
X* = {∅} ∪ X ∪ X² ∪ X³ ∪ …
S = {x : x is a sibling of Richard}.
R = {x : x ∉ x}
exist?
If R exists, it makes sense to ask if R ∈ R or not—it must be either ∈ R or ∉ R. Suppose the former is true, i.e., R ∈ R. R was defined as the set of all sets that are not elements of themselves, and so if R ∈ R, then R does not have this defining property of R. But only sets that have this property are in R, hence, R cannot be an element of R, i.e., R ∉ R. But R can't both be and not be an element of R, so we have a contradiction.

Since the assumption that R ∈ R leads to a contradiction, we have R ∉ R. But this also leads to a contradiction! For if R ∉ R, it does have the defining property of R, and so would be an element of R just like all the other non-self-containing sets. And again, it can't both not be and be an element of R.
Summary
A set is a collection of objects, the elements of the set. We write
x ∈ X if x is an element of X . Sets are extensional—they are
completely determined by their elements. Sets are specified by
listing the elements explicitly or by giving a property the ele-
ments share (abstraction). Extensionality means that the order
or way of listing or specifying the elements of a set doesn’t mat-
ter. To prove that X and Y are the same set (X = Y ) one has to
prove that every element of X is an element of Y and vice versa.
Important sets include the natural (N), integer (Z), rational (Q), and real (R) numbers, but also strings (X*) and infinite sequences (Xω) of objects. X is a subset of Y, X ⊆ Y, if every element of X is also one of Y. The collection of all subsets of a set Y is itself a set, the power set ℘(Y) of Y. We can form the union X ∪ Y and intersection X ∩ Y of sets. An ordered pair ⟨x, y⟩ consists of two objects x and y in a fixed order; the set of all pairs ⟨x, y⟩ with x ∈ X and y ∈ Y is the Cartesian product X × Y.
Problems
Problem 1.1. Show that there is only one empty set, i.e., show
that if X and Y are sets without members, then X = Y .
CHAPTER 2
Relations
2.1 Relations as Sets
You will no doubt remember some interesting relations between objects of some of the sets we've mentioned. For instance, numbers come with an order relation < and from the theory of whole numbers the relation of divisibility without remainder (usually written n | m) may be familiar. There is also the relation is identical with that every object bears to itself and to no other thing. But there are many more interesting relations that we'll encounter, and even more possible relations. Before we review them, we'll just point out that we can look at relations as a special sort of set. For this, first recall what a pair is: if a and b are two objects, we can combine them into the ordered pair ⟨a, b⟩. Note that for ordered pairs the order does matter, e.g., ⟨a, b⟩ ≠ ⟨b, a⟩, in contrast to unordered pairs, i.e., 2-element sets, where {a, b} = {b, a}.
If X and Y are sets, then the Cartesian product X × Y of X and Y is the set of all pairs ⟨a, b⟩ with a ∈ X and b ∈ Y. In particular, X² = X × X is the set of all pairs from X.

Now consider a relation on a set, e.g., the <-relation on the set N of natural numbers, and consider the set of all pairs of numbers ⟨n, m⟩ where n < m, i.e.,

R = {⟨n, m⟩ : n, m ∈ N and n < m}.
Then there is a close connection between the number n being less than a number m and the corresponding pair ⟨n, m⟩ being an element of R.
L = {⟨0, 1⟩, ⟨0, 2⟩, …, ⟨1, 2⟩, ⟨1, 3⟩, …, ⟨2, 3⟩, ⟨2, 4⟩, …},

is the less than relation, i.e., Lnm iff n < m. The subset of pairs below the diagonal, i.e.,

G = {⟨1, 0⟩, ⟨2, 0⟩, ⟨2, 1⟩, ⟨3, 0⟩, ⟨3, 1⟩, ⟨3, 2⟩, …},

is the greater than relation, i.e., Gnm iff n > m.
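Restricted to a finite fragment of N, such a relation can be built literally as a set of pairs; a small Python sketch (ours, for illustration only):

```python
# The less than relation on {0, ..., 5}, literally as a set of pairs.
N = range(6)
L = {(n, m) for n in N for m in N if n < m}

print((2, 5) in L)  # True, since 2 < 5
print((5, 2) in L)  # False: (5, 2) lies below the diagonal
```

The comprehension mirrors the abstraction notation {⟨n, m⟩ : n, m ∈ N and n < m} almost symbol for symbol.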
2.3 Orders
Very often we are interested in comparisons between objects,
where one object may be less or equal or greater than another
in a certain respect. Size is the most obvious example of such a
comparative relation, or order. But not all such relations are alike
in all their properties. For instance, some comparative relations
require any two objects to be comparable, others don’t. (If they
do, we call them linear or total.) Some include identity (like ≤)
and some exclude it (like <). Let’s get some order into all this.
Example 2.14. Every linear order is also a partial order, and ev-
ery partial order is also a preorder, but the converses don’t hold.
For instance, the identity relation and the full relation on X are
preorders, but they are not partial orders, because they are not
anti-symmetric (if X has more than one element). For a somewhat less silly example, consider the no longer than relation ≼ on B*: x ≼ y iff len(x) ≤ len(y). This is a preorder, even a connected preorder, but not a partial order.
The relation of divisibility without remainder gives us an example of a partial order which isn't a linear order: for integers n, m, we say n (evenly) divides m, in symbols: n | m, if there is some k so that m = kn. On N, this is a partial order, but not a linear order: for instance, 2 ∤ 3 and also 3 ∤ 2. Considered as a relation on Z, divisibility is only a preorder since anti-symmetry fails: 1 | −1 and −1 | 1 but 1 ≠ −1. Another important partial order is the relation ⊆ on a set of sets.
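For readers who like to experiment, the claims just made about divisibility can be checked by brute force on an initial segment of N; a Python sketch (the encoding of | as a two-place predicate is ours):

```python
# Brute-force check on D = {1, ..., 12}: divisibility is reflexive,
# anti-symmetric, and transitive (a partial order), but not connected.
def divides(n, m):
    return m % n == 0

D = range(1, 13)
reflexive     = all(divides(n, n) for n in D)
antisymmetric = all(n == m or not (divides(n, m) and divides(m, n))
                    for n in D for m in D)
transitive    = all(not (divides(n, m) and divides(m, k)) or divides(n, k)
                    for n in D for m in D for k in D)
connected     = all(divides(n, m) or divides(m, n) or n == m
                    for n in D for m in D)

print(reflexive, antisymmetric, transitive, connected)  # True True True False
```

The failure of connectedness is witnessed by pairs such as 2 and 3: neither divides the other.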
Notice that the examples L and G from Example 2.2, although we said there that they are called "strict orders," are not linear orders, even though they are connected, since they are not reflexive. But there is a close connection, as we will see momentarily.
2.4 Graphs
A graph is a diagram in which points—called "nodes" or "vertices" (plural of "vertex")—are connected by edges. Graphs are a ubiquitous tool in discrete mathematics and in computer science. They are incredibly useful for representing, and visualizing,
relationships and structures, from concrete things like networks
of various kinds to abstract structures such as the possible out-
comes of decisions. There are many different kinds of graphs in
the literature which differ, e.g., according to whether the edges
are directed or not, have labels or not, whether there can be edges
from a node to the same node, multiple edges between the same
nodes, etc. Directed graphs have a special connection to relations.
3. The restriction R↾Y of R to Y is R ∩ Y².
Summary
A relation R on a set X is a way of relating elements of X . We
write Rxy if the relation holds between x and y. Formally, we can
Problems
Problem 2.1. List the elements of the relation ⊆ on the set
℘({a, b, c }).
CHAPTER 3
Functions
3.1 Basics
A function is a mapping which pairs each object of a given set
with a single partner in another set. For instance, the operation
of adding 1 defines a function: each number n is paired with a
unique number n + 1. More generally, functions may take pairs,
triples, etc., of inputs and return some kind of output. Many
functions are familiar to us from basic arithmetic. For instance,
addition and multiplication are functions. They take in two num-
bers and return a third. In this mathematical, abstract sense, a
function is a black box: what matters is only what output is paired
with what input, not the method for calculating the output.
this case, the codomain N is not the range of f , since the natural
number 0 is not the successor of any natural number. The range
of f is the set of all positive integers, Z+ .
Figure 3.2: A surjective function has every element of the codomain as a value.
Figure 3.3: An injective function never maps two different arguments to the
same value.
Figure 3.4: A bijective function uniquely pairs the elements of the codomain
with those of the domain.
Proof. Exercise.
3.5 Isomorphism
An isomorphism is a bijection that preserves the structure of the
sets it relates, where structure is a matter of the relationships that
obtain between the elements of the sets. Consider the following
two sets X = {1, 2, 3} and Y = {4, 5, 6}. These sets are both struc-
tured by the relations successor, less than, and greater than. An
isomorphism between the two sets is a bijection that preserves
Summary
A function f : X → Y maps every element of the domain X
to a unique element of the codomain Y . If x ∈ X , we call the y
that f maps x to the value f (x) of f for argument x. If X is a set
of pairs, we can think of the function f as taking two arguments.
The range ran(f ) of f is the subset of Y that consists of all the
values of f .
Problems
Problem 3.1. Show that if f is bijective, an inverse g of f exists,
i.e., define such a g , show that it is a function, and show that it
is an inverse of f , i.e., f (g (y)) = y and g (f (x)) = x for all x ∈ X
and y ∈ Y .
CHAPTER 4
The Size of Sets
4.1 Introduction
When Georg Cantor developed set theory in the 1870s, his inter-
est was in part to make palatable the idea of an infinite collection—
an actual infinity, as the medievals would say. Key to this reha-
bilitation of the notion of the infinite was a way to assign sizes—
“cardinalities”—to sets. The cardinality of a finite set is just a
natural number, e.g., ∅ has cardinality 0, and a set containing
five things has cardinality 5. But what about infinite sets? Do
they all have the same cardinality, ∞? It turns out, they do not.
The first important idea here is that of an enumeration. We
can list every finite set by listing all its elements. For some infinite
sets, we can also list all their elements if we allow the list itself
to be infinite. Such sets are called countable. Cantor’s surprising
result was that some infinite sets are not countable.
−⌈0/2⌉   ⌈1/2⌉   −⌈2/2⌉   ⌈3/2⌉   −⌈4/2⌉   ⌈5/2⌉   −⌈6/2⌉   …
   0       1       −1       2       −2       3       −3     …

f(n) = 0            if n = 1
f(n) = n/2          if n is even
f(n) = −(n − 1)/2   if n is odd and > 1
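The case definition of f translates directly into a few lines of Python (an illustrative sketch):

```python
def f(n):
    """The enumeration of Z indexed by Z+: 0, 1, -1, 2, -2, 3, ..."""
    if n == 1:
        return 0
    if n % 2 == 0:
        return n // 2
    return -(n - 1) // 2  # n odd and > 1

print([f(n) for n in range(1, 8)])  # [0, 1, -1, 2, -2, 3, -3]
```

Every integer appears exactly once in the resulting list, which is exactly what it means for f to enumerate Z.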
That is fine for “easy” sets. What about the set of, say, pairs
of natural numbers?
Z+ × Z+ = {⟨n, m⟩ : n, m ∈ Z+}

        1        2        3        4      …
1    ⟨1, 1⟩   ⟨1, 2⟩   ⟨1, 3⟩   ⟨1, 4⟩   …
2    ⟨2, 1⟩   ⟨2, 2⟩   ⟨2, 3⟩   ⟨2, 4⟩   …
3    ⟨3, 1⟩   ⟨3, 2⟩   ⟨3, 3⟩   ⟨3, 4⟩   …
4    ⟨4, 1⟩   ⟨4, 2⟩   ⟨4, 3⟩   ⟨4, 4⟩   …
⋮       ⋮        ⋮        ⋮        ⋮     ⋱
such an array into a one-way list? The pattern in the array below
demonstrates one way to do this:
 1    2    4    7   …
 3    5    8   …
 6    9   …
10   …
 ⋮
This pattern is called Cantor’s zig-zag method. Other patterns are
perfectly permissible, as long as they “zig-zag” through every cell
of the array. By Cantor's zig-zag method, the enumeration for Z+ × Z+ according to this scheme would be:

⟨1, 1⟩, ⟨1, 2⟩, ⟨2, 1⟩, ⟨1, 3⟩, ⟨2, 2⟩, ⟨3, 1⟩, ⟨1, 4⟩, ⟨2, 3⟩, ⟨3, 2⟩, ⟨4, 1⟩, …
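Cantor's zig-zag traversal can be sketched as a generator that walks the finite diagonals n + m = 2, 3, 4, … in turn (an illustrative Python sketch):

```python
from itertools import count, islice

def zigzag():
    """Enumerate Z+ x Z+ by Cantor's zig-zag method: each diagonal
    n + m = total is finite, so every pair is eventually reached."""
    for total in count(2):
        for n in range(1, total):
            yield (n, total - n)

print(list(islice(zigzag(), 10)))
# [(1, 1), (1, 2), (2, 1), (1, 3), (2, 2), (3, 1), (1, 4), (2, 3), (3, 2), (4, 1)]
```

The first ten pairs produced agree with the enumeration displayed above.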
What ought we do about enumerating, say, the set of ordered
triples of positive integers?
Z+ × Z+ × Z+ = {⟨n, m, k⟩ : n, m, k ∈ Z+}
We can think of Z+ × Z+ × Z+ as the Cartesian product of Z+ × Z+
and Z+ , that is,
(Z+)³ = (Z+ × Z+) × Z+ = {⟨⟨n, m⟩, k⟩ : ⟨n, m⟩ ∈ Z+ × Z+, k ∈ Z+}
and thus we can enumerate (Z+ )3 with an array by labelling one
axis with the enumeration of Z+ , and the other axis with the
enumeration of (Z+ )2 :
            1           2           3           4         …
⟨1, 1⟩   ⟨1, 1, 1⟩   ⟨1, 1, 2⟩   ⟨1, 1, 3⟩   ⟨1, 1, 4⟩   …
⟨1, 2⟩   ⟨1, 2, 1⟩   ⟨1, 2, 2⟩   ⟨1, 2, 3⟩   ⟨1, 2, 4⟩   …
⟨2, 1⟩   ⟨2, 1, 1⟩   ⟨2, 1, 2⟩   ⟨2, 1, 3⟩   ⟨2, 1, 4⟩   …
⟨1, 3⟩   ⟨1, 3, 1⟩   ⟨1, 3, 2⟩   ⟨1, 3, 3⟩   ⟨1, 3, 4⟩   …
  ⋮          ⋮           ⋮           ⋮           ⋮       ⋱
Thus, by using a method like Cantor’s zig-zag method, we may
similarly obtain an enumeration of (Z+ )3 .
       1       2       3       4     …
1    s₁(1)   s₁(2)   s₁(3)   s₁(4)   …
2    s₂(1)   s₂(2)   s₂(3)   s₂(4)   …
3    s₃(1)   s₃(2)   s₃(3)   s₃(4)   …
4    s₄(1)   s₄(2)   s₄(3)   s₄(4)   …
⋮      ⋮       ⋮       ⋮       ⋮     ⋱
The labels down the side give the number of the sequence in the list s₁, s₂, …; the numbers across the top label the elements of the individual sequences. For instance, s₁(1) is a name for whatever number, a 0 or a 1, is the first element in the sequence s₁, and so on.
Now we construct an infinite sequence, s, of 0's and 1's which cannot possibly be on this list. The definition of s will depend on the list s₁, s₂, …. Any infinite list of infinite sequences of 0's and 1's gives rise to an infinite sequence s which is guaranteed to not appear on the list.
To define s, we specify what all its elements are, i.e., we specify s(n) for all n ∈ Z+. We do this by reading down the diagonal of the array above (hence the name "diagonal method") and then changing every 1 to a 0 and every 0 to a 1. More abstractly, we define s(n) to be 0 or 1 according to whether the n-th element of the n-th sequence, sₙ(n), is 1 or 0.
Proof. We proceed in the same way, by showing that for every list of subsets of Z+ there is a subset of Z+ which cannot be on the list. Suppose the following is a given list of subsets of Z+:

Z₁, Z₂, Z₃, …

We now define a set Z such that for any n ∈ Z+, n ∈ Z iff n ∉ Zₙ:

Z = {n ∈ Z+ : n ∉ Zₙ}

Z is clearly a set of positive integers, since by assumption each Zₙ is, and thus Z ∈ ℘(Z+). But Z cannot be on the list. To show this, we'll establish that for each k ∈ Z+, Z ≠ Zₖ.

So let k ∈ Z+ be arbitrary. We've defined Z so that for any n ∈ Z+, n ∈ Z iff n ∉ Zₙ. In particular, taking n = k, k ∈ Z iff k ∉ Zₖ. But this shows that Z ≠ Zₖ, since k is an element of one but not the other, and so Z and Zₖ have different elements. Since k was arbitrary, Z is not on the list Z₁, Z₂, …
The preceding proof did not mention a diagonal, but you can think of it as involving a diagonal if you picture it this way: Imagine the sets Z₁, Z₂, …, written in an array, where each element j ∈ Zᵢ is listed in the j-th column. Say the first four sets on that list are {1, 2, 3, …}, {2, 4, 6, …}, {1, 2, 5}, and {3, 4, 5, …}. Then the array would begin with

Z₁ = {1, 2, 3, 4, 5, 6, …}
Z₂ = {   2,    4,    6, …}
Z₃ = {1, 2,       5      }
Z₄ = {      3, 4, 5, 6, …}
 ⋮
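On a finite initial segment, the diagonal construction can be carried out explicitly. A Python sketch, using the four sets above truncated to their elements below 7 (the truncation is ours, for illustration):

```python
# The diagonal construction: put n into Z exactly when n is not in Z_n.
Zs = [{1, 2, 3, 4, 5, 6}, {2, 4, 6}, {1, 2, 5}, {3, 4, 5, 6}]

Z = {n for n in range(1, len(Zs) + 1) if n not in Zs[n - 1]}
print(Z)  # {3}: only 3 fails to be in "its own" set Z_3

# Z disagrees with each Z_k about the diagonal element k,
# so Z is distinct from every set on the list.
for k, Zk in enumerate(Zs, start=1):
    assert (k in Z) != (k in Zk)
```

Lengthening the list only adds more diagonal elements for Z to disagree about, which is why the argument works for any infinite list as well.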
4.4 Reduction
We showed ℘(Z+ ) to be uncountable by a diagonalization argu-
ment. We already had a proof that Bω , the set of all infinite
sequences of 0s and 1s, is uncountable. Here’s another way we
can prove that ℘(Z+ ) is uncountable: Show that if ℘(Z+ ) is count-
able then Bω is also countable. Since we know Bω is not count-
able, ℘(Z+ ) can’t be either. This is called reducing one problem
to another—in this case, we reduce the problem of enumerat-
ing Bω to the problem of enumerating ℘(Z+ ). A solution to the
latter—an enumeration of ℘(Z+ )—would yield a solution to the
former—an enumeration of Bω .
How do we reduce the problem of enumerating a set Y to
that of enumerating a set X ? We provide a way of turning an
enumeration of X into an enumeration of Y . The easiest way to
do that is to define a surjective function f : X → Y. If x₁, x₂, … enumerates X, then f(x₁), f(x₂), … would enumerate Y. In our case, we are looking for a surjective function f : ℘(Z+) → Bω.

f(Z₁), f(Z₂), f(Z₃), …
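One such f sends each set Z ⊆ Z+ to its characteristic sequence: the n-th entry is 1 if n ∈ Z and 0 otherwise. A Python sketch of this map, producing the sequence lazily (the encoding is ours, for illustration):

```python
from itertools import islice

def f(Z):
    """Map a set Z of positive integers to its characteristic sequence:
    the n-th entry is 1 if n is in Z, and 0 otherwise."""
    n = 1
    while True:
        yield 1 if n in Z else 0
        n += 1

evens = {n for n in range(1, 100) if n % 2 == 0}
print(list(islice(f(evens), 8)))  # [0, 1, 0, 1, 0, 1, 0, 1]
```

Every sequence in Bω arises this way from exactly one set, so f is surjective (indeed bijective), which is what the reduction needs.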
Y = {x ∈ X : x ∉ g(x)}.
Summary
The size of a set X can be measured by a natural number if
the set is finite, and sizes can be compared by comparing these
numbers. If sets are infinite, things are more complicated. The
first level of infinity is that of countably infinite sets. A set X is
countable if its elements can be arranged in an enumeration, a
one-way infinite, possibly gappy list, i.e., when there is a surjective
function f : Z+ → X. It is countably infinite if it is countable but not finite. Cantor's zig-zag method shows that the set of pairs of elements of a countably infinite set is also countable; and this
can be used to show that even the set of rational numbers Q is
countable.
There are, however, infinite sets that are not countable: these
sets are called uncountable. There are two ways of showing that
a set is uncountable: directly, using a diagonal argument, or
by reduction. To give a diagonal argument, we assume that the
set X in question is countable, and use a hypothetical enumera-
tion to define an element of X which, by the very way we define
it, is guaranteed to be different from every element in the enu-
meration. So the enumeration can’t be an enumeration of all
of X after all, and we’ve shown that no enumeration of X can
exist. A reduction shows that X is uncountable by associating
every element of X with an element of some known uncountable
set Y in a surjective way. If this is possible, then a hypothetical enumeration of X would yield an enumeration of Y. Since Y is
uncountable, no enumeration of X can exist.
In general, infinite sets can be compared sizewise: X and
Y are the same size, or equinumerous, if there is a bijection
between them. We can also define that X is no larger than Y
(|X | ≤ |Y |) if there is an injective function from X to Y . By
the Schröder-Bernstein Theorem, this in fact provides a sizewise
order of infinite sets. Finally, Cantor’s theorem says that for
any X , |X | < |℘(X )|. This is a generalization of our result that
℘(Z+ ) is uncountable, and shows that there are not just two, but
infinitely many levels of infinity.
Problems
Problem 4.1. According to Definition 4.4, a set X is enumerable
iff X = ∅ or there is a surjective f : Z+ → X . It is also possible to
define "countable set" precisely by: a set is countable iff there is an injective function g : X → Z+. Show that the definitions are
equivalent, i.e., show that there is an injective function g : X →
Z+ iff either X = ∅ or there is a surjective f : Z+ → X .
Problem 4.9. Show that the set of all finite subsets of an arbitrary
infinite enumerable set is enumerable.
Problem 4.15. Show that the set of all sets of pairs of positive
integers is uncountable by a reduction argument.
Problem 4.17. Let P be the set of functions from the set of posi-
tive integers to the set {0}, and let Q be the set of partial functions
from the set of positive integers to the set {0}. Show that P is
countable and Q is not. (Hint: reduce the problem of enumerat-
ing Bω to enumerating Q ).
Problem 4.19. Show that the set R of all real numbers is un-
countable.
PART II
First-order Logic
CHAPTER 5
Syntax and Semantics
5.1 Introduction
In order to develop the theory and metatheory of first-order logic,
we must first define the syntax and semantics of its expressions.
The expressions of first-order logic are terms and formulas. Terms
are formed from variables, constant symbols, and function sym-
bols. Formulas, in turn, are formed from predicate symbols to-
gether with terms (these form the smallest, “atomic” formulas),
and then from atomic formulas we can form more complex ones
using logical connectives and quantifiers. There are many dif-
ferent ways to set down the formation rules; we give just one
possible one. Other systems will choose different symbols, will select different sets of connectives as primitive, will use parentheses
differently (or even not at all, as in the case of so-called Polish
notation). What all approaches have in common, though, is that
the formation rules define the set of terms and formulas induc-
tively. If done properly, every expression can result essentially
in only one way according to the formation rules. The induc-
tive definition resulting in expressions that are uniquely readable
means we can give meanings to these expressions using the same
method—inductive definition.
1. Logical symbols
1. ⊥ is an atomic formula.
2. A ↔ B abbreviates (A → B) ∧ (B → A).
these are just conventional abbreviations for A²₀(t₁, t₂), f²₀(t₁, t₂), A²₀(t₁, t₂), and f¹₀(t), respectively.
1. We take D to be A and D → D to be B.
2. We take A to be D → D and B to be D.
Proof. Exercise.
1. A ≡ ⊥.
Proof. Exercise.
1. A is atomic.
3. A is of the form (B ∧ C ).
4. A is of the form (B ∨ C ).
5. A is of the form (B → C ).
6. A is of the form ∀x B.
7. A is of the form ∃x B.
5.6 Subformulas
It is often useful to talk about the formulas that “make up” a
given formula. We call these its subformulas. Any formula counts
as a subformula of itself; a subformula of A other than A itself is
a proper subformula.
B is the scope of the first ∀v₀, C is the scope of ∃v₁, and D is the scope of the second ∀v₀. The first ∀v₀ binds the occurrences of v₀ in B, ∃v₁ the occurrence of v₁ in C, and the second ∀v₀ binds the occurrence of v₀ in D. The first occurrence of v₁ and the fourth occurrence of v₀ are free in A. The last occurrence of v₀ is free in D, but bound in C and A.
5.8 Substitution
1. s ≡ c: s[t/x] is just s.
3. s ≡ x: s[t/x] is t.
Example 5.24.
1. A ≡ ⊥: A[t/x] is ⊥.
1. |N| = N
2. 0^N = 0
Val^M(t) = f^M(Val^M(t₁), …, Val^M(tₙ)).

1. t ≡ c: Val^M_s(t) = c^M.

2. t ≡ x: Val^M_s(t) = s(x).

3. t ≡ f(t₁, …, tₙ): Val^M_s(t) = f^M(Val^M_s(t₁), …, Val^M_s(tₙ)).
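The three clauses of this definition translate directly into a recursive evaluator. Here is a minimal Python sketch; the encoding of terms and structures (nested tuples, dictionaries, and the particular interpretation f^M) is our own illustrative choice, not the text's official notation:

```python
def val(t, M, s):
    """Val^M_s(t): the value of term t in structure M under assignment s.
    Terms are encoded as: a constant symbol 'c' (a string), a variable
    ('var', 'x'), or an application ('f', t1, ..., tn)."""
    if isinstance(t, tuple) and t[0] == 'var':
        return s[t[1]]                    # clause 2: t is a variable x
    if isinstance(t, str):
        return M['constants'][t]          # clause 1: t is a constant c
    f, *args = t                          # clause 3: t is f(t1, ..., tn)
    return M['functions'][f](*(val(u, M, s) for u in args))

# A toy structure (ours, for illustration): a^M = 1, b^M = 2,
# and a two-place function interpreted as f^M(x, y) = min(x + y, 4).
M = {'constants': {'a': 1, 'b': 2},
     'functions': {'f': lambda x, y: min(x + y, 4)}}
s = {'x': 1}

print(val(('f', 'a', ('var', 'x')), M, s))  # f^M(a^M, s(x)) = min(1 + 1, 4) = 2
```

Note how the recursion in clause 3 mirrors the inductive definition: the value of a complex term is computed from the values of its immediate subterms.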
1. A ≡ ⊥: M, s ⊭ A.

4. A ≡ ¬B: M, s ⊨ A iff M, s ⊭ B.

5. A ≡ (B ∧ C): M, s ⊨ A iff M, s ⊨ B and M, s ⊨ C.

7. A ≡ (B → C): M, s ⊨ A iff M, s ⊭ B or M, s ⊨ C (or both).
1. |M| = {1, 2, 3, 4}
2. a^M = 1
3. b^M = 2

Then

Val^M_s(f(a, b)) = f^M(Val^M_s(a), Val^M_s(b)),

since 3 + 1 > 3. Since s(x) = 1 and Val^M_s(x) = s(x), we also have
1. A ≡ ⊥: both M, s₁ ⊭ A and M, s₂ ⊭ A.

so M, s₂ ⊨ t₁ = t₂.

2. A ≡ B ∧ C: exercise.

3. A ≡ B ∨ C: if M, s₁ ⊨ A, then M, s₁ ⊨ B or M, s₁ ⊨ C. By induction hypothesis, M, s₂ ⊨ B or M, s₂ ⊨ C, so M, s₂ ⊨ A.

4. A ≡ B → C: exercise.

5. A ≡ ∃x B: if M, s₁ ⊨ A, there is an x-variant s₁′ of s₁ so that M, s₁′ ⊨ B. Let s₂′ be the x-variant of s₂ that assigns the same thing to x as does s₁′. The free variables of B are among x₁, …, xₙ, and x. s₁′(xᵢ) = s₂′(xᵢ), since s₁′ and s₂′ are x-variants of s₁ and s₂, respectively, and by hypothesis s₁(xᵢ) = s₂(xᵢ). s₁′(x) = s₂′(x) by the way we have defined s₂′. Then the induction hypothesis applies to B and s₁′, s₂′, so M, s₂′ ⊨ B. Hence, there is an x-variant of s₂ that satisfies B, and so M, s₂ ⊨ A.

6. A ≡ ∀x B: exercise.
Proof. Exercise.
Proof. Exercise.
5.13 Extensionality
Extensionality, sometimes called relevance, can be expressed in-
formally as follows: the only thing that bears upon the satisfaction
of formula A in a structure M relative to a variable assignment s ,
are the assignments made by M and s to the elements of the
language that actually appear in A.
One immediate consequence of extensionality is that where
two structures M and M 0 agree on all the elements of the lan-
guage appearing in a sentence A and have the same domain, M
and M 0 must also agree on whether or not A itself is true.
Proof. First prove (by induction on t) that for every term, Val^M1_s(t) = Val^M2_s(t). Then prove the proposition by induction on A, making use of the claim just proved for the induction basis (where A is atomic).
Proof. By induction on t.

Val^M_s(t[t′/x]) = Val^M_s(f(t₁[t′/x], …, tₙ[t′/x]))
      by definition of t[t′/x]
   = f^M(Val^M_s(t₁[t′/x]), …, Val^M_s(tₙ[t′/x]))
      by definition of Val^M_s(f(…))
   = f^M(Val^M_s′(t₁), …, Val^M_s′(tₙ))
      by induction hypothesis
   = Val^M_s′(t)
      by definition of Val^M_s′(f(…))
Proof. Exercise.
1. A(t) ⊨ ∃x A(x)
2. ∀x A(x) ⊨ A(t)
2. Exercise.
Summary
A first-order language consists of constant, function, and predicate symbols. Function and predicate symbols take a specified number of arguments. In the language of arithmetic, e.g., we have a single constant symbol 0, one one-place function symbol ′, two two-place function symbols + and ×, and one two-place predicate symbol <. From variables and constant and function symbols we form the terms of a language. From the terms of a language together with its predicate symbols, as well as the identity symbol =, we form the atomic formulas. And in turn from them,
using the logical connectives ¬, ∨, ∧, →, ↔ and the quantifiers
∀ and ∃ we form its formulas. Since we are careful to always
include necessary parentheses in the process of forming terms
and formulas, there is always exactly one way of reading a for-
mula. This makes it possible to define things by induction on the
structure of formulas.
Occurrences of variables in formulas are sometimes governed
by a corresponding quantifier: if a variable occurs in the scope
of a quantifier it is considered bound, otherwise free. These
concepts all have inductive definitions, and we also inductively
define the operation of substitution of a term for a variable in
a formula. Formulas without free variable occurrences are called
sentences.
The semantics for a first-order language is given by a struc-
ture for that language. It consists of a domain and elements
of that domain are assigned to each constant symbol. Function
symbols are interpreted by functions and relation symbols by relations on the domain. A function from the set of variables to the
domain is a variable assignment. The relation of satisfaction
relates structures, variable assignments and formulas; M |= [s ]A
is defined by induction on the structure of A. M |= [s ]A only
depends on the interpretation of the symbols actually occurring
in A, and in particular does not depend on s if A contains no free
variables. So if A is a sentence, M |= A if M |= [s ]A for any (or
all) s .
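The inductive definition of satisfaction lends itself directly to implementation. Below is a rough sketch, not part of the text: formulas are nested Python tuples, a structure is a pair of a finite domain and an interpretation of the predicate symbols, and all names and conventions are our own (function symbols are omitted for brevity).

```python
# Hedged sketch: deciding M, s |= A for a *finite* structure M by
# recursion on the structure of the formula A.

def satisfies(M, s, A):
    """M = (domain, interp); s maps variable names to domain elements."""
    domain, interp = M
    op = A[0]
    if op == "atom":                       # ("atom", "R", ("x", "y"))
        _, R, args = A
        return tuple(s[v] for v in args) in interp[R]
    if op == "not":
        return not satisfies(M, s, A[1])
    if op == "and":
        return satisfies(M, s, A[1]) and satisfies(M, s, A[2])
    if op == "or":
        return satisfies(M, s, A[1]) or satisfies(M, s, A[2])
    if op == "forall":                     # ("forall", "x", body)
        _, x, body = A
        return all(satisfies(M, {**s, x: d}, body) for d in domain)
    if op == "exists":
        _, x, body = A
        return any(satisfies(M, {**s, x: d}, body) for d in domain)
    raise ValueError(op)

# a structure with domain {1, 2, 3} and < interpreted as the usual order:
M = ({1, 2, 3}, {"<": {(a, b) for a in (1, 2, 3) for b in (1, 2, 3) if a < b}})
A = ("forall", "x", ("exists", "y", ("atom", "<", ("x", "y"))))
print(satisfies(M, {}, A))  # False: 3 has no <-successor in the domain
```

As the summary says, the result depends only on the symbols occurring in A, so a sentence can be evaluated with the empty assignment.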
CHAPTER 5. SYNTAX AND SEMANTICS 94
Problems
Problem 5.1. Prove Lemma 5.10.
Problem 5.2. Prove Proposition 5.11 (Hint: Formulate and prove
a version of Lemma 5.10 for terms.)
Problem 5.3. Give an inductive definition of the bound variable
occurrences along the lines of Definition 5.17.
Problem 5.4. Is N, the standard model of arithmetic, covered?
Explain.
Problem 5.5. Let L = {c, f , A} with one constant symbol, one
one-place function symbol and one two-place predicate symbol,
and let the structure M be given by
1. |M| = {1, 2, 3}
2. c M = 3
1. A ≡ ⊥: not M |= A.
Theories and
Their Models
6.1 Introduction
The development of the axiomatic method is a significant achieve-
ment in the history of science, and is of special importance in the
history of mathematics. An axiomatic development of a field in-
volves the clarification of many questions: What is the field about?
What are the most fundamental concepts? How are they related?
Can all the concepts of the field be defined in terms of these
fundamental concepts? What laws do, and must, these concepts
obey?
The axiomatic method and logic were made for each other.
Formal logic provides the tools for formulating axiomatic theo-
ries, for proving theorems from the axioms of the theory in a
precisely specified way, for studying the properties of all systems
satisfying the axioms in a systematic way.
CHAPTER 6. THEORIES AND THEIR MODELS 98
{ ∀x x ≤ x,
∀x ∀y ((x ≤ y ∧ y ≤ x) → x = y),
∀x ∀y ∀z ((x ≤ y ∧ y ≤ z ) → x ≤ z ) }
∀x ¬x < x,
∀x ∀y ((x < y ∨ y < x) ∨ x = y),
∀x ∀y ∀z ((x < y ∧ y < z ) → x < z )
∀x (x · 1) = x
∀x ∀y ∀z (x · (y · z )) = ((x · y) · z )
∀x ∃y (x · y) = 1
¬∃x x′ = 0
∀x ∀y (x′ = y′ → x = y)
∀x ∀y (x < y ↔ ∃z (x + z′ = y))
∀x (x + 0) = x
∀x ∀y (x + y′) = (x + y)′
∀x (x × 0) = 0
∀x ∀y (x × y′) = ((x × y) + x)
Since there are infinitely many sentences of the latter form, this
axiom system is infinite. The latter form is called the induction
schema. (Actually, the induction schema is a bit more complicated
than we let on here.)
The third axiom is an explicit definition of <.
∃x ¬∃y y ∈ x
∀x ∀y (∀z (z ∈ x ↔ z ∈ y) → x = y)
∀x ∀y ∃z ∀u (u ∈ z ↔ (u = x ∨ u = y))
∀x ∃y ∀z (z ∈ y ↔ ∃u (z ∈ u ∧ u ∈ x))
∃x ∀y (y ∈ x ↔ A(y))
The first axiom says that there is a set with no elements (i.e., ∅
exists); the second says that sets are extensional; the third that
for any sets X and Y , the set {X,Y } exists; the fourth that for
any sets X and Y , the set X ∪ Y exists.
The sentences mentioned last are collectively called the naive
comprehension scheme. It essentially says that for every A(x), the set
{x : A(x)} exists—so at first glance a true, useful, and perhaps
even necessary axiom. It is called “naive” because, as it turns out,
it makes this theory unsatisfiable: if you take A(y) to be ¬y ∈ y,
you get the sentence
∃x ∀y (y ∈ x ↔ ¬y ∈ y)
∀x P (x, x),
∀x ∀y ((P (x, y) ∧ P (y, x)) → x = y),
∀x ∀y ∀z ((P (x, y) ∧ P (y, z )) → P (x, z )),
∀z (z ∈ x → z ∈ y)
∃x (¬∃y y ∈ x ∧ ∀z x ⊆ z )
∀u ((u ∈ x ∨ u ∈ y) ↔ u ∈ z )
∀u (u ⊆ x ↔ u ∈ y)
since the elements of X ∪ Y are exactly the sets that are either
elements of X or elements of Y , and the elements of ℘(X ) are
exactly the subsets of X . However, this doesn’t allow us to use
x ∪ y or ℘(x) as if they were terms: we can only use the entire
formulas that define the relations X ∪ Y = Z and ℘(X ) = Y . In
fact, we do not know that these relations are ever satisfied, i.e.,
we do not know that unions and power sets always exist. For
instance, the sentence ∀x ∃y ℘(x) = y is another axiom of ZFC
(the power set axiom).
Now what about talk of ordered pairs or functions? Here we
have to explain how we can think of ordered pairs and functions
as special kinds of sets. One way to define the ordered pair ⟨x, y⟩
is as the set {{x }, {x, y }}. But like before, we cannot introduce
a function symbol that names this set; we can only define the
relation ⟨x, y⟩ = z , i.e., {{x }, {x, y }} = z :
∀u (u ∈ z ↔ (∀v (v ∈ u ↔ v = x) ∨ ∀v (v ∈ u ↔ (v = x ∨ v = y))))
This says that the elements u of z are exactly those sets which
either have x as their only element or have x and y as their only
elements (in other words, those sets that are either identical to
{x } or identical to {x, y }). Once we have this, we can say further
things, e.g., that X × Y = Z :
∀z (z ∈ Z ↔ ∃x ∃y (x ∈ X ∧ y ∈ Y ∧ ⟨x, y⟩ = z ))
∀u (u ∈ f → ∃x ∃y (x ∈ X ∧ y ∈ Y ∧ ⟨x, y⟩ = u)) ∧
∀x (x ∈ X → (∃y (y ∈ Y ∧ maps(f , x, y)) ∧
(∀y ∀y′ ((maps(f , x, y) ∧ maps(f , x, y′)) → y = y′)))
f : X → Y ∧ ∀x ∀x′ ((x ∈ X ∧ x′ ∈ X ∧
∃y (maps(f , x, y) ∧ maps(f , x′, y))) → x = x′)
One might think that set theory requires another axiom that
guarantees the existence of a set for every defining property. If
A(x) is a formula of set theory with the variable x free, we can
consider the sentence
∃y ∀x (x ∈ y ↔ A(x)).
This sentence states that there is a set y whose elements are all
and only those x that satisfy A(x). This schema is called the
“comprehension principle.” It looks very useful; unfortunately
it is inconsistent. Take A(x) ≡ ¬x ∈ x, then the comprehension
principle states
∃y ∀x (x ∈ y ↔ x ∉ x),
i.e., it states the existence of a set of all sets that are not elements
of themselves. No such set can exist—this is Russell’s Paradox.
ZFC, in fact, contains a restricted—and consistent—version of
this principle, the separation principle:
∀z ∃y ∀x (x ∈ y ↔ (x ∈ z ∧ A(x))).
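Russell's argument can be made concrete for finite collections of (well-founded) sets. The sketch below is our own illustration, not part of the text; note that in Python every frozenset x satisfies x ∉ x, so r collects all of U.

```python
def russell_set(U):
    """r = {x in U : x not in x}; if r were in U, r in r iff r not in r."""
    return frozenset(x for x in U if x not in x)

# U = {emptyset, {emptyset}}:
U = {frozenset(), frozenset({frozenset()})}
r = russell_set(U)
print(r in U)  # False: the "Russell set" over U always escapes U
```

This is, in miniature, what separation permits: r is a perfectly good subset cut out of the collection U by a formula, but nothing forces r to be one of the sets already in U.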
A≥n ≡ ∃x1 ∃x2 . . . ∃xn (x1 ≠ x2 ∧ x1 ≠ x3 ∧ x1 ≠ x4 ∧ · · · ∧ x1 ≠ xn ∧
                        x2 ≠ x3 ∧ x2 ≠ x4 ∧ · · · ∧ x2 ≠ xn ∧
                        ⋮
                        xn−1 ≠ xn)

A=n ≡ ∃x1 ∃x2 . . . ∃xn (x1 ≠ x2 ∧ x1 ≠ x3 ∧ x1 ≠ x4 ∧ · · · ∧ x1 ≠ xn ∧
                        x2 ≠ x3 ∧ x2 ≠ x4 ∧ · · · ∧ x2 ≠ xn ∧
                        ⋮
                        xn−1 ≠ xn ∧
                        ∀y (y = x1 ∨ · · · ∨ y = xn))
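The cardinality formulas A≥n and A=n can be brute-forced over a finite domain in just the way a structure would satisfy them: by searching for witnesses x1, . . . , xn that are pairwise distinct. A hedged sketch, with our own function names:

```python
from itertools import permutations

def at_least_n(domain, n):
    # mirror of A>=n: some n-tuple of pairwise distinct elements exists
    return any(True for _ in permutations(domain, n))

def exactly_n(domain, n):
    # A=n: at least n elements, and not at least n+1
    return at_least_n(domain, n) and not at_least_n(domain, n + 1)

print(at_least_n({"a", "b", "c"}, 2))  # True
print(exactly_n({"a", "b", "c"}, 3))   # True
```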
Summary
Sets of sentences in a sense describe the structures in which they
are jointly true; these structures are their models. Conversely, if
we start with a structure or set of structures, we might be inter-
ested in the set of sentences they are models of, this is the theory
of the structure or set of structures. Any such set of sentences has
the property that every sentence entailed by them is already in
the set; they are closed. More generally, we call a set Γ a theory
if it is closed under entailment, and say Γ is axiomatized by ∆
if Γ consists of all sentences entailed by ∆.
Mathematics yields many examples of theories, e.g., the the-
ories of linear orders, of groups, or theories of arithmetic, e.g.,
the theory axiomatized by Peano’s axioms. But there are many
examples of important theories in other disciplines as well, e.g.,
relational databases may be thought of as theories, and meta-
physics concerns itself with theories of parthood which can be
axiomatized.
One significant question when setting up a theory for study is
whether its language is expressive enough to allow us to formu-
late everything we want the theory to talk about, and another is
whether it is strong enough to prove what we want it to prove. To
express a relation we need a formula with the requisite number
of free variables. In set theory, we only have ∈ as a relation sym-
bol, but it allows us to express x ⊆ y using ∀u (u ∈ x → u ∈ y).
Zermelo-Fraenkel set theory ZFC, in fact, is strong enough to
both express (almost) every mathematical claim and to (almost)
prove every mathematical theorem using a handful of axioms and
a chain of increasingly complicated definitions such as that of ⊆.
Problems
Problem 6.1. Find formulas in LA which define the following
relations:
1. n is between i and j ;
1. the inverse R −1 of R;
1. {0} is definable in N;
2. {1} is definable in N;
3. {2} is definable in N;
∃y ∀x (x ∈ y ↔ x ∉ x) ⊢ ⊥.
Natural
Deduction
7.1 Introduction
Logical systems commonly have not just a semantics, but also
proof systems. The purpose of proof systems is to provide a
purely syntactic method of establishing entailment and validity.
They are purely syntactic in the sense that a derivation in such
a system is a finite syntactic object, usually a sequence (or other
finite arrangement) of formulas. Moreover, good proof systems
have the property that any given sequence or arrangement of for-
mulas can be verified mechanically to be a “correct” proof. The
simplest (and historically first) proof systems for first-order logic
were axiomatic. A sequence of formulas counts as a derivation
in such a system if each individual formula in it is either among
a fixed set of “axioms” or follows from formulas coming before it
in the sequence by one of a fixed number of “inference rules”—
and it can be mechanically verified if a formula is an axiom and
whether it follows correctly from other formulas by one of the in-
ference rules. Axiomatic proof systems are easy to describe—and
also easy to handle meta-theoretically—but derivations in them
are hard to read and understand, and are also hard to produce.
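The mechanical checkability of axiomatic derivations is easy to make precise. Here is a toy checker, our own sketch rather than anything from the text: formulas are opaque Python values, the only inference rule is modus ponens, and conditionals are represented as tuples ("->", A, B).

```python
def is_derivation(seq, axioms):
    """Each entry must be an axiom or follow from earlier entries by MP."""
    for i, A in enumerate(seq):
        if A in axioms:
            continue
        earlier = seq[:i]
        if any(("->", B, A) in earlier and B in earlier for B in earlier):
            continue  # A follows by modus ponens from B and B -> A
        return False
    return True

axioms = {"p", ("->", "p", "q")}
print(is_derivation(["p", ("->", "p", "q"), "q"], axioms))  # True
print(is_derivation(["q"], axioms))                         # False
```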
CHAPTER 7. NATURAL DEDUCTION 114
The rules for natural deduction are divided into two main
types: propositional rules (quantifier-free) and quantifier rules. The
rules come in pairs: an introduction and an elimination rule for
each connective and quantifier.
Propositional Rules

Rules for ∧:

∧Intro: from the premises A and B, infer A ∧ B.
∧Elim: from the premise A ∧ B, infer A; likewise, from the
premise A ∧ B, infer B.

Rules for ∨:

∨Intro: from the premise A, infer A ∨ B; likewise, from the
premise B, infer A ∨ B.
∨Elim: from the premises A ∨ B, C , and C , where the second
premise C is derived from the assumption [A]n and the third
from the assumption [B]n, infer C , discharging those assump-
tions with label n.

Rules for ¬:

¬Intro: from a derivation of ⊥ from the assumption [A]n, infer
¬A, discharging the assumption.
¬Elim: from the premises ¬A and A, infer ⊥.

Rules for →:

→Elim: from the premises A → B and A, infer B.
→Intro: from a derivation of B from the assumption [A]n, infer
A → B, discharging the assumption.
7.2. RULES AND DERIVATIONS 117
Rules for ⊥:

⊥I: from the premise ⊥, infer any formula A.
⊥C: from a derivation of ⊥ from the assumption [¬A]n, infer A,
discharging the assumption.
Quantifier Rules

Rules for ∀:

∀Intro: from the premise A(a), infer ∀x A(x).
∀Elim: from the premise ∀x A(x), infer A(t ).

where t is a ground term (a term that does not contain any vari-
ables), and a is a constant symbol which does not occur in A, or
in any assumption which is undischarged in the derivation end-
ing with the premise A(a). We call a the eigenvariable of the ∀Intro
inference.

Rules for ∃:

∃Intro: from the premise A(t ), infer ∃x A(x).
∃Elim: from the premises ∃x A(x) and C , where C is derived
from the assumption [A(a)]n, infer C , discharging the assump-
tion.

where t is a ground term, and a is a constant which does not
occur in the premise ∃x A(x), in C , or any assumption which is
undischarged in the derivations ending with the two premises
(other than the assumptions A(a)). We call a the eigenvariable of
the ∃Elim inference.
The condition that an eigenvariable neither occur in the premises
nor in any assumption that is undischarged in the derivations
leading to the premises for the ∀Intro or ∃Elim inference is called
the eigenvariable condition.
We use the term “eigenvariable” even though a in the above
rules is a constant. This has historical reasons.
In ∃Intro and ∀Elim there are no restrictions, and the term t
can be anything, so we do not have to worry about any condi-
tions. On the other hand, in the ∃Elim and ∀Intro rules, the
    [A ∧ B]¹
        ⋮
        A
    ----------- →Intro, discharging 1
    (A ∧ B) → A

The missing step is supplied by ∧Elim:

    [A ∧ B]¹
    -------- ∧Elim
        A
    ----------- →Intro, discharging 1
    (A ∧ B) → A
    [¬A ∨ B]¹
        ⋮
      A → B
    ------------------- →Intro, discharging 1
    (¬A ∨ B) → (A → B)
This leaves us with two possibilities to continue. Either we
can keep working from the bottom up and look for another ap-
plication of the →Intro rule, or we can work from the top down
and apply a ∨Elim rule. Let us apply the latter. We will use the
assumption ¬A ∨ B as the leftmost premise of ∨Elim. For a valid
application of ∨Elim, the other two premises must be identical
to the conclusion A → B, but each may be derived in turn from
another assumption, namely the two disjuncts of ¬A ∨ B. So our
derivation will look like this:
          [¬A]²              [B]²
            ⋮                  ⋮
            B                  B
        ------- →Intro, 3  ------- →Intro, 4
    [¬A ∨ B]¹   A → B       A → B
    --------------------------------- ∨Elim, discharging 2
                 A → B
        ------------------- →Intro, discharging 1
        (¬A ∨ B) → (A → B)
For the two missing parts of the derivation, we need deriva-
tions of B from ¬A and A in the middle, and from A and B on the
7.3. EXAMPLES OF DERIVATIONS 121
left. Let’s take the former first. ¬A and A are the two premises of
¬Elim:
    [¬A]²   [A]³
    ------------ ¬Elim
         ⊥
         ⋮
         B
By using ⊥I , we can obtain B as a conclusion and complete the
branch.
    [¬A]²   [A]³
    ------------ ¬Elim          [B]², [A]⁴
         ⊥                          ⋮
       ----- ⊥I                     B
         B
      ------- →Intro, 3          ------- →Intro, 4
    [¬A ∨ B]¹   A → B             A → B
    ------------------------------------ ∨Elim, discharging 2
                  A → B
        ------------------- →Intro, discharging 1
        (¬A ∨ B) → (A → B)
    [¬A]²   [A]³
    ------------ ¬Elim
         ⊥
       ----- ⊥I
         B
      ------- →Intro, 3            [B]²
    [¬A ∨ B]¹   A → B            ------- →Intro, 4
                                  A → B
    ------------------------------------ ∨Elim, discharging 2
                  A → B
        ------------------- →Intro, discharging 1
        (¬A ∨ B) → (A → B)
    [¬(A ∨ ¬A)]¹
         ⋮
         ⊥
    --------- ⊥C, discharging 1
     A ∨ ¬A
Now we’re looking for a derivation of ⊥ from ¬(A ∨ ¬A). Since
⊥ is the conclusion of ¬Elim we might try that:
    ¬A     A
    --------- ¬Elim
        ⊥
    --------- ⊥C, discharging 1
     A ∨ ¬A
Our strategy for finding a derivation of ¬A calls for an application
of ¬Intro:
      [A]²
        ⋮
        ⊥
    ------ ¬Intro, discharging 2
      ¬A       A
    ------------- ¬Elim
          ⊥
      --------- ⊥C, discharging 1
       A ∨ ¬A
Here, we can get ⊥ easily by applying ¬Elim to the assumption
¬(A ∨ ¬A) and A ∨ ¬A which follows from our new assumption A
by ∨Intro:
            [A]²                             [¬A]³
          -------- ∨Intro                  -------- ∨Intro
    [¬(A ∨ ¬A)]¹   A ∨ ¬A        [¬(A ∨ ¬A)]¹   A ∨ ¬A
    ---------------------- ¬Elim  ---------------------- ¬Elim
            ⊥                                ⊥
        ------ ¬Intro, 2                 ------ ⊥C, 3
          ¬A                                A
          ----------------------------------- ¬Elim
                         ⊥
                     --------- ⊥C, discharging 1
                      A ∨ ¬A
      ¬∀x A(x)
    --------------------- →Intro
    ∃x ¬A(x) → ¬∀x A(x)
Since there is no obvious rule to apply to ¬∀x A(x), we will pro-
ceed by setting up the derivation so we can use the ∃Elim rule.
                   [¬A(a)]²
                       ⋮
                       ⊥
                  ---------- ¬Intro, 3
    [∃x ¬A(x)]¹    ¬∀x A(x)
    ------------------------ ∃Elim, discharging 2
           ¬∀x A(x)
    --------------------- →Intro, discharging 1
    ∃x ¬A(x) → ¬∀x A(x)
It looks like we are close to getting a contradiction. The easiest
rule to apply is the ∀Elim, which has no eigenvariable conditions.
Since we can use any term we want to replace the universally
quantified x, it makes the most sense to continue using a so we
can reach a contradiction.
                 [∀x A(x)]³
                 ---------- ∀Elim
       [¬A(a)]²     A(a)
       ----------------- ¬Elim
               ⊥
          ---------- ¬Intro, discharging 3
    [∃x ¬A(x)]¹    ¬∀x A(x)
    ------------------------ ∃Elim, discharging 2
           ¬∀x A(x)
    --------------------- →Intro, discharging 1
    ∃x ¬A(x) → ¬∀x A(x)
∃x C (x, b)
We have two premises to work with. To use the first, i.e., try
to find a derivation of ∃x C (x, b) from ∃x (A(x) ∧ B(x)) we would
use the ∃Elim rule. Since it has an eigenvariable condition, we
will apply that rule first. We get the following:
    [A(a) ∧ B(a)]¹
    -------------- ∧Elim
         B(a)
¬∀x A(x)
tradiction.
    [∀x A(x)]¹
        ⋮
        ⊥
    --------- ¬Intro, discharging 1
    ¬∀x A(x)
So far so good. We can use ∀Elim but it’s not obvious if that will
help us get to our goal. Instead, let’s use one of our assumptions.
∀x A(x) → ∃y B(y) together with ∀x A(x) will allow us to use the
→Elim rule.
        ⋮
        ⊥
    --------- ¬Intro, discharging 1
    ¬∀x A(x)
We now have one final assumption to work with, and it looks like
this will help us reach a contradiction by using ¬Elim.
    Γ, [¬A]¹
       δ₁
       ⊥
    ------ ⊥C, discharging 1
       A
7.4. PROOF-THEORETIC NOTIONS 129
Proof. Exercise.
    ∆₁, [A]¹
       δ₀
       B
    ------- →Intro, discharging 1
     A → B
      Γ          Γ
      δ₂         δ₃
    A → B        A
    --------------- →Elim
          B
This shows Γ ⊢ B.
    Γ, [A]¹
       δ₂
       ⊥
    ------ ¬Intro, discharging 1     Γ
      ¬A                             δ₁
                                     A
    --------------------------------- ¬Elim
                   ⊥
In the new derivation, the assumption A is discharged, so it is
a derivation from Γ.
    Γ, [A]¹
       ⋮
       ⊥
    ------ ¬Intro, discharging 1
      ¬A
This shows that Γ ⊢ ¬A.
Conversely, suppose Γ ⊢ ¬A by a derivation δ1 . Then
       Γ
       δ₁
      ¬A     A
    ----------- ¬Elim
         ⊥
shows that Γ ∪ {A} is inconsistent.
       Γ
       δ
      ¬A     A
    ----------- ¬Elim
         ⊥
    Γ, [¬A]²                 Γ, [A]¹
       δ₂                       δ₁
       ⊥                        ⊥
    ------ ¬Intro, 2         ------ ¬Intro, 1
      ¬¬A                      ¬A
    --------------------------------- ¬Elim
                   ⊥
2. A, B ⊢ A ∧ B.
    A ∧ B            A ∧ B
    ----- ∧Elim      ----- ∧Elim
      A                B
2. We can derive:
      A     B
    --------- ∧Intro
      A ∧ B
2. Both A ⊢ A ∨ B and B ⊢ A ∨ B.
           ¬A   [A]¹         ¬B   [B]¹
           --------- ¬Elim   --------- ¬Elim
    A ∨ B      ⊥                 ⊥
    ----------------------------------- ∨Elim, discharging 1
                    ⊥

      A               B
    ------ ∨Intro   ------ ∨Intro
    A ∨ B           A ∨ B
Proposition 7.25. 1. A, A → B ⊢ B.
2. Both ¬A ⊢ A → B and B ⊢ A → B.
Proof. 1. We can derive:
    A → B     A
    ------------ →Elim
         B
2. This is shown by the following two derivations:
    ¬A   [A]¹
    ---------- ¬Elim
        ⊥
      ----- ⊥I
        B
    -------- →Intro, discharging 1
     A → B

        B
    -------- →Intro
     A → B
Note that →Intro may, but does not have to, discharge the
assumption A.
2. ∀x A(x) ⊢ A(t ).
      A(t )
    --------- ∃Intro
    ∃x A(x)
2. The following is a derivation of A(t ) from ∀x A(x):
    ∀x A(x)
    -------- ∀Elim
      A(t )
7.6. SOUNDNESS 135
7.6 Soundness
A derivation system, such as natural deduction, is sound if it
cannot derive things that do not actually follow. Soundness is
thus a kind of guaranteed safety property for derivation systems.
Depending on which proof theoretic property is in question, we
would like to know for instance, that
Γ, [A]n
δ1
⊥
n ¬Intro
¬A
Γ
δ1
A∧B
∧Elim
A
Γ
δ1
A
∨Intro
A∨B
Γ, [A]n
δ1
B
n →Intro
A→B
Γ
δ1
⊥ ⊥
I
A
Γ
δ1
A(a)
∀Intro
∀x A(x)
Γ1 Γ2
δ1 δ2
A B
∧Intro
A∧B
By induction hypothesis, A follows from the undischarged
assumptions Γ1 of δ1 and B follows from the undischarged
assumptions Γ2 of δ2 . The undischarged assumptions of δ
are Γ1 ∪ Γ2 , so we have to show that Γ1 ∪ Γ2 ⊨ A ∧ B. Consider
a structure M with M |= Γ1 ∪ Γ2 . Since M |= Γ1 , it must be
the case that M |= A as Γ1 ⊨ A, and since M |= Γ2 , M |= B
since Γ2 ⊨ B. Together, M |= A ∧ B.
Γ1 Γ2
δ1 δ2
A→B A
→Elim
B
By induction hypothesis, A → B follows from the undis-
charged assumptions Γ1 of δ1 and A follows from the undis-
charged assumptions Γ2 of δ2 . Consider a structure M. We
need to show that, if M |= Γ1 ∪ Γ2 , then M |= B. Suppose
M |= Γ1 ∪ Γ2 . Since Γ1 ⊨ A → B, M |= A → B. Since
Rules for =:

=Intro: infer t = t (from no premises).
=Elim: from the premises t1 = t2 and A(t1 ), infer A(t2 ); like-
wise, from the premises t1 = t2 and A(t2 ), infer A(t1 ).
∀x ∀y ((A(x) ∧ A(y)) → x = y)
∃x ∀y (A(y) → y = x)
        [A(a) ∧ A(b)]¹
              ⋮
            a = b
    ------------------------ →Intro, discharging 1
    ((A(a) ∧ A(b)) → a = b)
    ------------------------------ ∀Intro
    ∀y ((A(a) ∧ A(y)) → a = y)
    ------------------------------ ∀Intro
    ∀x ∀y ((A(x) ∧ A(y)) → x = y)
We’ll now have to use the main assumption: since it is an existen-
tial formula, we use ∃Elim to derive the intermediary conclusion
a = b.
                            [∀y (A(y) → y = c )]²   [A(a) ∧ A(b)]¹
                                        ⋮
    ∃x ∀y (A(y) → y = x)              a = b
    --------------------------------------- ∃Elim, discharging 2
                  a = b
    ------------------------ →Intro, discharging 1
    ((A(a) ∧ A(b)) → a = b)
    ------------------------------ ∀Intro
    ∀y ((A(a) ∧ A(y)) → a = y)
    ------------------------------ ∀Intro
    ∀x ∀y ((A(x) ∧ A(y)) → x = y)
       Γ₁          Γ₂
       δ₁          δ₂
    t1 = t2      A(t1 )
    ------------------- =Elim
          A(t2 )
The premises t1 = t2 and A(t1 ) are derived from undischarged
assumptions Γ1 and Γ2 , respectively. We want to show that A(t2 )
follows from Γ1 ∪ Γ2 . Consider a structure M with M |= Γ1 ∪ Γ2 .
By induction hypothesis, M |= A(t1 ) and M |= t1 = t2 . Therefore,
ValM (t1 ) = ValM (t2 ). Let s be any variable assignment, and s′ be
the x-variant given by s′(x) = ValM (t1 ) = ValM (t2 ). By Proposi-
tion 5.46, M, s |= A(t1 ) iff M, s′ |= A(x) iff M, s |= A(t2 ). Since
M |= A(t1 ), we have M |= A(t2 ).
7.8. SOUNDNESS WITH IDENTITY PREDICATE 143
Summary
Proof systems provide purely syntactic methods for characteriz-
ing consequence and compatibility between sentences. Natural
deduction is one such proof system. A derivation in it consists
of a tree of formulas. The topmost formulas in a derivation are
assumptions. All other formulas, for the derivation to be cor-
rect, must be correctly justified by one of a number of inference
rules. These come in pairs; an introduction and an elimination
rule for each connective and quantifier. For instance, if a for-
mula A is justified by a →Elim rule, the preceding formulas (the
premises) must be B → A and B (for some B). Some inference
rules also allow assumptions to be discharged. For instance, if
A → B is inferred from B using →Intro, any occurrences of A as
assumptions in the derivation leading to the premise B may be
discharged, given a label that is also recorded at the inference.
If there is a derivation with end formula A and all assumptions
are discharged, we say A is a theorem and write ⊢ A. If all undis-
charged assumptions are in some set Γ, we say A is derivable
from Γ and write Γ ⊢ A. If Γ ⊢ ⊥ we say Γ is inconsistent, oth-
erwise consistent. These notions are interrelated, e.g., Γ ⊢ A iff
Γ ∪ {¬A} is inconsistent. They are also related to the correspond-
ing semantic notions, e.g., if Γ ⊢ A then Γ ⊨ A. This property
of proof systems—what can be derived from Γ is guaranteed to
be entailed by Γ—is called soundness. The soundness theo-
rem is proved by induction on the length of derivations, showing
that each individual inference preserves entailment of its conclu-
sion from open assumptions provided its premises are entailed
by their open assumptions.
Problems
Problem 7.1. Give derivations of the following formulas:
1. ¬(A → B) → (A ∧ ¬B)
The
Completeness
Theorem
8.1 Introduction
The completeness theorem is one of the most fundamental re-
sults about logic. It comes in two formulations, the equivalence
of which we’ll prove. In its first formulation it says something fun-
damental about the relationship between semantic consequence
and our proof system: if a sentence A follows from some sen-
tences Γ, then there is also a derivation that establishes Γ ⊢ A.
Thus, the proof system is as strong as it can possibly be without
proving things that don’t actually follow. In its second formula-
tion, it can be stated as a model existence result: every consistent
set of sentences is satisfiable.
These aren’t the only reasons the completeness theorem—or
rather, its proof—is important. It has a number of important con-
sequences, some of which we’ll discuss separately. For instance,
since any derivation that shows Γ ⊢ A is finite and so can only
CHAPTER 8. THE COMPLETENESS THEOREM 146
has the property that it contains ∃x A(x) iff it contains A(t ) for
some closed term t and ∀x A(x) iff it contains A(t ) for all closed
terms t (Proposition 8.7). We’ll then take the saturated consistent
set Γ 0 and show that it can be extended to a saturated, consistent,
and complete set Γ ∗ (Lemma 8.8). This set Γ ∗ is what we’ll use
to define our term model M(Γ ∗ ). The term model has the set of
closed terms as its domain, and the interpretation of its predicate
symbols is given by the atomic sentences in Γ ∗ (Definition 8.9).
We’ll use the properties of consistent, complete, saturated sets to
show that indeed M(Γ ∗ ) |= A iff A ∈ Γ ∗ (Lemma 8.11), and thus
in particular, M(Γ ∗ ) |= Γ. Finally, we’ll consider how to define
a term model if Γ contains = as well (Definition 8.15) and show
that it satisfies Γ ∗ (Lemma 8.17).
1. If Γ ⊢ A, then A ∈ Γ.
3. A ∨ B ∈ Γ iff either A ∈ Γ or B ∈ Γ.
Γ0 = Γ
Γn+1 = Γn ∪ {D n }
Γn ⊢ ∃xn An (xn )    Γn ⊢ ¬An (cn )
We’ll now show that complete, consistent sets which are satu-
rated have the property that they contain a universally quantified
8.5. LINDENBAUM’S LEMMA 153
2. Exercise.
Let Γ∗ = ⋃n≥0 Γn .
Each Γn is consistent: Γ0 is consistent by definition. If Γn+1 =
Γn ∪ {An }, this is because the latter is consistent. If it isn’t,
Γn+1 = Γn ∪ {¬An }. We have to verify that Γn ∪ {¬An } is con-
sistent. Suppose it’s not. Then both Γn ∪ {An } and Γn ∪ {¬An }
are inconsistent. This means that Γn would be inconsistent by
Proposition 7.21, contrary to the induction hypothesis.
Every finite subset of Γ ∗ is a subset of Γn for some n, since
each B ∈ Γ ∗ not already in Γ is added at some stage i . If n is
the last one of these, then all B in the finite subset are in Γn . So,
every finite subset of Γ ∗ is consistent. By Proposition 7.18, Γ ∗ is
consistent.
Every sentence of Frm(L′) appears on the list used to de-
fine Γ ∗ . If An ∉ Γ ∗ , then that is because Γn ∪ {An } was inconsis-
tent. But then ¬An ∈ Γ ∗ , so Γ ∗ is complete.
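A propositional analogue of the Lindenbaum construction can be carried out by machine, with truth-table satisfiability standing in for consistency. This is only an illustrative sketch with our own names, not the construction used in the proof:

```python
from itertools import product

def satisfiable(formulas, atoms):
    """Brute-force: some valuation of the atoms makes all formulas true."""
    def val(F, v):
        if isinstance(F, str):
            return v[F]
        if F[0] == "not":
            return not val(F[1], v)
        if F[0] == "and":
            return val(F[1], v) and val(F[2], v)
        return val(F[1], v) or val(F[2], v)      # "or"
    return any(all(val(F, dict(zip(atoms, bits))) for F in formulas)
               for bits in product([True, False], repeat=len(atoms)))

def lindenbaum(gamma, enumeration, atoms):
    """Add each formula if that stays satisfiable, else add its negation."""
    g = list(gamma)
    for A in enumeration:
        g.append(A if satisfiable(g + [A], atoms) else ("not", A))
    return g

star = lindenbaum([("or", "p", "q")], ["p", "q"], ["p", "q"])
print(satisfiable(star, ["p", "q"]))  # True: the extension stays consistent
```

Running through the enumeration decides every formula one way or the other, mirroring how Γ∗ ends up complete.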
2. Exercise.
4. A ≡ B ∧ C : exercise.
8.7. IDENTITY 157
6. A ≡ B → C : exercise.
7. A ≡ ∀x B(x): exercise.
8.7 Identity
The construction of the term model given in the preceding sec-
tion is enough to establish completeness for first-order logic for
sets Γ that do not contain =. The term model satisfies every
A ∈ Γ ∗ which does not contain = (and hence all A ∈ Γ). It does
not work, however, if = is present. The reason is that Γ ∗ then
may contain a sentence t = t 0, but in the term model the value of
any term is that term itself. Hence, if t and t 0 are different terms,
their values in the term model—i.e., t and t 0, respectively—are
different, and so t = t 0 is false. We can fix this, however, using a
construction known as “factoring.”
1. ≈ is reflexive.
2. ≈ is symmetric.
3. ≈ is transitive.
1. |M/≈| = Trm(L)/≈.
2. c^{M/≈} = [c ]≈
3. f^{M/≈}([t1 ]≈, . . . , [tn ]≈ ) = [f (t1, . . . , tn )]≈
4. ⟨[t1 ]≈, . . . , [tn ]≈⟩ ∈ R^{M/≈} iff M ⊨ R(t1, . . . , tn )
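The factoring step can also be pictured computationally: collect the closed terms into ≈-classes, e.g., with a union-find pass over the identities asserted in Γ∗. A hedged sketch with invented names:

```python
def quotient(terms, identified_pairs):
    """Partition terms into classes, merging the pairs that are identified."""
    parent = {t: t for t in terms}
    def find(t):
        while parent[t] != t:
            t = parent[t]
        return t
    for a, b in identified_pairs:
        parent[find(a)] = find(b)
    classes = {}
    for t in terms:
        classes.setdefault(find(t), set()).add(t)
    return list(classes.values())

# if Gamma* contains a = b, the terms a and b land in one class:
print(quotient({"a", "b", "c"}, [("a", "b")]))  # [{'a', 'b'}, {'c'}] in some order
```

Each resulting class [t] plays the role of a single element of the quotient domain |M/≈|.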
Proof. Note that the Γ’s in Corollary 8.19 and Theorem 8.18 are
universally quantified. To make sure we do not confuse ourselves,
let us restate Theorem 8.18 using a different variable: for any set
of sentences ∆, if ∆ is consistent, it is satisfiable. By contraposi-
tion, if ∆ is not satisfiable, then ∆ is inconsistent. We will use this
to prove the corollary.
Suppose that Γ ⊨ A. Then Γ ∪ {¬A} is unsatisfiable by Propo-
sition 5.51. Taking Γ ∪ {¬A} as our ∆, the previous version of
Theorem 8.18 gives us that Γ ∪ {¬A} is inconsistent. By Proposi-
tion 7.13, Γ ⊢ A.
∆ = {c ≠ t : t ∈ Trm(L)}.
∆ = {A ≥n : n ≥ 1}
2. (A ∨ B) ∈ Γ iff either A ∈ Γ or B ∈ Γ.
Summary
The completeness theorem is the converse of the soundness
theorem. In one form it states that if Γ ⊨ A then Γ ⊢ A, in an-
other that if Γ is consistent then it is satisfiable. We proved the
second form (and derived the first from the second). The proof is
involved and requires a number of steps. We start with a consis-
tent set Γ. First we add infinitely many new constant symbols c i
as well as formulas of the form ∃x A(x) → A(c ) where each for-
mula A(x) with a free variable in the expanded language is paired
Problems
Problem 8.1. Complete the proof of Proposition 8.2.
Problem 8.12. Write out the complete proof of the Truth Lemma
(Lemma 8.11) in the version required for the proof of Theo-
rem 8.30.
CHAPTER 9
Beyond
First-order
Logic
9.1 Overview
First-order logic is not the only system of logic of interest: there
are many extensions and variations of first-order logic. A logic
typically consists of the formal specification of a language, usu-
ally, but not always, a deductive system, and usually, but not
always, an intended semantics. But the technical use of the term
raises an obvious question: what do logics that are not first-order
logic have to do with the word “logic,” used in the intuitive or
philosophical sense? All of the systems described below are de-
signed to model reasoning of some form or another; can we say
what makes them logical?
No easy answers are forthcoming. The word “logic” is used
in different ways and in different contexts, and the notion, like
that of “truth,” has been analyzed from numerous philosophical
stances. For example, one might take the goal of logical reason-
9.2. MANY-SORTED LOGIC 171
jects they can take as arguments. Otherwise, one keeps the usual
rules of first-order logic, with versions of the quantifier-rules re-
peated for each sort.
For example, to study international relations we might choose
a language with two sorts of objects, French citizens and German
citizens. We might have a unary relation, “drinks wine,” for ob-
jects of the first sort; another unary relation, “eats wurst,” for
objects of the second sort; and a binary relation, “forms a multi-
national married couple,” which takes two arguments, where the
first argument is of the first sort and the second argument is of
the second sort. If we use variables a, b, c to range over French
citizens and x, y, z to range over German citizens, then
∀a ∀x (MarriedTo(a, x) → (DrinksWine(a) ∨ ¬EatsWurst(x)))
asserts that if any French person is married to a German, either
the French person drinks wine or the German doesn’t eat wurst.
Many-sorted logic can be embedded in first-order logic in a
natural way, by lumping all the objects of the many-sorted do-
mains together into one first-order domain, using unary predicate
symbols to keep track of the sorts, and relativizing quantifiers.
For example, the first-order language corresponding to the exam-
ple above would have unary predicate symbolss “Ger man” and
“F r ench,” in addition to the other relations described, with the
sort requirements erased. A sorted quantifier ∀x A, where x is a
variable of the German sort, translates to
∀x (German(x) → A).
We need to add axioms that ensure that the sorts are separate—
e.g., ∀x ¬(German(x) ∧ French(x))—as well as axioms that guar-
antee that “drinks wine” only holds of objects satisfying the pred-
icate French(x), etc. With these conventions and axioms, it is
not difficult to show that many-sorted sentences translate to first-
order sentences, and many-sorted derivations translate to first-
order derivations. Also, many-sorted structures “translate” to cor-
responding first-order structures and vice-versa, so we also have
a completeness theorem for many-sorted logic.
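The translation of sorted quantifiers can be written out as a small recursive rewrite. This sketch uses our own tuple representation of formulas; the predicate names are just the ones from the example:

```python
def relativize(F):
    """Rewrite a sorted forall x:S. A as forall x (S(x) -> A); exists dually."""
    op = F[0]
    if op == "forall_sorted":               # ("forall_sorted", x, sort, body)
        _, x, sort, body = F
        return ("forall", x, ("->", ("atom", sort, (x,)), relativize(body)))
    if op == "exists_sorted":
        _, x, sort, body = F
        return ("exists", x, ("and", ("atom", sort, (x,)), relativize(body)))
    if op == "not":
        return ("not", relativize(F[1]))
    if op in ("and", "or", "->"):
        return (op, relativize(F[1]), relativize(F[2]))
    return F                                # atoms are unchanged

F = ("forall_sorted", "x", "German", ("atom", "EatsWurst", ("x",)))
print(relativize(F))
# ('forall', 'x', ('->', ('atom', 'German', ('x',)), ('atom', 'EatsWurst', ('x',))))
```

Note that existential sorted quantifiers are translated with ∧ rather than →, matching the usual relativization.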
9.3. SECOND-ORDER LOGIC 173
particular you can quantify over these sets; for example, one can
express induction for the natural numbers with a single axiom
1. ∀x ¬x′ = 0
2. ∀x ∀y (x′ = y′ → x = y)
3. ∀x (x + 0) = x
4. ∀x ∀y (x + y′) = (x + y)′
5. ∀x (x × 0) = 0
6. ∀x ∀y (x × y′) = ((x × y) + x)
7. ∀x ∀y (x < y ↔ ∃z y = (x + z′))
The negation of this sentence then defines the class of finite struc-
tures.
In addition, one can define the class of well-orderings, by
adding the following to the definition of a linear ordering:
This asserts that every non-empty set has a least element, modulo
the identification of “set” with “one-place relation”. For another
example, one can express the notion of connectedness for graphs,
by saying that there is no nontrivial separation of the vertices into
disconnected parts:
As you may have guessed, one can iterate this idea arbitrarily.
In practice, higher-order logic is often formulated in terms
of functions instead of relations. (Modulo the natural identifica-
tions, this difference is inessential.) Given some basic “sorts” A,
B, C , . . . (which we will now call “types”), we can create new ones
by stipulating
If σ and τ are finite types then so is σ → τ.
Think of types as syntactic “labels,” which classify the objects
we want in our domain; σ → τ describes those objects that are
functions which take objects of type σ to objects of type τ. For
example, we might want to have a type Ω of truth values, “true”
and “false,” and a type N of natural numbers. In that case, you
can think of objects of type N → Ω as unary relations, or sub-
sets of N; objects of type N → N are functions from natural num-
bers to natural numbers; and objects of type (N → N) → N are
“functionals,” that is, higher-type functions that take functions to
numbers.
As in the case of second-order logic, one can think of higher-
order logic as a kind of many-sorted logic, where there is a sort for
each type of object we want to consider. But it is usually clearer
just to define the syntax of higher-type logic from the ground up.
For example, we can define a set of finite types inductively, as
follows:
1. N is a finite type.
2. If σ and τ are finite types, then so is σ → τ.
3. If σ and τ are finite types, so is σ × τ.
Intuitively, N denotes the type of the natural numbers, σ → τ
denotes the type of functions from σ to τ, and σ × τ denotes the
type of pairs of objects, one from σ and one from τ. We can then
define a set of terms inductively, as follows:
1. For each type σ, there is a stock of variables x, y, z , . . . of
type σ
CHAPTER 9. BEYOND FIRST-ORDER LOGIC 180
2. 0 is a term of type N
R_st(0) = s
R_st(x + 1) = t(x, R_st(x)),
hs, t i denotes the pair whose first component is s and whose sec-
ond component is t , and p 1 (s ) and p 2 (s ) denote the first and
second elements (“projections”) of s . Finally, λx . s denotes the
function f defined by
f (x) = s
for any x of type σ; so item (6) gives us a form of comprehension,
enabling us to define functions using terms. Formulas are built
up from identity predicate statements s = t between terms of the
same type, the usual propositional connectives, and higher-type
quantification. One can then take the axioms of the system to be
the basic equations governing the terms defined above, together
with the usual rules of logic with quantifiers and identity predi-
cate.
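The recursor R described above behaves exactly like primitive recursion, and can be mimicked directly in code (a sketch; Python is untyped, so the finite types survive only in the comments):

```python
def R(s, t):
    """R_st : N -> tau, where s : tau and t maps (x, R_st(x)) to R_st(x+1)."""
    def f(x):
        return s if x == 0 else t(x - 1, f(x - 1))
    return f

add3 = R(3, lambda x, r: r + 1)             # add3(n) = 3 + n
factorial = R(1, lambda x, r: (x + 1) * r)  # factorial(n) = n!
print(add3(4))       # 7
print(factorial(5))  # 120
```

Both functions are defined purely by supplying the base case s and the step function t, which is the point of the recursor.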
9.5. INTUITIONISTIC LOGIC 181
since 3^(log₃ x) = x.
Intuitionistic logic is designed to model a kind of reasoning
where moves like the one in the first proof are disallowed. Proving
the existence of an x satisfying A(x) means that you have to give a
specific x, and a proof that it satisfies A, like in the second proof.
Proving that A or B holds requires that you can prove one or the
other.
Formally speaking, intuitionistic first-order logic is what you
get if you restrict a proof system for first-order logic in a
certain way. Similarly, there are intuitionistic versions of second-
order or higher-order logic. From the mathematical point of view,
these are just formal deductive systems, but, as already noted,
they are intended to model a kind of mathematical reasoning.
One can take this to be the kind of reasoning that is justified on
a certain philosophical view of mathematics (such as Brouwer’s
intuitionism); one can take it to be a kind of mathematical rea-
soning which is more “concrete” and satisfying (along the lines
of Bishop’s constructivism); and one can argue about whether or
not the formal description captures the informal motivation. But
whatever philosophical positions we may hold, we can study in-
tuitionistic logic as a formally presented logic; and for whatever
reasons, many mathematical logicians find it interesting to do so.
There is an informal constructive interpretation of the intuitionistic connectives, usually known as the Brouwer-Heyting-Kolmogorov interpretation. It runs as follows: a proof of A ∧ B consists of a proof of A paired with a proof of B; a proof of A ∨ B consists of either a proof of A, or a proof of B, where we have explicit information as to which is the case; a proof of A → B consists of a procedure that transforms any proof of A into a proof of B.
1. (A → ⊥) → ¬A.
2. A ∨ ¬A
3. ¬¬A → A
(A ∨ B)^N ≡ ¬¬(A^N ∨ B^N)
(A → B)^N ≡ (A^N → B^N)
(∀x A)^N ≡ ∀x A^N
(∃x A)^N ≡ ¬¬∃x A^N
2. P, p ⊮ ⊥.
3. P, p ⊩ (A ∧ B) iff P, p ⊩ A and P, p ⊩ B.
4. P, p ⊩ (A ∨ B) iff P, p ⊩ A or P, p ⊩ B.
One would like to augment logic with rules and axioms deal-
ing with modality. For example, the system S4 consists of the
ordinary axioms and rules of propositional logic, together with
the following axioms:
□(A → B) → (□A → □B)
□A → A
□A → □□A
Turing Machines
CHAPTER 10
Turing Machine Computations
10.1 Introduction
[Figure: a tape inscribed . I I I t I I I I t t t, with the machine in state q1]
with a t, move right to the fourth square, and change the state of the machine to q5.
We say that the machine halts when it encounters some state qn and symbol σ such that there is no instruction for ⟨qn, σ⟩, i.e., the transition function for input ⟨qn, σ⟩ is undefined. In other words, the machine has no instruction to carry out, and at that point, it ceases operation. Halting is sometimes represented by a specific halt state h. This will be demonstrated in more detail later on.
The beauty of Turing’s paper, “On computable numbers,”
is that he presents not only a formal definition, but also an ar-
gument that the definition captures the intuitive notion of com-
putability. From the definition, it should be clear that any func-
tion computable by a Turing machine is computable in the intu-
itive sense. Turing offers three types of argument that the con-
verse is true, i.e., that any function that we would naturally regard
as computable is computable by such a machine. They are (in
Turing’s words):
[Transition diagram: start state q0, with an arrow labeled t, I, R from q0 to q1]
Recall that the Turing machine has a read/write head and a tape with the input written on it. The instruction can be read as: if reading a blank in state q0, write a stroke, move right, and move to state q1. This is equivalent to the transition function mapping ⟨q0, t⟩ to ⟨q1, I, R⟩.
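In code, a transition function is naturally a lookup table. A minimal sketch in Python (the dictionary representation is our own choice, not part of the formal definition):

```python
# The instruction set of a Turing machine as a lookup table: keys are
# (state, symbol) pairs, values are (new state, written symbol, move)
# triples. "t" stands for the blank, "I" for the stroke.
delta = {
    ("q0", "t"): ("q1", "I", "R"),
}

# Reading a blank in state q0: write a stroke, move right, go to q1.
new_state, written, move = delta[("q0", "t")]
```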
[Transition diagram of the even machine, and successive configurations on an input of four strokes]
The machine is now in state q 0 scanning a blank. Based on the
transition diagram, we can easily see that there is no instruction
to be carried out, and thus the machine has halted. This means
that the input has been accepted.
Suppose next we start the machine with an input of three
strokes. The first few configurations are similar, as the same in-
structions are carried out, with only a small difference of the tape
input:
[Successive configurations on an input of three strokes]
The machine has now traversed past all the strokes, and is reading a blank in state q1. As shown in the diagram, there is an instruction of the form δ(q1, t) = ⟨q1, t, R⟩. Since the tape is infinitely blank to the right, the machine will continue to execute this instruction forever, staying in state q1 and moving ever further
to the right. The machine will never halt, and does not accept
the input.
It is important to note that not all machines will halt. If halt-
ing means that the machine runs out of instructions to execute,
then we can create a machine that never halts simply by ensuring
that there is an outgoing arrow for each symbol at each state.
The even machine can be modified to run forever by adding an instruction for scanning a blank at q0.
Example 10.2.
[Transition diagram omitted: the even machine with an additional outgoing arrow for the blank at each state, so that it never runs out of instructions]
[Transition diagram omitted: a machine with states q0 through q5]
3. an initial state q0 ∈ Q,
Q = {q0, q1},
Σ = {., t, I},
δ(q0, I) = ⟨q1, I, R⟩,
δ(q1, I) = ⟨q0, I, R⟩,
δ(q1, t) = ⟨q1, t, R⟩.
10.4 Configurations and Computations
3. q ∈ Q
the right of the left end marker), and the mechanism is in the designated start state q0.
⟨. _ I, 1, q0⟩
I^k1 t I^k2 t … t I^kn
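The unary encoding of a tuple can be written out directly. A Python sketch (using 'I' for the stroke and 't' for the blank, as in the text):

```python
def unary(ks):
    """Encode a tuple (k1, ..., kn) of positive integers as the tape
    inscription I^k1 t I^k2 t ... t I^kn: blocks of strokes separated
    by single blanks."""
    return "t".join("I" * k for k in ks)

print(unary((3, 2, 1)))  # IIItIItI
```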
[Transition diagrams omitted: a three-state machine with states q0, q1, q2, and a two-state machine that ends in a designated halt state h]
10.7 Combining Turing Machines
[Transition diagrams omitted: a machine with designated states h and r, and combined machines with states q0 through q4]
10.8 Variants of Turing Machines
[Transition diagram omitted: a machine with states q0 through q9]
Summary
A Turing machine is a kind of idealized computation mecha-
nism. It consists of a one-way infinite tape, divided into squares,
each of which can contain a symbol from a pre-determined al-
phabet. The machine operates by moving a read-write head
along the tape. It may also be in one of a pre-determined num-
ber of states. The actions of the read-write head are determined
by a set of instructions; each instruction is conditional on the ma-
chine being in a certain state and reading a certain symbol, and
specifies which symbol the machine will write onto the current
square, whether it will move the read-write head one square left,
right, or stay put, and which state it will switch to. If the tape
contains a certain input, represented as a sequence of symbols
on the tape, and the machine is put into the designated start state
with the read-write head reading the leftmost square of the input,
the instruction set will step-wise determine a sequence of config-
urations of the machine: content of tape, position of read-write
head, and state of the machine. Should the machine encounter
a configuration in which the instruction set does not contain an
instruction for the current symbol read/state combination, the
machine halts, and the content of the tape is the output.
Numbers can very easily be represented as sequences of strokes on the tape of a Turing machine. We say a function f: N → N is Turing computable if there is a Turing machine which, whenever it is started on the unary representation of n as input, eventually halts with its tape containing the unary representation of f(n) as output.
Problems
Problem 10.1. Choose an arbitrary input and trace through the configurations of the doubler machine in Example 10.4.
The machine should leave the input string on the tape, and out-
put either halt if the string is “alphabetical”, or loop forever if
the string is not.
CHAPTER 11
Undecidability
11.1 Introduction
It might seem obvious that not every function, even every arith-
metical function, can be computable. There are just too many,
whose behavior is too complicated: functions defined in terms of the decay of radioactive particles, for instance, or other chaotic or random behavior. Suppose we start counting 1-second intervals
from a given time, and define the function f (n) as the number
of particles in the universe that decay in the n-th 1-second inter-
val after that initial moment. This seems like a candidate for a
function we cannot ever hope to compute.
But it is one thing to not be able to imagine how one would
compute such functions, and quite another to actually prove that
they are uncomputable. In fact, even functions that seem hope-
lessly complicated may, in an abstract sense, be computable. For
instance, suppose the universe is finite in time—some day, in the
very distant future the universe will contract into a single point,
as some cosmological theories predict. Then there is only a fi-
nite (but incredibly large) number of seconds from that initial
moment for which f (n) is defined. And any function which is
defined for only finitely many inputs is computable: we could list
the outputs in one big table, or code it in one very big Turing
machine state transition diagram.
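The point about finite tables can be made concrete. A Python sketch (the table values below are invented purely for illustration):

```python
# A function defined for only finitely many inputs is computable:
# list its (finitely many) values in a table and look them up.
TABLE = {0: 17, 1: 23, 2: 5, 3: 0}   # hypothetical observed values

def f(n):
    """Return the tabulated value; f is undefined outside the table."""
    return TABLE[n]
```

Nothing about how the values were originally determined matters: once they are fixed in a finite table, looking them up is a mechanical procedure.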
assume, for instance, that the states and vocabulary symbols are
natural numbers, or that the states and vocabulary are all strings
of letters and digits.
Suppose we fix a countably infinite vocabulary for specifying Turing machines: σ0 = ., σ1 = t, σ2 = I, σ3, …, R, L, N, q0, q1, …. Then any Turing machine can be specified by some
finite string of symbols from this alphabet (though not every fi-
nite string of symbols specifies a Turing machine). For instance,
suppose we have a Turing machine M = ⟨Q, Σ, q, δ⟩ where
Q = {q0′, …, qn′} ⊆ {q0, q1, …} and
Σ = {., σ1′, σ2′, …, σm′} ⊆ {σ0, σ1, …}.
Theorem 11.1. There are functions from N to N which are not Turing
computable.
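The proof idea is a counting and diagonal argument: there are only countably many Turing machines (each has a finite description), but given any enumeration of functions we can define a function that differs from each one. A finite Python illustration of the diagonal step (an illustration of the idea only, not the proof, which applies it to the enumeration of all Turing machines):

```python
# Given an enumeration (here: a finite list) of functions N -> N,
# the diagonal function differs from the n-th function at input n,
# so it cannot occur anywhere in the enumeration.
def diagonal(functions):
    return lambda n: functions[n](n) + 1

fs = [lambda n: 0, lambda n: n, lambda n: n * n]
d = diagonal(fs)
for i, f in enumerate(fs):
    assert d(i) != f(i)   # d disagrees with each listed function
```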
3. A constant symbol 0
The numeral corresponding to 0 is the constant symbol 0 itself, and the numeral corresponding to n + 1 is the numeral corresponding to n followed by ′.
∀x ∀y ((Qqi(x′, y) ∧ Sσ(x′, y)) →
(Qqj(x, y′) ∧ Sσ′(x′, y′) ∧ A(x, y))) ∧
∀y ((Qqi(0, y) ∧ Sσ(0, y)) →
(Qqj(0, y′) ∧ Sσ′(0, y′) ∧ A(0, y)))
∃x ∃y Qh(x, y)
Proof. Exercise.
The strategy for proving these is very different. For the first
result, we have to show that a sentence of first-order logic (namely,
T (M, w) → E(M, w)) is valid. The easiest way to do this is to give
a derivation. Our proof is supposed to work for all M and w,
though, so there isn’t really a single sentence for which we have
to give a derivation, but infinitely many. So the best we can do
is to prove by induction that, whatever M and w look like, and
however many steps it takes M to halt on input w, there will be
a derivation of T (M, w) → E(M, w).
Naturally, our induction will proceed on the number of steps M takes before it reaches a halting configuration. In our inductive proof, we'll establish that for each step n of the run of M on input w, T(M, w) ⊨ C(M, w, n), where C(M, w, n) correctly describes the configuration of M run on w after n steps. Now if M halts on input w after, say, n steps, C(M, w, n) will describe a halting configuration. We'll also show that C(M, w, n) ⊨ E(M, w) whenever C(M, w, n) describes a halting configuration. So, if M halts on input w, then for some n, M will be in a halting configuration after n steps. Hence, T(M, w) ⊨ C(M, w, n) where C(M, w, n) describes a halting configuration, and since in that case C(M, w, n) ⊨ E(M, w), we get that T(M, w) ⊨ E(M, w), i.e., that T(M, w) → E(M, w) is valid.
The strategy for the converse is very different. Here we assume that T(M, w) → E(M, w) is valid and have to prove that M halts on input w. From the hypothesis we get that T(M, w) ⊨ E(M, w), i.e., E(M, w) is true in every structure in which T(M, w) is true. So we'll describe a structure M in which T(M, w) is true: its domain will be N, and the interpretation of all the Qq and Sσ will be given by the configurations of M during a run on input w. So, e.g., M ⊨ Qq(m, n) iff M, when run on input w for n steps, is in state q and scanning square m. Now since T(M, w) ⊨ E(M, w) by hypothesis, and since M ⊨ T(M, w) by construction, M ⊨ E(M, w). But M ⊨ E(M, w) iff there is some n ∈ |M| = N so that M, run on input w, is in a halting configuration after n steps.
Lemma 11.11. For each n, if M has not halted after n steps, T(M, w) ⊨ C(M, w, n).
1. δ(q, σ) = ⟨q′, σ′, R⟩
2. δ(q, σ) = ⟨q′, σ′, L⟩
3. δ(q, σ) = ⟨q′, σ′, N⟩
We now get
Qq′(m′, n′) ∧ Sσ′(m, n′) ∧
Sσ0(0, n′) ∧ ··· ∧ Sσk(k, n′) ∧
∀x (k < x → St(x, n′))
∀x ∀y ((Qq(x′, y) ∧ Sσ(x′, y)) →
(Qq′(x, y′) ∧ Sσ′(x′, y′) ∧ A(x, y))) ∧
∀y ((Qq(0, y) ∧ Sσ(0, y)) →
(Qq′(0, y′) ∧ Sσ′(0, y′) ∧ A(0, y)))
Proof. By Lemma 11.11, we know that, for any time n, the description C(M, w, n) of the configuration of M at time n is entailed by T(M, w). Suppose M halts after k steps. It will be scanning square m, say. Then C(M, w, k) describes a halting configuration of M, i.e., it contains as conjuncts both Qq(m, k) and Sσ(m, k) with δ(q, σ) undefined. Thus, by Lemma 11.10, C(M, w, k) ⊨ E(M, w). But since T(M, w) ⊨ C(M, w, k), we have T(M, w) ⊨ E(M, w) and therefore T(M, w) → E(M, w) is valid.
Summary
Turing machines are determined by their instruction sets, which
are finite sets of quintuples (for every state and symbol read, spec-
ify new state, symbol written, and movement of the head). The
finite sets of quintuples are enumerable, so there is a way of as-
sociating a number with each Turing machine instruction set.
The index of a Turing machine is the number associated with
its instruction set under a fixed such schema. In this way we can
Problems
Problem 11.1. The Three Halting (3-Halt) problem is the prob-
lem of giving a decision procedure to determine whether or not
an arbitrarily chosen Turing Machine halts for an input of three
strokes on an otherwise blank tape. Prove that the 3-Halt problem
is unsolvable.
Problem 11.6. Give a derivation of Sσi(i, n′) from Sσi(i, n) and A(m, n) (assuming i ≠ m, i.e., either i < m or m < i).

Problem 11.7. Give a derivation of ∀x (k′ < x → St(x, n′)) from ∀x (k < x → St(x, n′)), ∀x x < x′, and ∀x ∀y ∀z ((x < y ∧ y < z) → x < z).
APPENDIX A
Proofs
A.1 Introduction
Based on your experiences in introductory logic, you might be
comfortable with a proof system—probably a natural deduction
or Fitch style proof system, or perhaps a proof-tree system. You
probably remember doing proofs in these systems, either proving a formula or showing that a given argument is valid. In order to do
this, you applied the rules of the system until you got the desired
end result. In reasoning about logic, we also prove things, but
in most cases we are not using a proof system. In fact, most of
the proofs we consider are done in English (perhaps, with some
symbolic language thrown in) rather than entirely in the language
of first-order logic. When constructing such proofs, you might at
first be at a loss—how do I prove something without a proof
system? How do I start? How do I know if my proof is correct?
Before attempting a proof, it’s important to know what a proof
is and how to construct one. As implied by the name, a proof is
meant to show that something is true. You might think of this in
terms of a dialogue—someone asks you if something is true, say,
if every prime other than two is an odd number. To answer “yes”
is not enough; they might want to know why. In this case, you’d
give them a proof.
In everyday discourse, it might be enough to gesture at an
Using a Conjunction
Perhaps the simplest inference pattern is that of drawing as con-
clusion one of the conjuncts of a conjunction. In other words:
if we have assumed or already proved that p and q , then we’re
entitled to infer that p (and also that q ). This is such a basic
inference that it is often not mentioned. For instance, once we’ve
unpacked the definition of U = V we’ve established that every
element of U is an element of V and vice versa. From this we
can conclude that every element of V is an element of U (that’s
the “vice versa” part).
A.4 Inference Patterns
Proving a Conjunction
Sometimes what you’ll be asked to prove will have the form of a
conjunction; you will be asked to “prove p and q .” In this case,
you simply have to do two things: prove p, and then prove q . You
could divide your proof into two sections, and for clarity, label
them. When you’re making your first notes, you might write “(1)
Prove p” at the top of the page, and “(2) Prove q ” in the middle of
the page. (Of course, you might not be explicitly asked to prove
a conjunction but find that your proof requires that you prove a
conjunction. For instance, if you’re asked to prove that U = V
you will find that, after unpacking the definition of =, you have to
prove: every element of U is an element of V and every element
of V is an element of U ).
Conditional Proof
Many theorems you will encounter are in conditional form (i.e.,
show that if p holds, then q is also true). These cases are nice and
easy to set up—simply assume the antecedent of the conditional
(in this case, p) and prove the conclusion q from it. So if your
theorem reads, “If p then q ,” you start your proof with “assume
p” and at the end you should have proved q .
Recall that a biconditional (p iff q ) is really two conditionals
put together: if p then q , and if q then p. All you have to do, then,
is two instances of conditional proof: one for the first instance
and one for the second. Sometimes, however, it is possible to
prove an “iff” statement by chaining together a bunch of other
“iff” statements so that you start with “p” an end with “q ”—but
in that case you have to make sure that each step really is an “iff.”
Universal Claims
Using a universal claim is simple: if something is true for any-
thing, it’s true for each particular thing. So if, say, the hypothesis
of your proof is X ⊆ Y , that means (unpacking the definition
Proving a Disjunction
When what you are proving takes the form of a disjunction (i.e., it is a statement of the form "p or q"), it is enough to show that one
of the disjuncts is true. However, it basically never happens that
either disjunct just follows from the assumptions of your theorem.
More often, the assumptions of your theorem are themselves dis-
junctive, or you’re showing that all things of a certain kind have
one of two properties, but some of the things have the one and
others have the other property. This is where proof by cases is
useful.
Proof by Cases
Suppose you have a disjunction as an assumption or as an already
established conclusion—you have assumed or proved that p or q
is true. You want to prove r . You do this in two steps: first you
assume that p is true, and prove r , then you assume that q is true
and prove r again. This works because we assume or know that
one of the two alternatives holds. The two steps establish that
either one is sufficient for the truth of r . (If both are true, we
have not one but two reasons for why r is true. It is not neces-
sary to separately prove that r is true assuming both p and q .)
To indicate what we’re doing, we announce that we “distinguish
cases.” For instance, suppose we know that x ∈ Y ∪ Z . Y ∪ Z is
defined as {x : x ∈ Y or x ∈ Z }. In other words, by definition,
x ∈ Y or x ∈ Z . We would prove that x ∈ X from this by first
assuming that x ∈ Y , and proving x ∈ X from this assumption,
and then assume x ∈ Z , and again prove x ∈ X from this. You
would write "We distinguish cases" under the assumption, then "Case (1): x ∈ Y" underneath, and "Case (2): x ∈ Z" halfway down the page. Then you'd proceed to fill in the top half and the
bottom half of the page.
Proof by cases is especially useful if what you’re proving is
itself disjunctive. Here’s a simple example:
the “if” part is true, and we’ll go on to show that the “then” part
is true as well. In other words, we’ll assume that x ∈ Y or x ∈ Z
and show that x ∈ U or x ∈ V .)
Suppose that x ∈ Y or x ∈ Z . We have to show that x ∈ U or
x ∈ V . We distinguish cases.
Case 1: x ∈ Y . By (c), x ∈ U . Thus, x ∈ U or x ∈ V . (Here
we’ve made the inference discussed in the preceding subsection!)
Case 2: x ∈ Z . By (d), x ∈ V . Thus, x ∈ U or x ∈ V .
It's maybe good practice to keep bound variables like "x" separate from hypothetical names like a, as we did. In practice, however, we often don't and just use x, like so:
Can you spot where the incorrect step occurs and explain why
the result does not hold?
A.5 An Example
Our first example is the following simple fact about unions and in-
tersections of sets. It will illustrate unpacking definitions, proofs
of conjunctions, of universal claims, and proof by cases.
By definition, X ∪ (Y ∩ Z ) = (X ∪ Y ) ∩ (X ∪ Z ) iff
every element of X ∪ (Y ∩ Z ) is also an element of
(X ∪ Y ) ∩ (X ∪ Z ), and every element of (X ∪ Y ) ∩
(X ∪ Z ) is an element of X ∪ (Y ∩ Z ).
So, if z ∈ X ∪ (Y ∩ Z ) then z ∈ (X ∪ Y ) ∩ (X ∪ Z ).
Now we just want to show the other direction, that every ele-
ment of (X ∪Y ) ∩ (X ∪Z ) is an element of X ∪ (Y ∩Z ). As before,
we prove this universal claim by assuming we have an arbitrary
element of the first set and show it must be in the second set.
Let’s state what we’re about to do.
By definition of ∩, z ∈ X ∪ Y and z ∈ X ∪ Z . By
definition of ∪, z ∈ X or z ∈ Y . We distinguish
cases.
Now for the second case, z ∈ Y . Here we’ll unpack the second
∪ and do another proof-by-cases:
Ok, this was a bit weird. We didn’t actually need the assump-
tion that z ∈ Y for this case, but that’s ok.
So, if z ∈ (X ∪ Y ) ∩ (X ∪ Z ) then z ∈ X ∪ (Y ∩ Z ).
Together, we’ve showed that X ∪ (Y ∩ Z ) = (X ∪Y ) ∩
(X ∪ Z ).
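The identity just proved can be spot-checked on concrete sets. A Python sanity check (it tests instances only; the proof above covers all sets whatsoever):

```python
# Spot-check the identity X ∪ (Y ∩ Z) = (X ∪ Y) ∩ (X ∪ Z)
# on a few concrete sets.
Y, Z = {2, 3}, {3, 4}
for X in [set(), {1}, {1, 2}, {1, 2, 3}]:
    assert X | (Y & Z) == (X | Y) & (X | Z)
```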
A.6 Another Example
X ∪ (Z \ X) = Z iff X ∪ (Z \ X) ⊆ Z and Z ⊆ X ∪ (Z \ X). First we prove that X ∪ (Z \ X) ⊆ Z.
Let z ∈ X ∪ (Z \ X ). So, either z ∈ X or z ∈ (Z \ X ).
Here we’ve used the fact recorded earlier which followed from
the hypothesis of the proposition that X ⊆ Z . The first case is
complete, and we turn to the second case, z ∈ (Z \ X ). Recall
that Z \ X denotes the difference of the two sets, i.e., the set of
all elements of Z which are not elements of X. Let's state what the definition gives us. But an element of Z not in X is in
particular an element of Z .
Great, we’ve solved the first direction. Now for the second
direction. Here we prove that Z ⊆ X ∪ (Z \ X ). So we assume
that z ∈ Z and prove that z ∈ X ∪ (Z \ X ).
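As with the previous example, the identity can be spot-checked on concrete sets (a check of instances, not a substitute for the proof):

```python
# Spot-check X ∪ (Z \ X) = Z under the hypothesis X ⊆ Z.
X, Z = {1, 2}, {1, 2, 3, 4}
assert X <= Z                  # the hypothesis X ⊆ Z
assert X | (Z - X) == Z
```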
X = ∅ iff there is no x ∈ X .
Since X ⊆ Y , x ∈ Y .
Proposition A.8. X ⊆ X ∪ Y .
X ∩ (X ∪ Y ) = X
Proof. If z ∈ X ∩ (X ∪ Y ), then z ∈ X , so X ∩ (X ∪ Y ) ⊆ X .
Now suppose z ∈ X . Then also z ∈ X ∪ Y , and therefore also
z ∈ X ∩ (X ∪ Y ).
that” before we prove it, etc. Let’s unpack it. The proposition
proved is a general claim about any sets X and Y , and when the
proof mentions X or Y , these are variables for arbitrary sets. The
general claims the proof establishes is what’s required to prove
identity of sets, i.e., that every element of the left side of the
identity is an element of the right and vice versa.
“If z ∈ X ∩ (X ∪Y ), then z ∈ X , so X ∩ (X ∪Y ) ⊆ X .”
them for the next step. And when you do get it, recipro-
cate. Helping someone else along, and explaining things
will help you understand better, too.
Motivational Videos
Feel like you have no motivation to do your homework? Feeling
down? These videos might help!
• https://www.youtube.com/watch?v=ZXsQAXx_ao0
• https://www.youtube.com/watch?v=BQ4yd2W50No
• https://www.youtube.com/watch?v=StTqXEQ2l-Y
Problems
Problem A.1. Suppose you are asked to prove that X ∩ Y ≠ ∅. Unpack all the definitions occurring here, i.e., restate this in a way that does not mention "∩", "=", or "∅".
Problem A.2. Prove indirectly that X ∩ Y ⊆ X .
Problem A.3. Expand the following proof of X ∪ (X ∩ Y) = X, where you mention all the inference patterns used, why each step follows from assumptions or claims established before it, and where we have to appeal to which definitions.
Proof. If z ∈ X ∪ (X ∩Y ) then z ∈ X or z ∈ X ∩Y . If z ∈ X ∩Y ,
z ∈ X . Any z ∈ X is also ∈ X ∪ (X ∩ Y ).
APPENDIX B
Induction
B.1 Introduction
Induction is an important proof technique which is used, in dif-
ferent forms, in almost all areas of logic, theoretical computer
science, and mathematics. It is needed to prove many of the re-
sults in logic.
Induction is often contrasted with deduction, and character-
ized as the inference from the particular to the general. For in-
stance, if we observe many green emeralds, and nothing that we
would call an emerald that’s not green, we might conclude that
all emeralds are green. This is an inductive inference, in that it proceeds from many particular cases (this emerald is green, that emerald is green, etc.) to a general claim (all emeralds are green). Mathematical induction is also an inference that concludes a general claim, but it is of a very different kind than this "simple induction."
Very roughly, an inductive proof in mathematics concludes that all mathematical objects of a certain sort have a certain property. In the simplest case, the mathematical objects an inductive proof is concerned with are natural numbers. In that case an inductive proof is used to establish that all natural numbers have some property, and it does this by showing that (1) 0 has the property, and (2) whenever a number n has the property, so does n + 1.
B.2 Induction on N
In its simplest form, induction is a technique used to prove results
for all natural numbers. It uses the fact that by starting from 0 and
repeatedly adding 1 we eventually reach every natural number.
So to prove that something is true for every number, we can (1)
establish that it is true for 0 and (2) show that whenever a number
has it, the next number has it too. If we abbreviate "number n has property P" by P(n), then a proof by induction that P(n) for all n ∈ N consists of:

1. a proof that P(0) holds (the induction basis), and

2. a proof that, for any number n, if P(n) holds then so does P(n + 1) (the inductive step).
To make this crystal clear, suppose we have both (1) and (2). Then (1) tells us that P(0) is true. If we also have (2), we know in particular that if P(0) then P(0 + 1), i.e., P(1). (This follows from the general statement "for any n, if P(n) then P(n + 1)" by putting 0 for n.) So by modus ponens, we have P(1). From (2) again, now taking 1 for n, we have: if P(1) then P(2). Since we've just established P(1), by modus ponens, we have P(2). And so on. For any number k, after doing this k times, we eventually arrive at P(k).
Theorem B.1. With n dice one can throw all 5n + 1 possible values
between n and 6n.
Proof. Let P (n) be the claim: “It is possible to throw any number
between n and 6n using n dice.” To use induction, we prove:
1. The induction basis P (1), i.e., with just one die, you can
throw any number between 1 and 6.
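For small numbers of dice the theorem can also be checked by brute force. A Python check (a finite verification for small n, not a replacement for the inductive proof, which covers every n):

```python
from itertools import product

# Brute-force check for small n: with n dice the attainable totals
# are exactly n, n+1, ..., 6n, i.e., 5n + 1 values.
def attainable(n):
    return {sum(dice) for dice in product(range(1, 7), repeat=n)}

for n in range(1, 5):
    totals = attainable(n)
    assert totals == set(range(n, 6 * n + 1))
    assert len(totals) == 5 * n + 1
```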
s0 = 0
sn+1 = sn + (n + 1)
s0 = 0,
s1 = s0 + 1 = 1,
s2 = s1 + 2 = 1 + 2 = 3,
s3 = s2 + 3 = 1 + 2 + 3 = 6, etc.
in question for any number under the assumption it holds for its
predecessor.
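The recursive definition of sn translates directly into code, and we can check many instances of the closed form sn = n(n + 1)/2 (the usual formula for the sum 0 + 1 + ··· + n, which an inductive proof of this kind establishes). A Python sketch:

```python
# The recursively defined s_n: s_0 = 0, s_{n+1} = s_n + (n + 1),
# checked against the closed form n(n+1)/2 for many instances.
def s(n):
    return 0 if n == 0 else s(n - 1) + n

assert [s(n) for n in range(4)] == [0, 1, 3, 6]
assert all(s(n) == n * (n + 1) // 2 for n in range(200))
```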
There is a variant of the principle of induction in which we
don’t just assume that the claim holds for the predecessor n − 1
of n, but for all numbers smaller than n, and use this assumption
to establish the claim for n. This also gives us the claim P (k ) for
all k ∈ N. For once we have established P (0), we have thereby
established that P holds for all numbers less than 1. And if we
know that if P(l) for all l < n, then P(n), we know this in particular for n = 1. So we can conclude P(1). With this we have proved P(0), P(1), i.e., P(l) for all l < 2, and since we also have the conditional, if P(l) for all l < 2, then P(2), we can conclude P(2). With this we have proved P(0), P(1), P(2), i.e., P(l) for all l < 3, and since we also have the conditional, if P(l) for all l < 3, then P(3), we can conclude P(3), and so on.
In fact, if we can establish the general conditional “for all n,
if P (l ) for all l < n, then P (n),” we do not have to establish P (0)
anymore, since it follows from it. For remember that a general
claim like “for all l < n, P (l )” is true if there are no l < n. This
is a case of vacuous quantification: “all As are Bs” is true if there
are no As, ∀x (A(x) → B(x)) is true if no x satisfies A(x). In this
case, the formalized version would be “∀l (l < n → P (l ))”—and
that is true if there are no l < n. And if n = 0 that’s exactly the
case: no l < 0, hence “for all l < 0, P (0)” is true, whatever P is.
A proof of “if P (l ) for all l < n, then P (n)” thus automatically
establishes P (0).
This variant is useful if establishing the claim for n can’t be
made to just rely on the claim for n − 1 but may require the
assumption that it is true for one or more l < n.
1. ∅ is a parexpression.
2. If p is a parexpression, then so is (p).
3. If p and p′ ≠ ∅ are parexpressions, then so is pp′.
4. Nothing else is a parexpression.
(Note that we have not yet proved that every balanced paren-
thesis expression is a parexpression, although it is quite clear that
every parexpression is a balanced parenthesis expression.)
The key feature of inductive definitions is that if you want to
prove something about all parexpressions, the definition tells you
which cases you must consider. For instance, if you are told that
q is a parexpression, the inductive definition tells you what q can
look like: q can be ∅, it can be (p) for some other parexpression p, or it can be pp′ for two parexpressions p and p′ ≠ ∅. Because of clause (4), those are all the possibilities.
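The inductive definition doubles as a recognition procedure: to decide whether a string is a parexpression, check the clauses in turn. A Python sketch mirroring the clauses directly (is_parexpression is our name; the empty string stands for ∅):

```python
# A recognizer that follows the inductive definition: a string is a
# parexpression iff it is empty (∅), of the form (p) for a
# parexpression p, or a concatenation of two shorter, nonempty
# parexpressions. (Fine for short strings; not optimized.)
def is_parexpression(s):
    if s == "":
        return True
    if s.startswith("(") and s.endswith(")") and is_parexpression(s[1:-1]):
        return True
    # try every split into two shorter, nonempty parts
    return any(is_parexpression(s[:i]) and is_parexpression(s[i:])
               for i in range(1, len(s)))

assert is_parexpression("(())()")
assert not is_parexpression("(()")
```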
When proving claims about all of an inductively defined set,
the strong form of induction becomes particularly important. For
o1(p) = (p)
o2(q, q′) = qq′
Biographies
C.1 Georg Cantor
An early biography of Georg Cantor (gay-org kahn-tor) claimed that he was born and found on a ship that was sailing for Saint Petersburg, Russia, and that his parents were unknown. This, however, is not true, although he was indeed born in Saint Petersburg in 1845.
[Fig. C.1: Georg Cantor]

Cantor received his doctorate in mathematics at the University of Berlin in 1867. He is known for his work in set theory, and is credited with founding set theory as a distinctive research discipline. He was the first to prove that there are infinite sets of different sizes. His theories, and especially his theory
nite sets of different sizes. His theories, and especially his theory
of infinities, caused much debate among mathematicians at the
time, and his work was controversial.
Cantor’s religious beliefs and his mathematical work were in-
extricably tied; he even claimed that the theory of transfinite num-
bers had been communicated to him directly by God. In later
Glossary
compactness theorem States that every finitely satisfiable set of
sentences is satisfiable (see section 8.9).
completeness Property of a proof system; it is complete if, when-
ever Γ entails A, then there is also a derivation that es-
tablishes Γ ` A; equivalently, iff every consistent set of
sentences is satisfiable (see section 8.1).
completeness theorem States that first-order logic is complete:
every consistent set of sentences is satisfiable.
composition (g ◦ f ) The function resulting from “chaining to-
gether” f and g ; (g ◦ f )(x) = g (f (x)) (see section 3.4).
connected R is connected if for all x, y ∈ X with x ≠ y, either Rxy or Ryx (see section 2.2).
consistent A set of sentences Γ is consistent iff Γ ⊬ ⊥; otherwise it is inconsistent (see section 7.4).
covered A structure in which every element of the domain is the
value of some closed term (see section 5.9).