A Short Excursion to Cantor’s Paradise

August 18, 2012 Leave a comment

What follows is a (possibly quite imperfect, but if so it’s only my fault) translation of Joan Bagaria’s popular exposition, originally written in Catalan, Una petita excursió al paradís de Cantor, posted here with the author’s permission.

A Short Excursion to Cantor’s Paradise

Joan Bagaria

1. Introduction.

On 4 June 1925, at a meeting of the Mathematical Society of Westphalia in honor of Karl Weierstrass, David Hilbert (1862-1943) gave his famous lecture On the infinite. In it, after recalling the crucial role of Weierstrass in the elimination of the vague ideas derived from the notion of infinitessimal and the consequent creation of a solid foundation for the calculus, he reminds that the infinite still plays a key role in mathematical analysis. Indeed, the infinite already appears in the definition of real numbers as infinite convergent sequences of rational numbers, and in the very notion of the set of real numbers. For Hilbert, to complete the foundation of mathematical analysis it is necessary to clarify the notion of actual infinity. In fact, Hilbert says:

The definitive clarification of the nature of the infinite, instead of pertaining just to the sphere of specialized scientific interests, is needed for the dignity of the human intellect itself.

In mathematical analysis the infinite appears as a limit notion, that is, the potential infinite. But this is not the real infinite, the infinite that we have when we consider, for example, the totality of natural numbers, or the totality of points of a line as a given and complete object. This is what we call the actual infinite. The mathematical discipline that deals with the actual infinite is set theory, which has its origins in the theory of transfinite numbers of Georg Cantor (1845-1918). This theory is, says Hilbert:

…the most admirable flower of the mathematical intellect and in general one of the highest achievements of purely rational human activity.

Here we are going to explore some of the most fascinating features of set theory. We will start with its origins in Cantor’s work, and its axiomatic formulation of Zermelo-Fraenkel with the axiom of choice. We will see how all mathematical objects: numbers, functions, groups, topological spaces, etc. are, in fact, sets, and how all mathematical activity of theorem proving can be reduced, formally, to making deductions in first-order logic from the axioms of set theory. This way set theory constitutes the standard model of mathematics. This allows us to make metamathematics, that is, to study mathematically mathematics itself. Set theory is, primarily, a theory of the infinite sets. We will thus see how far the succession of Cantor’s infinite numbers, the alephs, can go, from the first infinite number \aleph_0, which measures the quantity of natural numbers, to the large cardinals: the inaccessible, measurable, strong, Woodin, supercompact, etc. We will show how theorems about the real numbers can be proved by playing infinite games, and how the existence or not of large cardinals determines what can or cannot be proved. We will also see how apparently paradoxical situations can take place in mathematics, for example, how a sphere can be broken in 5 pieces and, after joining them back, have two spheres equal to the previous one. We will explain how objects can be added to the mathematical universe by Cohen’s forcing method and, this way, prove that something cannot be proved. As an example, we will explain why the famous Cantor’s Continuum Hypothesis, formulated in 1878, has not been resolved yet and why we do not know how many real numbers there are.

2. The origin of set theory. Cantor.

In 1874, Cantor proved that the set of algebraic real numbers is countable, that is, bijectable with \mathbb{N}. The algebraic real numbers are the real solutions of the equations with the form:
a_{n}x^{n} + a_{n-1}x^{n-1} + a_{n-2}x^{n-2} + ... + a_{1}x + a_{0} = 0
where the coefficients a_i are integers. Given an equation like the previous one, let its index be the natural number
|a_{n}| + |a_{n-1}| + |a_{n-2}| + ... + |a_{1}| + |a_{0}| + n
It is clear that for every k there is only a finite number of equations of index k. For example there is only one equation of index 2: x = 0, which has 0 as a solution. Four equations of index 3: 2x = 0, x + 1 = 0, x - 1= 0 and x^2 = 0, which have as solutions 0, 1 and -1, etc. Therefore, we can enumerate all algebraic numbers according to the index of the equation of which they are solutions, and, for a single equation, in increasing order.

Hence, the sets of natural numbers, integers, rational and algebraic numbers, are all countable and, consequently, mutually bijectable.

On the other hand, and surprisingly, Cantor showed that the set of points in the interval [0,1] cannot be enumerated. Let us see a simple proof: suppose that [0,1] and \mathbb{N} were bijectable. Each x of [0,1] has a binary representation, that is, a function f:\mathbb{N} \longrightarrow \{0,1\} such that
x = \sum\limits_{n \in \mathbb{N}} f(n) 2^{-(n+1)}
Since each x of [0,1] has at most two different binary representations, we would have a bijection F:\mathbb{N} \longrightarrow \{0,1\}^{\mathbb{N}}. Now let f:\mathbb{N} \longrightarrow \{0,1\} such that
f(n) = 0 iff F(n)(n) = 1
Then, f \neq F(n) for every n. But this is impossible, because every function f:\mathbb{N} \longrightarrow \{0,1\} belongs to the range of F.

Cantor also proved that \mathbb{R}, \mathbb{R}^n, and even \mathbb{R}^\mathbb{N} are mutually bijectable and with [0,1]. I see it, but I don’t believe it! Cantor wrote. All these sets are, hence, nondenumerable.

This way, Cantor, for the first time in history, discovered that there were at least two kinds of infinity: the countable infinite, and the uncountable infinite.

From 1878 to 1883 Cantor published a series of works that constitute the origin of set theory. In them he introduces the notion of transfinite, or infinite, ordinal. When we count with natural numbers, what we do is assign a number to each object we are counting, and we do this in an ordered way, respecting the order of natural numbers. When we finish counting, we obtain one number, the last one, that tells us how many objects we have. But by counting we are also ordering the objects that we count according to the order of natural numbers. Therefore, the natural numbers are at the same time cardinal numbers, that is, they denote a quantity, and ordinal numbers, that is, they denote order. But if we want to count real numbers, we will have exhausted all natural numbers before counting them all. Nothing precludes, however, from introducing ordinal infinite numbers and continue counting. For this, we need an ordinal number indicating the order that is obtained when natural numbers are finished. Cantor called this number \omega. Ordinal numbers start then, according to Cantor, like this:

1, 2, 3, 4, 5, 6, 7, ......, \omega, \omega+1, \omega+2, \omega+3, ......, \omega \cdot 2, \omega \cdot 2+1, \omega \cdot 2+2, ......,
\omega \cdot 3, \omega \cdot 3+1, \omega \cdot 3+2, ......, \omega^2, \omega^2+1, ......, \omega^2 + \omega, ......, \omega^2 + \omega \cdot 2, ......, \omega^2 \cdot 2, ......,
\omega^2 \cdot 2 + \omega, ......, \omega^3, ......, \omega^\omega, ......, \omega^{\omega^\omega}, etc.

We now have available more ordinals to continue counting after the natural numbers have been exhausted. An important observation, however, must be made: in the case of infinite numbers the notion of ordinal and cardinal do not coincide. For example, \omega and \omega+1 are different ordinals, but the same cardinal corresponds to both of them: \omega and \omega+1 are both denumerable, in the sense that if we have two sets A and B such that A has \omega elements and B has \omega+1, then A and B are bijectable with \mathbb{N}. Actually, each one of the infinite ordinals we have described is countable and, therefore, the same cardinal, the cardinal of \mathbb{N}, which Cantor represented by \aleph_0, corresponds to them.

Cantor called numbers of the second class those infinite ordinal numbers obtained from \omega by means of the processes of adding one unit and passing to the limit, and such that the set of all their predecessors is a countable set. All of the infinite ordinals described above are numbers of the second class.

Cantor discovered, however, that the set of numbers of the second class is not countable and that, in fact, the cardinal that corresponds to it is the next infinite cardinal, \aleph_1. Thus we have the first two infinite cardinals: \aleph_0, which is the cardinal of \mathbb{N} and coincides with the ordinal \omega, and \aleph_1, which is the cardinal of the set of numbers of the second class.

This process can be repeated over and over again, that is, we can consider the ordinal corresponding to the set of ordinals of the second class, that Cantor represented by \omega_1, and we can define the ordinals of the third class as those obtained by the processes of adding one unit and passing to the limit, and such that the set of their predecessors is of cardinality \aleph_1, that is, bijectable with the set of numbers of the second class. To the set of ordinals of the third class corresponds the next infinite cardinal \aleph_2, etc. This way are obtained the transfinite cardinals:

\aleph_0, \aleph_1, \aleph_2, \aleph_3, ......, \aleph_\omega, \aleph_{\omega+1}, ......, \aleph_{\omega_1}, ......, \aleph_{\omega_\omega}, ......, \aleph_{\omega_{\omega_\omega}}, ......

where for each ordinal \alpha we have a cardinal \aleph_\alpha.

In 1885 Cantor suffered his first mental crisis, due mostly to the attacks of Leopold Kronecker (1823-1891) to the new theory of sets, and to his attempts to forestall the publication of Cantor’s works. To this crisis also contributed, doubtlessly, Cantor’s unsuccessful endeavor to resolve the Continuum Hypothesis, a hypothesis he had formulated as soon as 1878, and which says that every infinite set of real numbers is either enumerable or bijectable with the continuum \mathbb{R}. As a previous step, Cantor had conjectured and tried to prove that every set, and particularly the set of real numbers, can be well-ordered, that is, it can be totally ordered in such a way that every non-empty subset has a minimal element. This is known as the Well-Ordering Principle. Every well-ordered set can be counted, that is, it can be put in one-to-one correspondence with an initial segment of the ordinal numbers. If we accept this principle, the Continuum Hypothesis says that the set of ordinals less than \omega_1 is bijectable with the set of real numbers or, in other words, that the cardinality of \mathbb{R} is the least possible, that is, \aleph_1. Hilbert put the Continuum Hypothesis in the first place of his famous list of problems of 1900.

3. The paradoxes.

In 1895 Cantor published the first volume of his work Contributions to the founding of the theory of transfinite numbers, and the second in 1897, where he develops the general theory of sets and expounds all the theory of transfinite ordinals and cardinals. But the same year 1897 Cesare Burali-Forti (1861-1931) published a paradox of the theory of sets, a paradox that Cantor had also discovered. It appears at considering the set ON of all ordinals. To this set corresponds an ordinal number which should be a member of ON and at the same time greater than all the members of ON. In a letter to Dedekind in 1899, Cantor solves the paradox by making a distinction between what he calls consistent multiplicities, which are sets, and inconsistent multiplicities, like the class of all ordinals, which lead to inconsistencies or paradoxes if they are considered as a set.

That the notion of arbitrary set was problematic was revealed in 1902 by the famous paradox of Bertrand Russell (1872-1970): Let A be the set of all sets which are not member of themselves. Then, if A belongs to A, it does not belong to A, and if it does not belong to A, then A belongs to A. The conclusion is that A cannot then be a set.

It follows that the set of all ordinals, the set of all sets that are not members of themselves, or the universal set, or the set of all sets, do not exist!

Clearly, then, it was necessary to establish in a precise manner which are the criteria of formation of arbitrary sets. An axiomatization of the theory of sets was needed in order to avoid the paradoxes or contradictions derived from considering inconsistent multiplicities as sets. No one shall drive us out of the paradise which Cantor has created for us, said Hilbert in his 1925 conference.

4. The axiomatic theory of sets.

The first axiomatization of the theory of sets was due to Ernst Zermelo (1871-1953) and published in 1908. Zermelo had already proved Cantor’s Well-Ordering Principle using what is now known as the Axiom of Choice, an axiom which, actually, is equivalent to the Well-Ordering Principle.

Zermelo’s axiomatization was subsequently enhanced by A. Thoralf Skolem (1867-1963) and A. Abraham H. Fraenkel (1891-1965) and has become what is known as the theory of sets of Zermelo-Fraenkel (ZF) with the Axiom of Choice, or ZFC.

The axioms of ZFC can be seen as a description of of the universe of all pure sets. This universe forms a well-ordered and cumulative hierarchy, indexed by the ordinal numbers, starting with 0. That is, let

V_0 = \emptyset, the empty set.
V_{\alpha + 1} = {\mathcal P}(V_\alpha), the set of all subsets of V_\alpha.
V_\lambda = \bigcup \limits_{\beta < \lambda} V_\beta, the union of all V_\beta, \beta less than \lambda, if \lambda is a limit ordinal.

The universe of all sets is the union of all V_\alpha:

V = \bigcup \limits_{\alpha \in \textbf{ON}} V_\alpha.

The axioms of ZFC are the following:

(1) Extensionality. If two sets have the same elements, they are equal.
(2) Power Set. Given a set x, there is a set {\mathcal P}(x) which has as members all the subsets of x.
(3) Union. Given a set x, there is a set \bigcup x which has as members all the members of the members of x.
(4) Infinity. There is an infinite set.
(5) Axiom of Substitution. Every definable function which has a set as its domain, has also a set as its range.
(6) Regularity. Every set belongs to V.
(7) Axiom of Choice (AC). The Well-Ordering Principle, that is, every set can be well-ordered.

The axioms of ZFC are specially designed to build V. For example, the axiom of substitution serves to be able to continue in the limit \lambda stages, that is, to put together in one set all the V_\beta, \beta less than \lambda, in order to be able to apply the union axiom and this way obtain V_\lambda. In ZFC, the ordinals are identified, according to the definition of John Von Neumann (1903-1957), with the set of all their predecessors. So, 0=\emptyset, 1 = \{0\}, …, n+1=n \cup \{n\}, …, \omega=\{0,1,2,3,...\}, \alpha + 1 = \alpha \cup \{\alpha\}, etc. The cardinal numbers are those ordinals that are not bijectable with any of their elements. So, \aleph_0 = \omega, \aleph_1 = \omega_1, …, \aleph_n = \omega_n, …, \aleph_\omega = \bigcup_n \aleph_n, etc. AC states that every set A can be well-ordered and, therefore, is bijectable with a cardinal.

In axiomatic set theory, these axioms are formulated in a precise way, using the formal language of first-order logic with one non-logical symbol only, the symbol \in, which represents the membership relation. For example, in this formal language, the axiom of extensionality is written like this:

\forall x \forall y (\forall z(z \in x \leftrightarrow z \in y) \to x=y)

This formalization is particularly necessary to be able to formulate correctly the axioms of substitution, since we need to precise first what we mean by definable. What we understand is: definable by a formula of the formal language. Thus, we have an infinity of axioms of substitution, one for each formula of the formal language.

5. Set theory and the foundation of mathematics.

ZFC enables not only to develop all the theory of transfinite numbers of Cantor, but also, as a theory maximally general and abstract, based in the fundamental notion of arbitrary set, to develop all of classical mathematics. This means that all mathematical objects can be seen as sets, and that all mathematical theorems can be proved from ZFC by means of the usual logical derivation rules, which in turn are formalizable in the formal language of set theory.

The natural numbers are the finite ordinals. The binary relations in a set are identified with sets of ordered pairs of members of that set. For example, the relation of divisibility between natural numbers is identified with the set of all ordered pairs \langle m,n \rangle of natural numbers such that m divides n. The ordered pair \langle m,n \rangle is nothing more than the set \{\{m\},\{m,n\}\}. The integers are built as equivalence classes of ordered pairs of natural numbers, and the rationals as equivalence classes of ordered pairs of integers. The real numbers are just Dedekind cuts, which are non-empty sets of rationals bounded and closed under predecessors. The n-ary functions are only a special kind of n+1-ary relations where we identify the function f with the set of ordered n+1-tuples \langle x_1,...,x_n,y \rangle such that f(\langle x_1,...,x_n \rangle)=y. A topological space X is merely a set with a topology, which is a set of subsets of X closed under arbitrary unions and finite intersections, etc.

Let us take any classical theorem, for example, Bolzano’s theorem which says that if a < b are real numbers, f:[a,b] \to \mathbb{R} is continuous, and f(a)<0<f(b), then there is a c \in [a,b] such that f(c)=0. We have just observed that all mathematical objects appearing in this theorem can be seen as sets, and the basic properties of these objects can be proved formally from ZFC, for example, the continuity of real numbers. Hence, the usual proof of the theorem can be seen as a formal proof, through the application of the rules of first-order logic, starting from ZFC. Of course, to write a complete proof using the formal language is both difficult and unwise for practical reasons, since the proof would be not only extremely long but also intuitively incomprehensible. What matters here, however, is to see that this is, in principle, possible.

The fact that all classical mathematics can be formulated and developed within the ZFC axiomatic set theory makes possible metamathematics, that is, the mathematical study of mathematics itself. Questions as proving mathematically the possibility or impossibility to prove or not a mathematical statement make now full sense, if we understand by mathematical statement a formal statement of set theory and as a mathematical proof a formal proof from ZFC.

6. Undecidable statements. Gödel’s incompleteness theorems.

Let \varphi be a mathematical statement. We can ask ourselves whether \varphi is true or not. But in mathematics, true means provable. The truth of a statement is established by means of a proof from basic principles or axioms. How can we, then, know if \varphi is provable? If there exists a proof of \varphi, then we will have to find it. But, what if it does not exist? How could we know that there is no proof of \varphi? One way would be finding a proof of the negation of \varphi. This would tell us that there cannot be any proof of \varphi, unless mathematics was inconsistent! But, what if the negation of \varphi was not provable either? Then \varphi would be undecidable or independent of the axioms.

Gödel proved in 1931 his famous incompleteness theorems. The first of them says that in any consistent formal system allowing to develop basic arithmetic there are undecidable statements, statements such that neither they nor their negations are provable in the system. In particular, there are statements in the formal language of set theory that are neither provable nor refutable from the axioms of ZFC, assuming, of course, that ZFC is consistent.

Is ZFC consistent? The second Gödel’s incompleteness theorem, even more surprising than the first one, tells us that in any consistent formal system allowing to develop basic arithmetic, one undecidable statement is just the one that states the system’s own consistency. That means that if ZFC is consistent, then we cannot prove its consistency from the axioms of ZFC. The statement that asserts the consistency of ZFC, let us call it CON(ZFC), is the statement:

0 = 1 is not provable in ZFC

This is an arithmetic statement, since it says that the succession of symbols 0=1 is not the last step of any proof from ZFC, and symbols, statements and proofs alike can be easily codified by finite sequences of natural numbers. The conclusion is then that, in any axiomatic formal theory that contains ZFC and is consistent, there will always be undecidable arithmetic statements.

ZFC is nowadays accepted as the standard formal system in which mathematics is developed. This means that a mathematical statement is a theorem, it is true, if it is provable in ZFC. But what happens with undecidable statements? If all undecidable statements where of the kind of CON(ZFC), this would not worry us much, since they do not seem to affect directly the kind of mathematical problems one usually considers. But, for better or worse, this is not the case. As we will see below, there are many natural mathematical statements that are not decidable in ZFC.

Given a mathematical statement, we have, hence, three possibilities: that it is provable in ZFC, that its negation is provable in ZFC or that it is undecidable in ZFC. But, how could we manage to see that a mathematical statement is undecidable? This is possible due to the fact that ZFC is a first-order formal theory, and that Godel’s first-order logic completeness theorem tells us that to see that a statement \varphi is not provable in ZFC we only need to find a model of ZFC where \varphi is false. A model M of ZFC is an ordered pair \langle M,E_M \rangle where M is a non-empty set, E_M is a binary relation in M, and all the axioms of ZFC are true in \langle M,E_M \rangle, interpreting the elements of M as sets and E_M as the membership relation. So, if we could find models M and N of ZFC such that \varphi is true in M and false in N, this would mean that \varphi is undecidable.

Note that one consequence of Gödel’s second incompleteness theorem is that it is not possible to prove in ZFC the existence of a model of ZFC, for the existence of a model of ZFC is equivalent to the consistency of ZFC. Hence, when we talk about models of ZFC we are assuming that ZFC is consistent. The completeness theorem states, precisely, that ZFC is consistent if and only if it has a model.

One of the most surprising results of the mathematics of the 20th century is that Cantor’s Continuum Hypothesis is undecidable.

7. The Continuum Hypothesis I.

Let us recall that the Continuum Hypothesis says that every infinite set of real numbers is either enumerable or bijectable with \mathbb{R}. In ZFC, since we have the Well-Ordering principle as an axiom, the Continuum Hypothesis is equivalent to state that the cardinality of \mathbb{R} is \aleph_1.

In 1938, Gödel constructed a model of ZFC where the Continuum Hypothesis is valid. This is the so called Gödel’s Constructible Universe and is represented by the letter L. L is constructed like V, but just as to pass from V_\alpha to V_{\alpha+1} we took all the subsets of V_\alpha, now to pass from L_\alpha to L_{\alpha+1} we only take those subsets of L_\alpha that are definable in L_\alpha. To construct L we do not need to make use of the Axiom of Choice (AC), but once constructed it can be verified that AC is valid in L. This shows, then, that if ZF is consistent, then ZFC is also consistent, and that, hence, we can make use of AC without fear of introducing any contradiction in ZF. That the Continuum Hypothesis is valid in L is due to the fact that to L only belong the strictly necessary sets, and, hence, being the smallest possible model of ZFC, in it the cardinality of \mathbb{R} is the smallest possible, \aleph_1.

Since the Continuum Hypothesis is valid in L, and L is a model of ZFC, this means that it cannot be proved in ZFC the negation of the Continuum Hypothesis, for if this was possible, then the Continuum Hypothesis would be false in all models of ZFC.

After Gödel’s result, and in view of the various fruitless attempts to prove the Continuum Hypothesis in ZFC, the idea that maybe it was undecidable started to gain traction. What was necessary was to find a way to construct a model of ZFC where the Continuum Hypothesis was false. This was attained 25 years later by Paul J. Cohen (1934-2007) by means of a revolutionary method, forcing, for which the Fields Medal was awarded to him.

8. Forcing.

The forcing technique, invented by Cohen, is an extraordinarily flexible method of construction of models of ZFC. It allows to construct models with the most diverse properties and with a great control over the statements that will end up being true in the model that is being constructed, showing thus the consistency and undecidability of many mathematical statements.

Similarly to how we pass from a field K to an algebraic extension K[a], for example, from the field of rationals \mathbb{Q} to the field \mathbb{Q}[\sqrt{2}], we pass from a model M of ZFC to a forcing extension M[G]. Despite the analogy, however, the differences are enormous. The forcing method is of an extraordinary sophistication and technical complexity, in which combinatorial or topological, logical and metamathematical aspects alike intervene.

To give a brief idea, let us consider the case in which we want to obtain, starting from a model M, a forcing extension where the Continuum Hypothesis is false. We will have to add to M at least \aleph_2 new real numbers. The first problem that we meet is that, since M is a model of ZFC, we cannot prove in ZFC that there are any real numbers that are not yet in M. Where are we going to take those new reals we want to add from? To simplify the problem, let us suppose by now that we want to add only one new real, and to simplify it even more, that what we want to add is the binary representation, that is, an infinite sequence of zeroes and ones. The idea is to work with the set of finite approximations to the infinite sequence of zeroes and ones that we want to add. Ordering the finite sequences in a natural way what we have is the full binary tree, and the infinite sequence that we want to add is just an infinite branch of that tree. The question is to find a branch that does not belong to M. This is going to happen if that branch is what is called generic over M, which means that it intersects all open dense subsets of the binary tree, with the order topology, that belong to M. But how could we find a generic branch over M? The answer is that there are going to be if M is countable! But how could M be countable if it is a model of ZFC, and in ZFC we have shown that there are uncountable sets, like the set of real numbers? The answer is given by the Löwenheim–Skolem theorem, which says that for every model M of ZFC there is another one N \subseteq M which is countable and satisfies exactly the same statements. Hence, we can assume that M is countable, and take a branch b of the binary tree that is generic over M. The resulting model M[b] continues to be a model of ZFC and contains the same ordinals as M. To verify all of this is not trivial at all, and to be able to do it one needs the forcing relation, represented by \Vdash, between the approximations to the generic branch b and the formulas of the language of set theory with names for the elements of the generic extension M[b]. The Forcing Theorem says that a statement \varphi is valid in M[b] if and only if there is a finite approximation p \subseteq b such that p \Vdash \varphi. But to make the Continuum Hypothesis fail, we need to add at least \aleph_2 new generic branches over M. This is done in a similar way, but simultaneously with \aleph_2 copies of the full binary tree. There is one last difficulty, and it is that we have to make sure that when we add the new \aleph_2 branches, the cardinal \aleph_2 of M will continue to be the \aleph_2 of the generic extension. It could happen, and in fact it happens very frequently, that some cardinals of the model M are not yet cardinals of the generic extension. Fortunately, in this case it does not happen due to the topological properties of the binary tree. But to verify this, one needs once again to make us of the forcing relation \Vdash.

The Continuum Hypothesis is valid in Gödel’s model L, and is not valid in Cohen’s model. That is, it is independent of ZFC.

The forcing technique allows to add to a countable model M of ZFC practically any kind of object. Moreover, it can be iterated, thus obtaining subsequent generic extensions. This way models of ZFC with very diverse properties can be constructed, allowing to prove the independence of ZFC of a large number of mathematical statements. The host of independence results obtained by means of forcing in the seventies and the seventies revealed the limitations of ZFC to give answers to many fundamental mathematical questions. It was necessary, then, to find new axioms which, added to ZFC, would allow to resolve all these questions.

9. Large cardinals.

As we have already seen at the beginning, the universe of all sets V is defined as the union of the transfinite sequence of the V_\alpha. We have yet seen that V cannot be a set, for then Russell’s paradox appears. But if V was a set, then a cardinal, \kappa, would correspond to it, which in fact would coincide with the \kappa-th cardinal, \aleph_\kappa. Moreover, V_\kappa would be a model of ZFC. Obviously we cannot prove in ZFC that there is a \kappa with these properties, for then, since V_\kappa is a model of ZFC, we would have proved in ZFC that ZFC has a model and that, therefore, it is consistent, and we have seen that this is impossible by Gödel’s second incompleteness theorem. Why do not we, then, add to ZFC the axiom that says that there is a cardinal \kappa which is the \kappa-th cardinal and such that V_\kappa is a model of ZFC?

This axiom was propounded in 1930 by Wacław Sierpiński (1882-1969) and Alfred Tarski (1902-1983), and is the first of the large cardinal axioms. A cardinal \kappa with the mentioned properties is called inaccessible.

Other axioms of existence of large cardinals, much more powerful than the inaccessibles, subsequently appeared all along the 20th century. The most important of them, the measurable cardinals, were also discovered in 1930 by Stanisław Ulam (1909-1984) (discoverer too, on the other hand, of the ignition method of the hydrogen bomb) as a consequence of his works on extensions of the Lebesgue measure.

10. Measurable sets.

Let us recall that a set of real numbers is Borel if it can be obtained from the open sets by taking complements and countable intersections. Let us recall, also, that a set of real numbers has measure zero, or is null, when for every \epsilon > 0 there is a sequence of open intervals \{I_n\}_n such that A \subseteq \bigcup_n I_n and \sum_n \lvert I_n \rvert < \epsilon. A is Lebesgue measurable if it is almost a Borel set, that is, if it differs from a Borel set by a measure zero set. To every measurable set A corresponds a number \mu (A) \in [0,\infty], its measure, which is translation invariant and \sigma-additive, that is, the measure of a countable union of measurable and pairwise disjoint sets is the sum of the measures of those sets. Moreover, the measure of an interval is its length.

In ZFC, the existence of sets of real numbers which are not Lebesgue measurable is provable. For example, the set discovered in 1905 by Giuseppe Vitali (1875-1932): let A a maximal subset of [0,1] with the property that for any different x,y \in A, x-a is not rational. The existence of A is guaranteed by the axiom of choice. In order to see that A is not measurable, for every rational p, consider the set A_p = \{x+p : x \in A \}. All these sets are pairwise disjoint. Let B the union of all the A_p, where p is a rational in the interval [-1,1]. latex A$ cannot have measure zero, for then B would have measure zero, and this is impossible because [0,1] \subseteq B. On the other hand, A cannot have positive measure, for then B would have infinite measure, and this cannot happen since B \subseteq [-1,2].

Trivially, all Borel sets are measurable, and in 1917 Lusin proved that all continuous images of Borel sets, the so called analytic sets, are also measurable. Since if a set is measurable, then so is its complement, all complements of analytic sets, the so called coanalytic sets, are also measurable. The natural question is then if the continuous images of coanalytic sets, the so called \Sigma^1_2 sets, are also measurable. But the answer is undecidable in ZFC! In L there are \Sigma^1_2 sets that are not measurable, and with forcing one can construct models where all \Sigma^1_2 sets are measurable.

The existence of sets that are not Lebesgue measurable is due to the fact that the Lebesgue measure is translation invariant. In fact, there can be no measure which extends the Lebesgue measure, is \sigma-additive and translation invariant and which measures all the sets of real numbers. The question if there might be any measure which extends the Lebesgue measure, \sigma-additive and which measures all subsets of some set A, is known as the measure problem. Ulam proved in 1930, that if there existed a set A with such a measure, then either the cardinity of \mathbb{R} would be immense or there would exist a cardinal \kappa called measurable, which is inaccessible, and in fact much more than inaccessible, it is the \kappa-th inaccessible, etc. The existence of measurable cardinals is, therefore, not provable in ZFC, even if we add the axiom of existence of inaccessible cardinals.

11. More large cardinals.

After the independence results arising from the invention of forcing, it was attempted to see if large cardinal axioms allowed to decide some of the questions undecidable in ZFC. Soon it was seen that large cardinal axioms did not allow to say anything new about the cardinality of \mathbb{R}, and thus did not resolve the Continuum Hypothesis. However, surprisingly, Robert M. Solovay proved in 1969 that if there is a measurable cardinal, then all \Sigma^1_2 sets of reals are Lebesgue measurable. What is surprising is that measurable cardinals, so far away from the real numbers in V, have so much influence over the basic properties of real numbers.

In the seventies new large cardinal axioms were discovered, even much stronger than the axiom of the existence of measurable cardinals. Curiously, the existence of a measurable cardinal is equivalent to the existence of an elementary embedding (different from identity) of V in the subuniverse M. That is, a function

j: V \longrightarrow M

which respects every statement, i. e. if x is a set and P is a property definable in the formal language of set theory, then x has the property P in V if and only if j(x) has the property P in M. The measurable cardinal is the so called critical point of the embedding, that is the first ordinal \kappa such that j(\kappa) \neq \kappa. A famous result of Kenneth Kunen in 1971 says that there cannot be any elementary embedding of V into V. But demanding that M should be very similar to V stronger and stronger large cardinal axioms are obtained. For example, if we demand that for each ordinal \alpha there is an elementary embedding j: V \longrightarrow M with critical point \kappa and with V_\alpha \subseteq M, \kappa is a so called strong cardinal. If we demand that for each \alpha there is an elementary embedding j: V \longrightarrow M with critical point \kappa and with every sequence of length \alpha of elements of V contained in M, \kappa is a so called supercompact cardinal, etc. The existence of a supercompact entails the existence of many smaller strong cardinals, and the existence of a strong cardinal the existence of many smaller measurable cardinals.

Which effects have the existence of very large cardinals, like the supercompact, on questions like the Lebesgue measurability of sets of real numbers?

12. The Banach-Tarski Paradox.

In 1924, Stefan Banach (1892-1945) and Alfred Tarski, based on a theorem of Felix Hausdorff (1862-1943) from 1914, proved the following theorem, known as the Banach-Tarski Paradox:

There is a partition of the unit ball B of \mathbb{R}^3, in 5 pieces B_0,...,B_4, and there are isometries \rho_0,...,\rho_4:\mathbb{R}^3 \longrightarrow \mathbb{R}^3 such that

B = \rho_0[B_0] \cup \rho_1[B_1] \cup \rho_2[B_2] = \rho_3[B_3] \cup \rho_4[B_4]

In other words, it is possible to partition the unit ball in 5 pieces in such a way that after reassembling them we have two balls equal to the original (!). Actually, given any two bounded sets A and B with non-empty interior, of the three-dimensional space \mathbb{R}^3, A can be partitioned in a finite number of pieces and reassembled by rigid moves in such a way that they form B.

How is it possible that this can be done? Is this not clearly false? The reason why this can be done is that not all pieces are Lebesgue measurable! In the definition of the pieces AC is used in an essential way. The Banach-Tarski theorem is possibly the most anti-intuitive consequence of AC. Yet, this axiom continues to have a fundamental, and (almost) universally accepted by the mathematical community, role. A big part of current mathematics would be impossible without the axiom of choice. The anti-intuitive consequences of AC are due to the fact that the axiom states the existence of sets that are not explicitly definable. For example, it states that there is a well-order of \mathbb{R}, but, which is this order, is there any way to define it? AC allows us to partition B in non-measurable pieces. But how are these pieces? Can they be explicitly defined? As we are going to see, the answer is no, if supercompact cardinals exist!

13. Determinacy.

An efficient way of proving that a set of real numbers has a certain property is by playing infinite games. Given A \subseteq [0,1], we can play the following game associated to A: there are two players Adam and Eve, who play alternatively n_i \in [0,1]. Adam plays n_0, then Eve plays n_1, next Adam plays n_2, and so on. A game looks like this:

\begin{array}{c || *{12}{c}}  \bf{Adam} & n_0 & & n_2 & & n_4 & & ... & & n_{2k}& & ... & \\  \hline  \bf{Eve} & & n_1 & & n_3 & & ... & & ... & & n_{2k+1} & & ... \\  \end{array}

At the end of the game, the players have produced an infinite sequence n_0,n_1,n_2,... of zeroes and ones. Adam wins the game if

\sum \limits_{i=0}^\infty \frac{n_i}{2^{i+1}} \in A

and otherwise Eve wins.

The game is determined if one of the players has a winning strategy. That is, if there is a function f that assigns 0 or 1 to each finite sequence of zeroes and ones, in such a way that if one of the players, let us say Eve, plays according to this function, always playing f(n_0,n_1,...,n_{2k}) in the k-th turn, then she wins the game independently of what Adam plays. We say that the set A is determined if the game associated with A is determined.

In ZFC it can be proved that there are non-determined sets. But Donald Martin proved in 1975 that every Borel set is determined, and if a measurable cardinal exists, then every analytic set is determined.

14. Projective sets and descriptive set theory.

The projective sets of real numbers are those obtained from Borel sets by taking complements and continous images. The analytic, coanalytic and \sigma_2^1 sets are, hence, projective. The projective sets form a hierarchy of increasing complexity, in terms of the number of necessary steps to obtain them from the Borel sets. The study of the properties of projective sets, like their Lebesgue measurability, constitutes the core of the so called descriptive set theory. If we want to prove that the sets of real numbers have a certain property we can do it by first proving it for the Borel sets, then for the analytic, then for the conanalytic, then for the \Sigma_2^1, etc. up till proving it for all of the projective sets. Since all sets of real numbers appearing normally in mathematical practice are projective, the ability to prove that they have certain properties is of great interest. Moreover, all results about projective sets of real numbers have an immediate translation to projective sets in any polish space (separable and completely metrizable), like \mathbb{R}^n, \mathbb{C}, separable Banach spaces, etc.

In the sixties and seventies a number of results were proved showing that practically any kind of property shared by all Borel sets, is also shared by all projective sets, assuming that all projective sets are determined. Hence, the determinacy of projective sets, although not provable in ZFC, constitutes an excellent candidate as a new axiom of set theory, since it resolves practically any question about the properties of projective sets. For example, if every projective set is determined, then all projective sets are Lebesgue measurable, there is no projective well-order of the real numbers, and if they are not enumerable, then they are bijectable with \mathbb{R}.

The axiom that says that all projective sets are determined is known by the name of Projective Determinacy (PD). PD implies, then, that there is no projective counterexample to the Continuum Hypothesis, that every projective set is Lebesgue measurable, and, therefore, that the pieces in which the sphere in the Banach-Tarski theorem is partitioned cannot all of them be projective. PD is considered as an axiom at the same time natural and extremely useful to resolve questions about the projective sets. Actually, W. Hugh Woodin has proved that, in some sense, PD answers all the questions about projective sets.

One of the greatest and most surprising recent developments in set theory has been proving that PD follows from the existence of large cardinals. Donald Martin and John Steel proved in 1988 that if there are infinitely many Woodin cardinals, which are between the strong and the supercompact, then PD is valid. Furthermore, W. Hugh Woodin proved that if there are infinitely many Woodin cardinals with a measurable above them all, then not only the projective sets, but all the sets of reals which are definable from real numbers and ordinals are determined. And he also proved that the existence of infinite Woodin cardinals is a necessary hypothesis to obtain the consistency of PD. These results show that large cardinal axioms, at least the existence of infinitely many Woodin cardinals, are axioms both natural and reasonable, which decide most of the questions relative to definable set of real numbers.

Despite the enormous success of large cardinal axioms to decide questions about the definable sets of real numbers, and many other questions in other areas of mathematics, there are many fundamental questions that large cardinal axioms cannot resolve. The most important of them is the Continuum Hypothesis.

15. Forcing axioms.

Besides the Continuum Hypothesis and the measure problem, another key problem for the development of set theory has been Suslin’s Hypothesis. Cantor had proved that any linear order S which is dense, without extreme points, separable and complete is isomorphic to the real line. Mikhail Y. Suslin (1894-1919) conjectured that if instead of assuming that S is separable we assume that it is ccc, that is, that any disjoint collection of open intervals of S is countable, then S is also isomorphic to \mathbb{R}. Ronald Jensen proved in 1968 that Suslin’s Hypothesis fails in L. On the other hand, Robert M. Solovay and Stanley Tennenbaum constructed in 1971 a model with forcing where Suslin’s Hypothesis is valid, showing, thus, that it is undecidable in ZFC.

Based on the construction of the model of Solovay and Tennenbaum where Suslin’s Hypothesis is valid, Donald A. Martin isolated a new principle known as Martin’s Axiom (MA). This principle is equivalent to the following natural generalization of the Baire Category Theorem, and says that:

In every space which is compact, Hausdorff and ccc, the intersection of \aleph_1 open dense subsets is dense.

It can easily be seen that MA implies the negation of the Continuum Hypothesis. MA has been used with great success to resolve many questions undecidable in ZFC. For example, it implies Suslin’s Hypothesis and that every \Sigma_2^1 set of real numbers is Lebesgue measurable. Even so, MA has been considered as a useful principle to prove that certain statements were consistent with ZFC, but not as an axiom.

As a culmination of a number of stronger and stronger generalizations of MA introduced in the seventies and eighties, Matthew Foreman, Menachem Magidor and Saharon Shelah discovered in 1988 the strongest possible version of MA, the so called Martin’s Maximum Axiom (MM), and proved that it was consistent with ZFC assuming that the existence of a supercompact cardinal was also consistent. MM is formulated like MA but for spaces which have a property called of preservation of stationary sets. The most surprising and interesting is that they proved that MM implies that the cardinality of \mathbb{R} is \aleph_2!

The question is: to what extent MA and MM are natural axioms of set theory and not only principles useful to obtain results of consistency with ZFC? For a start, they do not seem at all evident or natural principles, in the same sense as large cardinal axioms do.

16. Generic absoluteness.

The forcing method allows to construct extensions of countable models. The extensions that are obtained depend on a partial ordering \mathbb{P}, which is the partial order of the approximations to the new, or generic, object that we want to add. But we can also consider forcing, or generic, extensions of V. Clearly, these are ideal extensions, since all sets are already in V. Given a partial order \mathbb{P}, we can define the Boolean Model V^\mathbb{P}, which represents the model obtained if there was a generic extension of V obtained by means of \mathbb{P}.

Given an infinite cardinal \kappa, let H(\kappa) the set of all sets hereditarily of cardinality less than \kappa, that is, the set of all sets X such that X, the elements of X, the elements of elements of X, etc. all of them have cardinality less than \kappa. Hence, H(\omega) is the set of all hereditarily finite sets, and, therefore, H(\omega) = V_\omega. H(\omega_1) is the set of all hereditarily countable sets. We always have H(\kappa) \subseteq V_\kappa, but not the other way round. If \kappa is inaccessible, then H(\kappa) = V_\kappa. The H(\kappa) also form a cumulative hierarchy, in the same way as the V_\kappa do, and the union of all the H(\kappa) is V.

There always is a certain degree of absoluteness between V^\mathbb{P} and V. For example, by the Levy-Shoenfield Absoluteness Theorem we have that given an existential statement talking about sets in H(\omega_1), if it could be made valid in a generic extension, then it would also be valid in V. In other words, if by forcing we could make valid a statement that says that there is a set X with certain properties which depend exclusively of sets in H(\omega_1), then such a set X already exists.

Recall that the axiom of existence of an inaccessible cardinal says that if the construction of V could be continued yet one more step beyond all ordinals, then the cardinal that would be obtained, which would be an inaccessible, exists. Generalizing in a similar fashion the Levy-Shoenfield absoluteness theorem, we would say that every existential statement talking about sets in H(\omega_2) and valid in V^\mathbb{P}, is also valid in V. That is, if by means of forcing we could make valid a statement that says that there is a set X with certain properties which depend exclusively of sets in H(\omega_2), then such a set X already exists. Let us represent this assertion by

H(\omega_2) \prec_\exists V^\mathbb{P}

This generalization would then be a new natural axiom of set theory, if it was consistent. Unfortunately, it is not, unless we restrict the kind of generic extensions.

Now, it turns out that MA is equivalent to H(\omega_2) \prec_\exists V^\mathbb{P} for every \mathbb{P} which is ccc, that is, for every \mathbb{P} in which every set of pairwise incompatible elements is countable. Therefore, MA is indeed a natural axiom of set theory. If we formulate the axiom corresponding to MM, that is, H(\omega_2) \prec_\exists V^\mathbb{P} for every \mathbb{P} with the stationary set preservation property, then we obtain the so-called Bounded Martin’s Maximum Axiom (BMM).

BMM is consistent with ZFC, assuming that the existence of a supercompact cardinal is consistent. In contrast to MM, BMM is a natural axiom of set theory, of a similar kind to large cardinal axioms, but which, unlike those, decides many questions that large cardinal axioms do not resolve. As a culmination of a number of results by W. Hugh Woodin and David Asperó, Stevo Todorčević proved, only some months ago, that BMM implies that the cardinality of \mathbb{R} is \aleph_2. This is a remarkable result, since, for the first time, a natural extension of ZFC decides the cardinality of the continuum.

17. The Continuum Hypothesis II.

To conclude, we will briefly mention, and very informally, some recent results by Woodin.

The first result says that if there is a proper class of Woodin cardinals, that is, if for every ordinal there is a higher Woodin cardinal, then the theory of H(\omega_1) is generically absolute. This means that we cannot change the theory of the real numbers with forcing, and implies, for example, that every definable set of real numbers is Lebesgue measurable.

The second result says that there is an axiom, called (\ast), which is a generalization of BMM, that practically decides every question about H(\omega_2), in such a way that the theory of H(\omega_2) is generically absolute. The (\ast) axiom is the equivalent to PD, but for H(\omega_2). The axiom (\ast) implies that the cardinality of the continuum is \aleph_2.

Finally, Woodin proved in 2000 that if there is there is a proper class of Woodin cardinals, then any axiom which decides practically every question about H(\omega_2) in such a way that the theory of H(\omega_2) is generically absolute, necessarily must imply the negation of the Continuum Hypothesis.

The notion of practically every question about H(\omega_2), to be precisely formulated, requires the introduction of a new logic, the so-called Woodin’s \Omega-logic$. The most important problem pending to be resolved about the Continuum Hypothesis is the so-called \Omega-conjecture. The formulation of this conjecture is of a highly technical character. The conclusion is if the \Omega-conjecture is true, then any set theory compatible with the existence of large cardinals and which makes the theory of H(\omega_2) generically absolute, must necessarily imply the negation of the Continuum Hypothesis.

Hence, if the \Omega-conjecture is true, then the Continuum Hypothesis is, in a very natural sense, false. The theories: ZFC plus the Continuum Hypothesis and ZFC plus its negation are not equally reasonable. Woodin’s results show that, in the presence of large cardinals, if we accept the Continuum hypothesis, then we cannot answer many questions about H(\omega_2), and, henceforth, about sets of real numbers.

18. Concluding remarks.

In this little excursion to Cantor’s paradise, we have seen that set theory has experienced through the twentieth century an extraordinary development. What started as a mathematical theory of transfinite numbers, has evolved to constitute a general theory of infinite sets and a foundation for all mathematics. The techniques developed by set theory, like forcing, infinite combinatorics, the theory of large cardinals, etc. have turned it into a discipline of high complexity and beauty, with fascinating results which stimulate and at the same time defy imagination. 125 years after its formulation by Cantor, and 103 years after Hilber put it in the first place of his list, of problems for the twentieth century, the Continuum Hypothesis will keep on being, undoubtedly, one of the most important problems of the mathematics of the twenty-first century.

The reflection about fundamental mathematical questions has generated the theory of sets as we know it today. The very fact that it has been possible to unify all classical mathematics into one single maximally general theory, is a remarkable milestone. But, beyond any of this, the richness, originality and power of the ideas generated in set theory, make them applicable to almost all the areas of mathematics. With increasing frequency, forcing, large cardinals, axioms of determinacy, descriptive set theory, infinitary combinatorics, etc. find applications in the resolution of problems that had been open for many years in areas such as Topology, Algebra, Real and Complex Analysis, Functional Analysis, Measure Theory, etc. In the twenty-first century, the ideas recently introduced by Shelah, Todorcevic and, most importantly, Woodin, among others, not only will contribute to the resolution of many mathematical problems, already known or new ones that will emerge, but also will enhance substantially, as set theory has already done through the twentieth century, our understanding of the mathematical universe.

“What is Mathematics?”

February 1, 2012 Leave a comment

It might be a little annoying that a description of mathematical activity that satisfies everyone has not yet been found. People have been doing, teaching and learning mathematics for centuries, in schools, colleges and universities throughout the world. Math is something there for all to see. So, what is being asked for? Moreover, what could be learnt from any such description, given that it is of something already known? There are several reasons for this situation.

Usually, when such a kind of question in relation to some activity (medicine, for example) is asked, there is behind it some concern with a potential or actual source of professional intrusion, that is, a concern with some distortion undermining honest professional practice. I think that this is, in part, the motivation of some of Frank Quinn’s writings (see Suggested Readings). Quinn worries that “a century of difficult adaptation to precision might be eroded by a new wave of heuristic work”. According to this author, heuristics may play their role in “mathematical science”, but they should not be used to justify results in “core mathematics”, because results so obtained cannot, and in fact must not, be considered reliable. In contrast, another author, Carlo Cellucci, defends that mathematics cannot be understood without the ‘analytic method’, which makes essential use of heuristics, or rules of discovery (see the Suggested Readings, below). So there is a demarcation problem here. “What is mathematics?” stands for “Which means are to be considered mathematical and which not? Should some open-endedness be allowed so that as new means are discovered they can be added to the collection of those already known? How would one then decide whether candidate means deserve the right to join the previously accepted ones?”, and so on.

But it is also common that what is sought with this kind of question is to put the discipline in a broader context than the most immediate to its practice, to look at it from a different perspective, so to speak, and there are a host of possible choices of perspective (historical, sociological, philosophical, educational, … and any combinations and ramifications of these). Actually, there could be so many different expectations behind a single formulation of the question that it would be impossible that one answer could satisfy all of them at once. Gian-Carlo Rota described several senses of the question “What is mathematics?” under various circumstances in a chapter of the Lezioni Napoletane, ed. by F. Palombi, La Città del Sole, Naples 1999 (an English version of this chapter, entitled What “Is” Mathematics?, can be found here. I thank Cellucci for the reference). Rota concludes that the feeling of wonder expressed by this question is the start of a philosophical journey that will eventually disclose the ‘conditions of possibility’ of mathematics. Maybe Quinn would make the point that this ought to be a scientific journey, rather than a philosophical one. It might seem strange, however, that to understand or to explain what is mathematics one needs to offer an explanation in terms of ‘conditions of possibility’ (just imagine one had to give an account of carpentry or tailoring in such terms). This approach to mathematics springs from Kant’s transcendental idealism in relation to science and the general problem of knowledge as a response to Descartes’ methodological skepticism. But there is also a modern motivation for it. In connection with Gödel’s realism (I borrow here terms from Penelope Maddy) ‘common features of [mathematics] are taken to derive their justification’ from ‘Robust Realism’ for which ‘the justification — or lack of justification — for mathematical methods is based on a metaphysical account of its subject matter’ (Maddy [2010], p. 87). By calling for metaphysical or epistemologycal accounts, one attempts to leverage the exploration of the frontiers of incompleteness, trying to answer exactly the same questions as above but now with an entirely different purpose in mind.

It is, hence, natural that different answers exist to the question “What is Mathematics?” each of them in relation with a different purpose or throwing light on a different aspect of mathematical activity.

Suggested Readings

Cellucci, Carlo

[2011a] ‘Top-down and Bottom-Up Philosophy of Mathematics’, forthcoming in Foundations of Science.

[2011b] ‘Philosophy of Mathematics: Making a Fresh Start’, paper presented at the 14th Congress of Logic and Philosophy of Science. Nancy. July 19-26, 2011.

[2011c] ‘Indiscrete Variations on Gian-Carlo Rota’s Themes’, in M. Pitici (ed.), The Best Writings on Mathematics 2010 (pp. 311-329). Princeton: Princeton University Press.

[2008a] ‘The nature of mathematical explanation’, Studies in History and Philosophy of Science, vol. 39, pp. 202-210.

[2008b] ‘Why proof? What is a proof?’, in R. Lupacchini and G. Corsi (eds.), Deduction, computation, experiment. Exploring the effectiveness of proof (pp. 1-27). Berlin: Springer.

Maddy, Penelope

[2011] Defending the Axioms. Oxford: Oxford University Press.

Frank Quinn

A Revolution in Mathematics? What Really Happened a Century Ago and Why It Matters Today, Notices of the AMS 59 (January 2011), 31-37.

What the axioms do for us (part II)

April 26, 2011 Leave a comment

In his excellent entry for Set Theory in Gowers’ The Princeton Companion to Mathematics, Joan Bagaria writes: “Set theory […] plays two very different roles at the same time: on the one hand, it is an area of mathematics devoted to the study of abstract infinite sets and their properties; on the other, it provides mathematics with its foundation”.

It is the foundational aspect of set theory that I wish to emphasize in this post. In the previous one I presented the axioms of set theory as describing a (piece of) set theoretic universe, but that was just one side of the coin.

ZFC is an axiomatic formal system based in first-order logic. Two outstanding features of this formal system are:

1. Any mathematical concept or argument is (or at least it should be) formalizable within the system.
2. The formal theorems of the theory, once interpreted, are all of them true (what they say is “the fact of the matter”).

As in the previous post we could call this one the “orthodox view”. ZFC encapsulates the valid principles of mathematical reasoning, and hence the valid principles allowing mathematicians to find mathematical truths.

To accept a collection of axioms as a foundation of mathematics, then, means to accept those mathematical principles they encapsulate, but also to reject whatever principles their acceptance forbids. There are at least two points that deserve special attention here:

1. It is the collection of axioms as a whole that must find justification through the principles it encapsulates, not just each axiom on its own. A coherent framework arises only when they are put to work altogether.
2. It is the equilibrium between what is gained and what is lost by accepting a collection of axioms what actually matters from a foundational point of view. Few mathematicians would like to see hobbled their efforts by arbitrary restrictions. In Shelah’s words (Logical Dreams) “the best framework, the best foundation, is the one that governs you least”.

Hence accepting some axioms in place of some others is clearly more a matter of convenience than a matter of faith. Particularly, when different frameworks are available which encapsulate the very same principles other criteria such as naturalness or simplicity may intervene.

From this point of view, so called extrinsic justifications for the axioms of set theory happen to make perfect sense: to be able to know which principles are available and which are not it is inescapable to explore what follows and what does not from a chosen collection of candidate axioms.

Let us see all this through some examples that can be found in Maddy, Believing the axioms I:

Fraenkel, Bar-Hillel and Levy justify the use of an extensional notion of set “because it is simpler and clearer”, while at the same time arguing that “it can simulate intensional notions when the need arises”.

Zermelo justified Foundation because it gives a useful understanding of the universe of sets. He could also have argued that it does not prevent one to simulate non-well-founded notions of set when the need arises.

In set theory what is mathematically possible or not is formalized in terms of set existence. Pair, union, replacement, separation or even choice are all very immediate examples of what kind of “manipulations” mathematicians might expect to be able perform over given collections in order to obtain new collections. So one might expect that the more sets one can prove to exist in set theory, the more of these “manipulations” will be available. This could sort of justify a rule of thumb such as maximize that, by the way, has as a consequence, as Koellner and Woodin have proved, the attainment of a “significant reduction in incompletess”, at least at certain levels of the arithmetical hierarchy (just below CH).

Determinacy or V=L are not seen as axioms of set theory because they somehow limit the space of available mathematical possibilities (Determinacy is incompatible with choice and V=L with the existence of certain large cardinals, for example measurable cardinals).

Some mathematicians do not accept set theory as a foundation of all mathematics. For example, Andrej Bauer has said “set theory is not the foundation of the mathematics I do, and neither are the mathematical objects I study sets in the sense of classical set theory”. It would be interesting to know which parts of his mathematical activity he thinks are not formalizable within ZFC and why. But this is only one way to question the “orthodox view”.

As I noted above there are two outstanding features of ZFC. Even if we accept that any mathematical concept or argument is (or at least it should be) formalizable within the system, after what we saw in the previous post, namely, that the theorems of ZFC admit different interpretations, it is possible to argue that they cannot just be regarded as true simpliciter. In other words, that it is not clear which is exactly the link between what we think is mathematically possible and set existence theorems in ZFC. The appeal to an uncompromised platonism made by Drake and many set theorists could be then regarded as a means to solve this annoying situation; begging the question by taking what the theorems say at face value.

What the axioms do for us

March 30, 2011 Leave a comment

The purpose of this post is to give a brief account of the axioms of ZFC in the light of the cumulative hierarchy of sets and present what I think are two different, although both of them legitimate, senses in which they can be understood. I will follow mostly Frank R. Drake, Set Theory: an introduction to large cardinals, for the rather sketchy account of the first one of these views. Other similar presentations can be found, for example, in Shoenfield, Mathematical Logic or in Barwise (ed.) Handbook of Mathematical Logic.

The philosophical viewpoint taken here is, as Drake himself expresses in his preface, an uncompromising realism or platonism, under which in some sense sets do exist as objects to be studied and set theory is then just as much about fixed objects as is number theory. We will see, however, that there is more to be said about this point later. Let us first take a look at the axioms and how they are motivated by the cumulative hierarchy picture. I have freely used standard abbreviations in stating the axioms. Other versions and combinations of axioms, equivalent to these, are available in the literature. The following comments apply or can easily be adapted to any of them. Although some objections have been raised to this particular way of justifying the axioms, I will not address them here. A lot more on theses issues can be found elsewhere, for a start, in P. Maddy, Believing the Axioms, in C. Parsons, What is the iterative conception of set?, or in G. Boolos, Logic, logic, logic.

The Orthodox View

There are two axioms that demand that sets have certain properties: Extensionality and Foundation.

\forall z(z \in x \leftrightarrow z \in y) \to x=y.
Extensionality asserts that two different sets cannot have the same members in our universe. It entails that the cumulated sets obtained by succesive use of the powerset operation, although generated again and again, are always the same for us. It also entails that there is only one set without members, so that urelements are also excluded from our theory.

\exists y (y \in x) \to \exists y (y \in x \land \forall z \lnot(z \in x \land z \in y)).
Foundation asserts that every non-empty set x has a member y with no members in common with x in our universe. It entails that no infinite descending chains, x \ni y \ni ..., or cycles, x \in ... \in x, are allowed.
To see why it is true in the cumulative hierarchy picture note that since sets are generated for the very first time at some level (for some ordinal) by collecting only other sets (if any) generated at previous levels, those members of x generated for the first time at the lowest level among those levels at which the members of x are generated for the first time cannot have any member in common with x. It is also seen that if there is any set with a given property, there is a first (lowest) level at which that property comes first.

The rest of the axioms can be read as saying that enough sets exist in some direction.

\exists x (x=x).
This axiom asserts the existence of some set in our universe.
It is usually regarded as a logical axiom, since normally only non-empty domains of discourse are cosidered worth studying.

For every formula \phi with free variable x for which the variable y does not ocurr, the following is an axiom:
\exists y \forall x (x \in y \leftrightarrow x\in a \land \phi(x)).
Also called (Restricted) Comprehension, Separation and Specification, Subsets is an axiom schema that asserts that, in our universe, for every set a there is a (sub)set y whose members are the members of a satisfying the property \phi. The existence of subsets proved by this schema is hence limited to the expressiveness of the language.
To see it is true for the cumulative hierarchy, note that at any level where a is formed, so are any of its subsets.

\exists y \forall x (x \in y \leftrightarrow x = a \lor x = b).
Pair asserts the existence in our universe for any two sets a and b of a set y whose only members are a and b.
This is true in the cumulative hierarchy because at any level after both a and b have been formed, their unordered pair can be formed. It also entails that there is no highest level.

Power set
\exists y \forall x (x \in y \leftrightarrow \forall z (z \in x \to z \in a)).
Power set asserts the existence in our universe for any set a, of another set whose members are all of the subsets of a.
We earlier said that the subsets of a set are formed by the same level as it is formed, so at the immediately following level we can form its power set. We only need there is no last level.

\exists y \forall x (x \in y \leftrightarrow \exists z (x \in z \land z \in a)).
Union asserts that in our universe for any given set a there is a set whose members are all the members of the members of a.
Since members of members of a must ocurr at levels before a their union will ocurr at the same level as a, and possibly at the level before.

\exists w (\emptyset \in w \land \forall x (x \in w \to \exists z (z \in w \to \forall u (u \in z \leftrightarrow u \in x \lor u = x)))).
Infinity asserts the existence in our universe of a set whose members are \emptyset and its successors.
It can be seen as saying that we can imagine a simple infinite sequence of levels having been completed.

For every formula \psi (x,y) with x and y free, without any occurrence of b, the following is an axiom:
\forall x,y,z (\psi (x,y) \land \psi (x,z) \to y = z) \to \exists b \forall y (y \in b \leftrightarrow \exists x (x \in a \land \psi (x,y))).
Replacement says that if a formula \psi defines a class function, then the image of its restriction to any set is also a set.
This is, in terms of the cumulative hierarchy, a statement about the levels having no end (as far as the expressiveness of our language allows us to go).

\forall x (x \in z \to x \neq \emptyset \land \forall y (y \in z \to x \cap y = \emptyset \lor x = y)) \to \exists u \forall x \exists v (x \in z \to u \cap x = \{v\}).
Choice says that if the members of a set z are all non-empty and pairwise disjoint, then there is a choice set that has exactly one member in common with each member of z.
This is a further attempt to say that all subsets are to be added at each level, a try to somehow overcome the limitations of Subsets.

A feature implicit all across this view is that quantifiers range over the whole universe of sets. Thus, what the axioms assert is true about the universe of the so called real sets. This universe contains every ordinal and in it we can find uncountable sets that are really uncountable, and power sets that really contain every subset of a given set. This situation has always made me feel uncomfortable for several reasons. One of them is that the notion of truth for first order languages is that of truth on a model, where a model is a given set. We expect our quantifiers to run over a fixed collection of sets, one that we can look at as a completed totality, and not an ever increasing one like the real V. So let me now examine an alternative viewpoint.

An Alternative View

Instead of thinking of our quantifiers as ranging all over the real sets, let us suppose that they range only over a part of them (the members of a given set) satisfying certain closure conditions: certain axioms must be verified on it (this is to ensure that this piece of universe we are analyzing is broad enough to satisfy our mathematical needs for the purpose at hand). This standpoint’s sensibility is guaranteed by first-order logic completeness and the assumption that the theory is consistent.

By adding new axioms to our theory of sets we may either wish to make the theory about that part of reality we want to study more accurate (projective determinacy), or we may wish to extend the range of quantification to be able to explore a broader part of that reality by significantly changing the closure conditions (large cardinals).

Looking at things this way, however, has an unpleasant drawback. By examining things a bit more closely one can see that the axioms and theorems of our theory just say nothing about the real thing.

For a trivial example, we suppose that the piece of universe our quantification ranges over is a set, so our proof of the non-existence of a universal set actually does not tell that such a set does not exist in reality, it is only telling that we have not included it under the range of our quantifiers. There’s nothing wrong in our model of set theory being a set, we simply cannot include this set under the range of quantification of our theory on pain of inconsistency (Russell’s paradox).

But this story can easily be generalized. The proof that |a| < |\mathcal{P}(a)| is based on the existence of a subset we can easily obtain for each function f from a into \mathcal{P}(a), namely \{x \in a | x \notin f(x) \}. This subset, a member of \mathcal{P}(a), cannot be the image by f of any x \in a. Hence, no function from |a| into \mathcal{P}(a) can be onto on pain of inconsistency. It turns out that if the theory is consistent it has a countable model. So, we cannot tell if really |a| < |\mathcal{P}(a)| or instead we simply have left all functions from |a| onto |\mathcal{P}(a)| out of the range of our quantifiers so that it cannot be seen that actually |a| = |\mathcal{P}(a)| (Skolem’s paradox).

Something similar happens with non-well-founded models (non-standard models): we cannot tell if sets really are well founded or if the choice we have made of the sets for the model hides the fact that the membership relation is not really well-founded.

Reality could then be as the axioms say it is, or entirely different! So we face the chance that nothing can be known about the real sets, after all; or we can accept as real all possibilities. The real sets would then look more like an inexhaustible bag of dots we can pick and organize at will.

The cumulative hierarchy, however, is still very appealing, so one would like to reconcile both views by expecting every such possibility to be realizable in V when it really is what we say it is. This hope is expressed by Dana Scott in his preface to Bell’s Boolean-valued Models and Independece Proofs.

A point of view that this is too much to expect is the Multiverse View. See, for example, Hamkins’ slides or his answer to my question in MO. From this point of view, different realities correspond to different concepts of set, and this may likely include concepts of set we have not yet imagined (maybe ‘conceived’ would be more in place here), so there would seem to be little reason that any two concepts of set can be compared together in a unique set theoretic context.

The Cumulative Hierarchy of Sets

January 12, 2011 Leave a comment

The way the von Neumann cumulative hierarchy of sets is usually presented is the following:

  1. V_0 = \emptyset
  2. (Sucessor case) V_{\alpha + 1} = {\mathcal P}(V_\alpha)
  3. (Limit case) V_\alpha = \bigcup{\{V_\beta : \beta < \alpha\}}

Then, \textbf{V} = \bigcup{\{V_\alpha : \alpha \in \textbf{ON}\}}.

Let us examine this definition in slow motion.

Informally, we think of each V_\alpha as a small universe of sets. We think of sets as containers, uniquely identified by their content (I took this idea of contents and containers from Boolos and Jeffrey, Computability and Logic).

We start from nothing. V_0 is a universe where no sets exist, not even the empty set. \lnot \exists x(x=x) is true in V_0.

So we have already built one content: the empty one. And we are willing to attach a container to this content: the empty set \emptyset, but we cannot yet, that’s the task of the powerset operation.

The powerset operation applied on V_\alpha supplies us with new content by attaching a container, i.e. a set, to each possible content made of what we have obtained so far, i. e. to each possible combination of sets living in V_\alpha. This collection of sets is, then, the content of the following universe V_{\alpha + 1}. Actually not all of this content is entirely new, but that’s why we call the hierarchy of the V_\alphas “cumulative”.

So, the empty set will be the only content of our next universe V_1. Note that V_1 knows V_0 is a set, but V_0 doesn’t. This holds in general for any V_{\alpha + 1} and V_\alpha.

The union operation supplies us with a new content by collecting every set previously obtained by iteration of the powerset operation. Note that no container is attached to any content with this operation (no new sets are formed), only an entirely new content is generated. The subsequent application of the powerset operation to this new content is what is going to do the job of attaching a set to it.

Hence, for example, V_\omega is the set of hereditarily finite sets, but this set, like \omega, and their subsets, functions and relations, and in fact any other infinite set of hereditarily finite sets, lives in V_{\omega + 1}. In turn, {\mathcal P}(\omega), like V_{\omega + 1} = {\mathcal P}(V_\omega), lives in V_{\omega + 2}.

Note that this process goes on as long as there are ordinals available. Successors mark the places where we can apply the powerset operation to generate new sets and limits the places where we can look back and see what we have done.

A beginning

January 3, 2011 Leave a comment

“We start from the null set \emptyset, from it we obtain the set \{\emptyset\}, from the two sets \emptyset and \{\emptyset\} we obtain the sets \{\emptyset,\{\emptyset\}\} and \{\{\emptyset\}\}, and so on.” Azriel Levy, Basic Set Theory.

“Infinity is hard to comprehend” Steve Harris (Iron Maiden), Infinite Dreams (Seventh son of a seventh son).

%d bloggers like this: