A Short Excursion to Cantor’s Paradise

August 18, 2012 Leave a comment

What follows is a (possibly quite imperfect, but if so it’s only my fault) translation of Joan Bagaria’s popular exposition, originally written in Catalan, Una petita excursió al paradís de Cantor, posted here with the author’s permission.

A Short Excursion to Cantor’s Paradise

Joan Bagaria

1. Introduction.

On 4 June 1925, at a meeting of the Mathematical Society of Westphalia in honor of Karl Weierstrass, David Hilbert (1862-1943) gave his famous lecture On the infinite. In it, after recalling the crucial role of Weierstrass in the elimination of the vague ideas derived from the notion of infinitessimal and the consequent creation of a solid foundation for the calculus, he reminds that the infinite still plays a key role in mathematical analysis. Indeed, the infinite already appears in the definition of real numbers as infinite convergent sequences of rational numbers, and in the very notion of the set of real numbers. For Hilbert, to complete the foundation of mathematical analysis it is necessary to clarify the notion of actual infinity. In fact, Hilbert says:

The definitive clarification of the nature of the infinite, instead of pertaining just to the sphere of specialized scientific interests, is needed for the dignity of the human intellect itself.

In mathematical analysis the infinite appears as a limit notion, that is, the potential infinite. But this is not the real infinite, the infinite that we have when we consider, for example, the totality of natural numbers, or the totality of points of a line as a given and complete object. This is what we call the actual infinite. The mathematical discipline that deals with the actual infinite is set theory, which has its origins in the theory of transfinite numbers of Georg Cantor (1845-1918). This theory is, says Hilbert:

…the most admirable flower of the mathematical intellect and in general one of the highest achievements of purely rational human activity.

Here we are going to explore some of the most fascinating features of set theory. We will start with its origins in Cantor’s work, and its axiomatic formulation of Zermelo-Fraenkel with the axiom of choice. We will see how all mathematical objects: numbers, functions, groups, topological spaces, etc. are, in fact, sets, and how all mathematical activity of theorem proving can be reduced, formally, to making deductions in first-order logic from the axioms of set theory. This way set theory constitutes the standard model of mathematics. This allows us to make metamathematics, that is, to study mathematically mathematics itself. Set theory is, primarily, a theory of the infinite sets. We will thus see how far the succession of Cantor’s infinite numbers, the alephs, can go, from the first infinite number \aleph_0, which measures the quantity of natural numbers, to the large cardinals: the inaccessible, measurable, strong, Woodin, supercompact, etc. We will show how theorems about the real numbers can be proved by playing infinite games, and how the existence or not of large cardinals determines what can or cannot be proved. We will also see how apparently paradoxical situations can take place in mathematics, for example, how a sphere can be broken in 5 pieces and, after joining them back, have two spheres equal to the previous one. We will explain how objects can be added to the mathematical universe by Cohen’s forcing method and, this way, prove that something cannot be proved. As an example, we will explain why the famous Cantor’s Continuum Hypothesis, formulated in 1878, has not been resolved yet and why we do not know how many real numbers there are.

2. The origin of set theory. Cantor.

In 1874, Cantor proved that the set of algebraic real numbers is countable, that is, bijectable with \mathbb{N}. The algebraic real numbers are the real solutions of the equations with the form:
a_{n}x^{n} + a_{n-1}x^{n-1} + a_{n-2}x^{n-2} + ... + a_{1}x + a_{0} = 0
where the coefficients a_i are integers. Given an equation like the previous one, let its index be the natural number
|a_{n}| + |a_{n-1}| + |a_{n-2}| + ... + |a_{1}| + |a_{0}| + n
It is clear that for every k there is only a finite number of equations of index k. For example there is only one equation of index 2: x = 0, which has 0 as a solution. Four equations of index 3: 2x = 0, x + 1 = 0, x - 1= 0 and x^2 = 0, which have as solutions 0, 1 and -1, etc. Therefore, we can enumerate all algebraic numbers according to the index of the equation of which they are solutions, and, for a single equation, in increasing order.

Hence, the sets of natural numbers, integers, rational and algebraic numbers, are all countable and, consequently, mutually bijectable.

On the other hand, and surprisingly, Cantor showed that the set of points in the interval [0,1] cannot be enumerated. Let us see a simple proof: suppose that [0,1] and \mathbb{N} were bijectable. Each x of [0,1] has a binary representation, that is, a function f:\mathbb{N} \longrightarrow \{0,1\} such that
x = \sum\limits_{n \in \mathbb{N}} f(n) 2^{-(n+1)}
Since each x of [0,1] has at most two different binary representations, we would have a bijection F:\mathbb{N} \longrightarrow \{0,1\}^{\mathbb{N}}. Now let f:\mathbb{N} \longrightarrow \{0,1\} such that
f(n) = 0 iff F(n)(n) = 1
Then, f \neq F(n) for every n. But this is impossible, because every function f:\mathbb{N} \longrightarrow \{0,1\} belongs to the range of F.

Cantor also proved that \mathbb{R}, \mathbb{R}^n, and even \mathbb{R}^\mathbb{N} are mutually bijectable and with [0,1]. I see it, but I don’t believe it! Cantor wrote. All these sets are, hence, nondenumerable.

This way, Cantor, for the first time in history, discovered that there were at least two kinds of infinity: the countable infinite, and the uncountable infinite.

From 1878 to 1883 Cantor published a series of works that constitute the origin of set theory. In them he introduces the notion of transfinite, or infinite, ordinal. When we count with natural numbers, what we do is assign a number to each object we are counting, and we do this in an ordered way, respecting the order of natural numbers. When we finish counting, we obtain one number, the last one, that tells us how many objects we have. But by counting we are also ordering the objects that we count according to the order of natural numbers. Therefore, the natural numbers are at the same time cardinal numbers, that is, they denote a quantity, and ordinal numbers, that is, they denote order. But if we want to count real numbers, we will have exhausted all natural numbers before counting them all. Nothing precludes, however, from introducing ordinal infinite numbers and continue counting. For this, we need an ordinal number indicating the order that is obtained when natural numbers are finished. Cantor called this number \omega. Ordinal numbers start then, according to Cantor, like this:

1, 2, 3, 4, 5, 6, 7, ......, \omega, \omega+1, \omega+2, \omega+3, ......, \omega \cdot 2, \omega \cdot 2+1, \omega \cdot 2+2, ......,
\omega \cdot 3, \omega \cdot 3+1, \omega \cdot 3+2, ......, \omega^2, \omega^2+1, ......, \omega^2 + \omega, ......, \omega^2 + \omega \cdot 2, ......, \omega^2 \cdot 2, ......,
\omega^2 \cdot 2 + \omega, ......, \omega^3, ......, \omega^\omega, ......, \omega^{\omega^\omega}, etc.

We now have available more ordinals to continue counting after the natural numbers have been exhausted. An important observation, however, must be made: in the case of infinite numbers the notion of ordinal and cardinal do not coincide. For example, \omega and \omega+1 are different ordinals, but the same cardinal corresponds to both of them: \omega and \omega+1 are both denumerable, in the sense that if we have two sets A and B such that A has \omega elements and B has \omega+1, then A and B are bijectable with \mathbb{N}. Actually, each one of the infinite ordinals we have described is countable and, therefore, the same cardinal, the cardinal of \mathbb{N}, which Cantor represented by \aleph_0, corresponds to them.

Cantor called numbers of the second class those infinite ordinal numbers obtained from \omega by means of the processes of adding one unit and passing to the limit, and such that the set of all their predecessors is a countable set. All of the infinite ordinals described above are numbers of the second class.

Cantor discovered, however, that the set of numbers of the second class is not countable and that, in fact, the cardinal that corresponds to it is the next infinite cardinal, \aleph_1. Thus we have the first two infinite cardinals: \aleph_0, which is the cardinal of \mathbb{N} and coincides with the ordinal \omega, and \aleph_1, which is the cardinal of the set of numbers of the second class.

This process can be repeated over and over again, that is, we can consider the ordinal corresponding to the set of ordinals of the second class, that Cantor represented by \omega_1, and we can define the ordinals of the third class as those obtained by the processes of adding one unit and passing to the limit, and such that the set of their predecessors is of cardinality \aleph_1, that is, bijectable with the set of numbers of the second class. To the set of ordinals of the third class corresponds the next infinite cardinal \aleph_2, etc. This way are obtained the transfinite cardinals:

\aleph_0, \aleph_1, \aleph_2, \aleph_3, ......, \aleph_\omega, \aleph_{\omega+1}, ......, \aleph_{\omega_1}, ......, \aleph_{\omega_\omega}, ......, \aleph_{\omega_{\omega_\omega}}, ......

where for each ordinal \alpha we have a cardinal \aleph_\alpha.

In 1885 Cantor suffered his first mental crisis, due mostly to the attacks of Leopold Kronecker (1823-1891) to the new theory of sets, and to his attempts to forestall the publication of Cantor’s works. To this crisis also contributed, doubtlessly, Cantor’s unsuccessful endeavor to resolve the Continuum Hypothesis, a hypothesis he had formulated as soon as 1878, and which says that every infinite set of real numbers is either enumerable or bijectable with the continuum \mathbb{R}. As a previous step, Cantor had conjectured and tried to prove that every set, and particularly the set of real numbers, can be well-ordered, that is, it can be totally ordered in such a way that every non-empty subset has a minimal element. This is known as the Well-Ordering Principle. Every well-ordered set can be counted, that is, it can be put in one-to-one correspondence with an initial segment of the ordinal numbers. If we accept this principle, the Continuum Hypothesis says that the set of ordinals less than \omega_1 is bijectable with the set of real numbers or, in other words, that the cardinality of \mathbb{R} is the least possible, that is, \aleph_1. Hilbert put the Continuum Hypothesis in the first place of his famous list of problems of 1900.

3. The paradoxes.

In 1895 Cantor published the first volume of his work Contributions to the founding of the theory of transfinite numbers, and the second in 1897, where he develops the general theory of sets and expounds all the theory of transfinite ordinals and cardinals. But the same year 1897 Cesare Burali-Forti (1861-1931) published a paradox of the theory of sets, a paradox that Cantor had also discovered. It appears at considering the set ON of all ordinals. To this set corresponds an ordinal number which should be a member of ON and at the same time greater than all the members of ON. In a letter to Dedekind in 1899, Cantor solves the paradox by making a distinction between what he calls consistent multiplicities, which are sets, and inconsistent multiplicities, like the class of all ordinals, which lead to inconsistencies or paradoxes if they are considered as a set.

That the notion of arbitrary set was problematic was revealed in 1902 by the famous paradox of Bertrand Russell (1872-1970): Let A be the set of all sets which are not member of themselves. Then, if A belongs to A, it does not belong to A, and if it does not belong to A, then A belongs to A. The conclusion is that A cannot then be a set.

It follows that the set of all ordinals, the set of all sets that are not members of themselves, or the universal set, or the set of all sets, do not exist!

Clearly, then, it was necessary to establish in a precise manner which are the criteria of formation of arbitrary sets. An axiomatization of the theory of sets was needed in order to avoid the paradoxes or contradictions derived from considering inconsistent multiplicities as sets. No one shall drive us out of the paradise which Cantor has created for us, said Hilbert in his 1925 conference.

4. The axiomatic theory of sets.

The first axiomatization of the theory of sets was due to Ernst Zermelo (1871-1953) and published in 1908. Zermelo had already proved Cantor’s Well-Ordering Principle using what is now known as the Axiom of Choice, an axiom which, actually, is equivalent to the Well-Ordering Principle.

Zermelo’s axiomatization was subsequently enhanced by A. Thoralf Skolem (1867-1963) and A. Abraham H. Fraenkel (1891-1965) and has become what is known as the theory of sets of Zermelo-Fraenkel (ZF) with the Axiom of Choice, or ZFC.

The axioms of ZFC can be seen as a description of of the universe of all pure sets. This universe forms a well-ordered and cumulative hierarchy, indexed by the ordinal numbers, starting with 0. That is, let

V_0 = \emptyset, the empty set.
V_{\alpha + 1} = {\mathcal P}(V_\alpha), the set of all subsets of V_\alpha.
V_\lambda = \bigcup \limits_{\beta < \lambda} V_\beta, the union of all V_\beta, \beta less than \lambda, if \lambda is a limit ordinal.

The universe of all sets is the union of all V_\alpha:

V = \bigcup \limits_{\alpha \in \textbf{ON}} V_\alpha.

The axioms of ZFC are the following:

(1) Extensionality. If two sets have the same elements, they are equal.
(2) Power Set. Given a set x, there is a set {\mathcal P}(x) which has as members all the subsets of x.
(3) Union. Given a set x, there is a set \bigcup x which has as members all the members of the members of x.
(4) Infinity. There is an infinite set.
(5) Axiom of Substitution. Every definable function which has a set as its domain, has also a set as its range.
(6) Regularity. Every set belongs to V.
(7) Axiom of Choice (AC). The Well-Ordering Principle, that is, every set can be well-ordered.

The axioms of ZFC are specially designed to build V. For example, the axiom of substitution serves to be able to continue in the limit \lambda stages, that is, to put together in one set all the V_\beta, \beta less than \lambda, in order to be able to apply the union axiom and this way obtain V_\lambda. In ZFC, the ordinals are identified, according to the definition of John Von Neumann (1903-1957), with the set of all their predecessors. So, 0=\emptyset, 1 = \{0\}, …, n+1=n \cup \{n\}, …, \omega=\{0,1,2,3,...\}, \alpha + 1 = \alpha \cup \{\alpha\}, etc. The cardinal numbers are those ordinals that are not bijectable with any of their elements. So, \aleph_0 = \omega, \aleph_1 = \omega_1, …, \aleph_n = \omega_n, …, \aleph_\omega = \bigcup_n \aleph_n, etc. AC states that every set A can be well-ordered and, therefore, is bijectable with a cardinal.

In axiomatic set theory, these axioms are formulated in a precise way, using the formal language of first-order logic with one non-logical symbol only, the symbol \in, which represents the membership relation. For example, in this formal language, the axiom of extensionality is written like this:

\forall x \forall y (\forall z(z \in x \leftrightarrow z \in y) \to x=y)

This formalization is particularly necessary to be able to formulate correctly the axioms of substitution, since we need to precise first what we mean by definable. What we understand is: definable by a formula of the formal language. Thus, we have an infinity of axioms of substitution, one for each formula of the formal language.

5. Set theory and the foundation of mathematics.

ZFC enables not only to develop all the theory of transfinite numbers of Cantor, but also, as a theory maximally general and abstract, based in the fundamental notion of arbitrary set, to develop all of classical mathematics. This means that all mathematical objects can be seen as sets, and that all mathematical theorems can be proved from ZFC by means of the usual logical derivation rules, which in turn are formalizable in the formal language of set theory.

The natural numbers are the finite ordinals. The binary relations in a set are identified with sets of ordered pairs of members of that set. For example, the relation of divisibility between natural numbers is identified with the set of all ordered pairs \langle m,n \rangle of natural numbers such that m divides n. The ordered pair \langle m,n \rangle is nothing more than the set \{\{m\},\{m,n\}\}. The integers are built as equivalence classes of ordered pairs of natural numbers, and the rationals as equivalence classes of ordered pairs of integers. The real numbers are just Dedekind cuts, which are non-empty sets of rationals bounded and closed under predecessors. The n-ary functions are only a special kind of n+1-ary relations where we identify the function f with the set of ordered n+1-tuples \langle x_1,...,x_n,y \rangle such that f(\langle x_1,...,x_n \rangle)=y. A topological space X is merely a set with a topology, which is a set of subsets of X closed under arbitrary unions and finite intersections, etc.

Let us take any classical theorem, for example, Bolzano’s theorem which says that if a < b are real numbers, f:[a,b] \to \mathbb{R} is continuous, and f(a)<0<f(b), then there is a c \in [a,b] such that f(c)=0. We have just observed that all mathematical objects appearing in this theorem can be seen as sets, and the basic properties of these objects can be proved formally from ZFC, for example, the continuity of real numbers. Hence, the usual proof of the theorem can be seen as a formal proof, through the application of the rules of first-order logic, starting from ZFC. Of course, to write a complete proof using the formal language is both difficult and unwise for practical reasons, since the proof would be not only extremely long but also intuitively incomprehensible. What matters here, however, is to see that this is, in principle, possible.

The fact that all classical mathematics can be formulated and developed within the ZFC axiomatic set theory makes possible metamathematics, that is, the mathematical study of mathematics itself. Questions as proving mathematically the possibility or impossibility to prove or not a mathematical statement make now full sense, if we understand by mathematical statement a formal statement of set theory and as a mathematical proof a formal proof from ZFC.

6. Undecidable statements. Gödel’s incompleteness theorems.

Let \varphi be a mathematical statement. We can ask ourselves whether \varphi is true or not. But in mathematics, true means provable. The truth of a statement is established by means of a proof from basic principles or axioms. How can we, then, know if \varphi is provable? If there exists a proof of \varphi, then we will have to find it. But, what if it does not exist? How could we know that there is no proof of \varphi? One way would be finding a proof of the negation of \varphi. This would tell us that there cannot be any proof of \varphi, unless mathematics was inconsistent! But, what if the negation of \varphi was not provable either? Then \varphi would be undecidable or independent of the axioms.

Gödel proved in 1931 his famous incompleteness theorems. The first of them says that in any consistent formal system allowing to develop basic arithmetic there are undecidable statements, statements such that neither they nor their negations are provable in the system. In particular, there are statements in the formal language of set theory that are neither provable nor refutable from the axioms of ZFC, assuming, of course, that ZFC is consistent.

Is ZFC consistent? The second Gödel’s incompleteness theorem, even more surprising than the first one, tells us that in any consistent formal system allowing to develop basic arithmetic, one undecidable statement is just the one that states the system’s own consistency. That means that if ZFC is consistent, then we cannot prove its consistency from the axioms of ZFC. The statement that asserts the consistency of ZFC, let us call it CON(ZFC), is the statement:

0 = 1 is not provable in ZFC

This is an arithmetic statement, since it says that the succession of symbols 0=1 is not the last step of any proof from ZFC, and symbols, statements and proofs alike can be easily codified by finite sequences of natural numbers. The conclusion is then that, in any axiomatic formal theory that contains ZFC and is consistent, there will always be undecidable arithmetic statements.

ZFC is nowadays accepted as the standard formal system in which mathematics is developed. This means that a mathematical statement is a theorem, it is true, if it is provable in ZFC. But what happens with undecidable statements? If all undecidable statements where of the kind of CON(ZFC), this would not worry us much, since they do not seem to affect directly the kind of mathematical problems one usually considers. But, for better or worse, this is not the case. As we will see below, there are many natural mathematical statements that are not decidable in ZFC.

Given a mathematical statement, we have, hence, three possibilities: that it is provable in ZFC, that its negation is provable in ZFC or that it is undecidable in ZFC. But, how could we manage to see that a mathematical statement is undecidable? This is possible due to the fact that ZFC is a first-order formal theory, and that Godel’s first-order logic completeness theorem tells us that to see that a statement \varphi is not provable in ZFC we only need to find a model of ZFC where \varphi is false. A model M of ZFC is an ordered pair \langle M,E_M \rangle where M is a non-empty set, E_M is a binary relation in M, and all the axioms of ZFC are true in \langle M,E_M \rangle, interpreting the elements of M as sets and E_M as the membership relation. So, if we could find models M and N of ZFC such that \varphi is true in M and false in N, this would mean that \varphi is undecidable.

Note that one consequence of Gödel’s second incompleteness theorem is that it is not possible to prove in ZFC the existence of a model of ZFC, for the existence of a model of ZFC is equivalent to the consistency of ZFC. Hence, when we talk about models of ZFC we are assuming that ZFC is consistent. The completeness theorem states, precisely, that ZFC is consistent if and only if it has a model.

One of the most surprising results of the mathematics of the 20th century is that Cantor’s Continuum Hypothesis is undecidable.

7. The Continuum Hypothesis I.

Let us recall that the Continuum Hypothesis says that every infinite set of real numbers is either enumerable or bijectable with \mathbb{R}. In ZFC, since we have the Well-Ordering principle as an axiom, the Continuum Hypothesis is equivalent to state that the cardinality of \mathbb{R} is \aleph_1.

In 1938, Gödel constructed a model of ZFC where the Continuum Hypothesis is valid. This is the so called Gödel’s Constructible Universe and is represented by the letter L. L is constructed like V, but just as to pass from V_\alpha to V_{\alpha+1} we took all the subsets of V_\alpha, now to pass from L_\alpha to L_{\alpha+1} we only take those subsets of L_\alpha that are definable in L_\alpha. To construct L we do not need to make use of the Axiom of Choice (AC), but once constructed it can be verified that AC is valid in L. This shows, then, that if ZF is consistent, then ZFC is also consistent, and that, hence, we can make use of AC without fear of introducing any contradiction in ZF. That the Continuum Hypothesis is valid in L is due to the fact that to L only belong the strictly necessary sets, and, hence, being the smallest possible model of ZFC, in it the cardinality of \mathbb{R} is the smallest possible, \aleph_1.

Since the Continuum Hypothesis is valid in L, and L is a model of ZFC, this means that it cannot be proved in ZFC the negation of the Continuum Hypothesis, for if this was possible, then the Continuum Hypothesis would be false in all models of ZFC.

After Gödel’s result, and in view of the various fruitless attempts to prove the Continuum Hypothesis in ZFC, the idea that maybe it was undecidable started to gain traction. What was necessary was to find a way to construct a model of ZFC where the Continuum Hypothesis was false. This was attained 25 years later by Paul J. Cohen (1934-2007) by means of a revolutionary method, forcing, for which the Fields Medal was awarded to him.

8. Forcing.

The forcing technique, invented by Cohen, is an extraordinarily flexible method of construction of models of ZFC. It allows to construct models with the most diverse properties and with a great control over the statements that will end up being true in the model that is being constructed, showing thus the consistency and undecidability of many mathematical statements.

Similarly to how we pass from a field K to an algebraic extension K[a], for example, from the field of rationals \mathbb{Q} to the field \mathbb{Q}[\sqrt{2}], we pass from a model M of ZFC to a forcing extension M[G]. Despite the analogy, however, the differences are enormous. The forcing method is of an extraordinary sophistication and technical complexity, in which combinatorial or topological, logical and metamathematical aspects alike intervene.

To give a brief idea, let us consider the case in which we want to obtain, starting from a model M, a forcing extension where the Continuum Hypothesis is false. We will have to add to M at least \aleph_2 new real numbers. The first problem that we meet is that, since M is a model of ZFC, we cannot prove in ZFC that there are any real numbers that are not yet in M. Where are we going to take those new reals we want to add from? To simplify the problem, let us suppose by now that we want to add only one new real, and to simplify it even more, that what we want to add is the binary representation, that is, an infinite sequence of zeroes and ones. The idea is to work with the set of finite approximations to the infinite sequence of zeroes and ones that we want to add. Ordering the finite sequences in a natural way what we have is the full binary tree, and the infinite sequence that we want to add is just an infinite branch of that tree. The question is to find a branch that does not belong to M. This is going to happen if that branch is what is called generic over M, which means that it intersects all open dense subsets of the binary tree, with the order topology, that belong to M. But how could we find a generic branch over M? The answer is that there are going to be if M is countable! But how could M be countable if it is a model of ZFC, and in ZFC we have shown that there are uncountable sets, like the set of real numbers? The answer is given by the Löwenheim–Skolem theorem, which says that for every model M of ZFC there is another one N \subseteq M which is countable and satisfies exactly the same statements. Hence, we can assume that M is countable, and take a branch b of the binary tree that is generic over M. The resulting model M[b] continues to be a model of ZFC and contains the same ordinals as M. To verify all of this is not trivial at all, and to be able to do it one needs the forcing relation, represented by \Vdash, between the approximations to the generic branch b and the formulas of the language of set theory with names for the elements of the generic extension M[b]. The Forcing Theorem says that a statement \varphi is valid in M[b] if and only if there is a finite approximation p \subseteq b such that p \Vdash \varphi. But to make the Continuum Hypothesis fail, we need to add at least \aleph_2 new generic branches over M. This is done in a similar way, but simultaneously with \aleph_2 copies of the full binary tree. There is one last difficulty, and it is that we have to make sure that when we add the new \aleph_2 branches, the cardinal \aleph_2 of M will continue to be the \aleph_2 of the generic extension. It could happen, and in fact it happens very frequently, that some cardinals of the model M are not yet cardinals of the generic extension. Fortunately, in this case it does not happen due to the topological properties of the binary tree. But to verify this, one needs once again to make us of the forcing relation \Vdash.

The Continuum Hypothesis is valid in Gödel’s model L, and is not valid in Cohen’s model. That is, it is independent of ZFC.

The forcing technique allows to add to a countable model M of ZFC practically any kind of object. Moreover, it can be iterated, thus obtaining subsequent generic extensions. This way models of ZFC with very diverse properties can be constructed, allowing to prove the independence of ZFC of a large number of mathematical statements. The host of independence results obtained by means of forcing in the seventies and the seventies revealed the limitations of ZFC to give answers to many fundamental mathematical questions. It was necessary, then, to find new axioms which, added to ZFC, would allow to resolve all these questions.

9. Large cardinals.

As we have already seen at the beginning, the universe of all sets V is defined as the union of the transfinite sequence of the V_\alpha. We have yet seen that V cannot be a set, for then Russell’s paradox appears. But if V was a set, then a cardinal, \kappa, would correspond to it, which in fact would coincide with the \kappa-th cardinal, \aleph_\kappa. Moreover, V_\kappa would be a model of ZFC. Obviously we cannot prove in ZFC that there is a \kappa with these properties, for then, since V_\kappa is a model of ZFC, we would have proved in ZFC that ZFC has a model and that, therefore, it is consistent, and we have seen that this is impossible by Gödel’s second incompleteness theorem. Why do not we, then, add to ZFC the axiom that says that there is a cardinal \kappa which is the \kappa-th cardinal and such that V_\kappa is a model of ZFC?

This axiom was propounded in 1930 by Wacław Sierpiński (1882-1969) and Alfred Tarski (1902-1983), and is the first of the large cardinal axioms. A cardinal \kappa with the mentioned properties is called inaccessible.

Other axioms of existence of large cardinals, much more powerful than the inaccessibles, subsequently appeared all along the 20th century. The most important of them, the measurable cardinals, were also discovered in 1930 by Stanisław Ulam (1909-1984) (discoverer too, on the other hand, of the ignition method of the hydrogen bomb) as a consequence of his works on extensions of the Lebesgue measure.

10. Measurable sets.

Let us recall that a set of real numbers is Borel if it can be obtained from the open sets by taking complements and countable intersections. Let us recall, also, that a set of real numbers has measure zero, or is null, when for every \epsilon > 0 there is a sequence of open intervals \{I_n\}_n such that A \subseteq \bigcup_n I_n and \sum_n \lvert I_n \rvert < \epsilon. A is Lebesgue measurable if it is almost a Borel set, that is, if it differs from a Borel set by a measure zero set. To every measurable set A corresponds a number \mu (A) \in [0,\infty], its measure, which is translation invariant and \sigma-additive, that is, the measure of a countable union of measurable and pairwise disjoint sets is the sum of the measures of those sets. Moreover, the measure of an interval is its length.

In ZFC, the existence of sets of real numbers which are not Lebesgue measurable is provable. For example, the set discovered in 1905 by Giuseppe Vitali (1875-1932): let A a maximal subset of [0,1] with the property that for any different x,y \in A, x-a is not rational. The existence of A is guaranteed by the axiom of choice. In order to see that A is not measurable, for every rational p, consider the set A_p = \{x+p : x \in A \}. All these sets are pairwise disjoint. Let B the union of all the A_p, where p is a rational in the interval [-1,1]. latex A$ cannot have measure zero, for then B would have measure zero, and this is impossible because [0,1] \subseteq B. On the other hand, A cannot have positive measure, for then B would have infinite measure, and this cannot happen since B \subseteq [-1,2].

Trivially, all Borel sets are measurable, and in 1917 Lusin proved that all continuous images of Borel sets, the so called analytic sets, are also measurable. Since if a set is measurable, then so is its complement, all complements of analytic sets, the so called coanalytic sets, are also measurable. The natural question is then if the continuous images of coanalytic sets, the so called \Sigma^1_2 sets, are also measurable. But the answer is undecidable in ZFC! In L there are \Sigma^1_2 sets that are not measurable, and with forcing one can construct models where all \Sigma^1_2 sets are measurable.

The existence of sets that are not Lebesgue measurable is due to the fact that the Lebesgue measure is translation invariant. In fact, there can be no measure which extends the Lebesgue measure, is \sigma-additive and translation invariant and which measures all the sets of real numbers. The question if there might be any measure which extends the Lebesgue measure, \sigma-additive and which measures all subsets of some set A, is known as the measure problem. Ulam proved in 1930, that if there existed a set A with such a measure, then either the cardinity of \mathbb{R} would be immense or there would exist a cardinal \kappa called measurable, which is inaccessible, and in fact much more than inaccessible, it is the \kappa-th inaccessible, etc. The existence of measurable cardinals is, therefore, not provable in ZFC, even if we add the axiom of existence of inaccessible cardinals.

11. More large cardinals.

After the independence results arising from the invention of forcing, it was attempted to see if large cardinal axioms allowed to decide some of the questions undecidable in ZFC. Soon it was seen that large cardinal axioms did not allow to say anything new about the cardinality of \mathbb{R}, and thus did not resolve the Continuum Hypothesis. However, surprisingly, Robert M. Solovay proved in 1969 that if there is a measurable cardinal, then all \Sigma^1_2 sets of reals are Lebesgue measurable. What is surprising is that measurable cardinals, so far away from the real numbers in V, have so much influence over the basic properties of real numbers.

In the seventies new large cardinal axioms were discovered, even much stronger than the axiom of the existence of measurable cardinals. Curiously, the existence of a measurable cardinal is equivalent to the existence of an elementary embedding (different from identity) of V in the subuniverse M. That is, a function

j: V \longrightarrow M

which respects every statement, i. e. if x is a set and P is a property definable in the formal language of set theory, then x has the property P in V if and only if j(x) has the property P in M. The measurable cardinal is the so called critical point of the embedding, that is the first ordinal \kappa such that j(\kappa) \neq \kappa. A famous result of Kenneth Kunen in 1971 says that there cannot be any elementary embedding of V into V. But demanding that M should be very similar to V stronger and stronger large cardinal axioms are obtained. For example, if we demand that for each ordinal \alpha there is an elementary embedding j: V \longrightarrow M with critical point \kappa and with V_\alpha \subseteq M, \kappa is a so called strong cardinal. If we demand that for each \alpha there is an elementary embedding j: V \longrightarrow M with critical point \kappa and with every sequence of length \alpha of elements of V contained in M, \kappa is a so called supercompact cardinal, etc. The existence of a supercompact entails the existence of many smaller strong cardinals, and the existence of a strong cardinal the existence of many smaller measurable cardinals.

Which effects have the existence of very large cardinals, like the supercompact, on questions like the Lebesgue measurability of sets of real numbers?

12. The Banach-Tarski Paradox.

In 1924, Stefan Banach (1892-1945) and Alfred Tarski, based on a theorem of Felix Hausdorff (1862-1943) from 1914, proved the following theorem, known as the Banach-Tarski Paradox:

There is a partition of the unit ball B of \mathbb{R}^3, in 5 pieces B_0,...,B_4, and there are isometries \rho_0,...,\rho_4:\mathbb{R}^3 \longrightarrow \mathbb{R}^3 such that

B = \rho_0[B_0] \cup \rho_1[B_1] \cup \rho_2[B_2] = \rho_3[B_3] \cup \rho_4[B_4]

In other words, it is possible to partition the unit ball in 5 pieces in such a way that after reassembling them we have two balls equal to the original (!). Actually, given any two bounded sets A and B with non-empty interior, of the three-dimensional space \mathbb{R}^3, A can be partitioned in a finite number of pieces and reassembled by rigid moves in such a way that they form B.

How is it possible that this can be done? Is this not clearly false? The reason why this can be done is that not all pieces are Lebesgue measurable! In the definition of the pieces AC is used in an essential way. The Banach-Tarski theorem is possibly the most anti-intuitive consequence of AC. Yet, this axiom continues to have a fundamental, and (almost) universally accepted by the mathematical community, role. A big part of current mathematics would be impossible without the axiom of choice. The anti-intuitive consequences of AC are due to the fact that the axiom states the existence of sets that are not explicitly definable. For example, it states that there is a well-order of \mathbb{R}, but, which is this order, is there any way to define it? AC allows us to partition B in non-measurable pieces. But how are these pieces? Can they be explicitly defined? As we are going to see, the answer is no, if supercompact cardinals exist!

13. Determinacy.

An efficient way of proving that a set of real numbers has a certain property is by playing infinite games. Given A \subseteq [0,1], we can play the following game associated to A: there are two players Adam and Eve, who play alternatively n_i \in [0,1]. Adam plays n_0, then Eve plays n_1, next Adam plays n_2, and so on. A game looks like this:

\begin{array}{c || *{12}{c}}  \bf{Adam} & n_0 & & n_2 & & n_4 & & ... & & n_{2k}& & ... & \\  \hline  \bf{Eve} & & n_1 & & n_3 & & ... & & ... & & n_{2k+1} & & ... \\  \end{array}

At the end of the game, the players have produced an infinite sequence n_0,n_1,n_2,... of zeroes and ones. Adam wins the game if

\sum \limits_{i=0}^\infty \frac{n_i}{2^{i+1}} \in A

and otherwise Eve wins.

The game is determined if one of the players has a winning strategy. That is, if there is a function f that assigns 0 or 1 to each finite sequence of zeroes and ones, in such a way that if one of the players, let us say Eve, plays according to this function, always playing f(n_0,n_1,...,n_{2k}) in the k-th turn, then she wins the game independently of what Adam plays. We say that the set A is determined if the game associated with A is determined.

In ZFC it can be proved that there are non-determined sets. But Donald Martin proved in 1975 that every Borel set is determined, and if a measurable cardinal exists, then every analytic set is determined.

14. Projective sets and descriptive set theory.

The projective sets of real numbers are those obtained from Borel sets by taking complements and continous images. The analytic, coanalytic and \sigma_2^1 sets are, hence, projective. The projective sets form a hierarchy of increasing complexity, in terms of the number of necessary steps to obtain them from the Borel sets. The study of the properties of projective sets, like their Lebesgue measurability, constitutes the core of the so called descriptive set theory. If we want to prove that the sets of real numbers have a certain property we can do it by first proving it for the Borel sets, then for the analytic, then for the conanalytic, then for the \Sigma_2^1, etc. up till proving it for all of the projective sets. Since all sets of real numbers appearing normally in mathematical practice are projective, the ability to prove that they have certain properties is of great interest. Moreover, all results about projective sets of real numbers have an immediate translation to projective sets in any polish space (separable and completely metrizable), like \mathbb{R}^n, \mathbb{C}, separable Banach spaces, etc.

In the sixties and seventies a number of results were proved showing that practically any kind of property shared by all Borel sets, is also shared by all projective sets, assuming that all projective sets are determined. Hence, the determinacy of projective sets, although not provable in ZFC, constitutes an excellent candidate as a new axiom of set theory, since it resolves practically any question about the properties of projective sets. For example, if every projective set is determined, then all projective sets are Lebesgue measurable, there is no projective well-order of the real numbers, and if they are not enumerable, then they are bijectable with \mathbb{R}.

The axiom that says that all projective sets are determined is known by the name of Projective Determinacy (PD). PD implies, then, that there is no projective counterexample to the Continuum Hypothesis, that every projective set is Lebesgue measurable, and, therefore, that the pieces in which the sphere in the Banach-Tarski theorem is partitioned cannot all of them be projective. PD is considered as an axiom at the same time natural and extremely useful to resolve questions about the projective sets. Actually, W. Hugh Woodin has proved that, in some sense, PD answers all the questions about projective sets.

One of the greatest and most surprising recent developments in set theory has been proving that PD follows from the existence of large cardinals. Donald Martin and John Steel proved in 1988 that if there are infinitely many Woodin cardinals, which are between the strong and the supercompact, then PD is valid. Furthermore, W. Hugh Woodin proved that if there are infinitely many Woodin cardinals with a measurable above them all, then not only the projective sets, but all the sets of reals which are definable from real numbers and ordinals are determined. And he also proved that the existence of infinite Woodin cardinals is a necessary hypothesis to obtain the consistency of PD. These results show that large cardinal axioms, at least the existence of infinitely many Woodin cardinals, are axioms both natural and reasonable, which decide most of the questions relative to definable set of real numbers.

Despite the enormous success of large cardinal axioms to decide questions about the definable sets of real numbers, and many other questions in other areas of mathematics, there are many fundamental questions that large cardinal axioms cannot resolve. The most important of them is the Continuum Hypothesis.

15. Forcing axioms.

Besides the Continuum Hypothesis and the measure problem, another key problem for the development of set theory has been Suslin’s Hypothesis. Cantor had proved that any linear order S which is dense, without extreme points, separable and complete is isomorphic to the real line. Mikhail Y. Suslin (1894-1919) conjectured that if instead of assuming that S is separable we assume that it is ccc, that is, that any disjoint collection of open intervals of S is countable, then S is also isomorphic to \mathbb{R}. Ronald Jensen proved in 1968 that Suslin’s Hypothesis fails in L. On the other hand, Robert M. Solovay and Stanley Tennenbaum constructed in 1971 a model with forcing where Suslin’s Hypothesis is valid, showing, thus, that it is undecidable in ZFC.

Based on the construction of the model of Solovay and Tennenbaum where Suslin’s Hypothesis is valid, Donald A. Martin isolated a new principle known as Martin’s Axiom (MA). This principle is equivalent to the following natural generalization of the Baire Category Theorem, and says that:

In every space which is compact, Hausdorff and ccc, the intersection of \aleph_1 open dense subsets is dense.

It can easily be seen that MA implies the negation of the Continuum Hypothesis. MA has been used with great success to resolve many questions undecidable in ZFC. For example, it implies Suslin’s Hypothesis and that every \Sigma_2^1 set of real numbers is Lebesgue measurable. Even so, MA has been considered as a useful principle to prove that certain statements were consistent with ZFC, but not as an axiom.

As a culmination of a number of stronger and stronger generalizations of MA introduced in the seventies and eighties, Matthew Foreman, Menachem Magidor and Saharon Shelah discovered in 1988 the strongest possible version of MA, the so called Martin’s Maximum Axiom (MM), and proved that it was consistent with ZFC assuming that the existence of a supercompact cardinal was also consistent. MM is formulated like MA but for spaces which have a property called of preservation of stationary sets. The most surprising and interesting is that they proved that MM implies that the cardinality of \mathbb{R} is \aleph_2!

The question is: to what extent MA and MM are natural axioms of set theory and not only principles useful to obtain results of consistency with ZFC? For a start, they do not seem at all evident or natural principles, in the same sense as large cardinal axioms do.

16. Generic absoluteness.

The forcing method allows to construct extensions of countable models. The extensions that are obtained depend on a partial ordering \mathbb{P}, which is the partial order of the approximations to the new, or generic, object that we want to add. But we can also consider forcing, or generic, extensions of V. Clearly, these are ideal extensions, since all sets are already in V. Given a partial order \mathbb{P}, we can define the Boolean Model V^\mathbb{P}, which represents the model obtained if there was a generic extension of V obtained by means of \mathbb{P}.

Given an infinite cardinal \kappa, let H(\kappa) the set of all sets hereditarily of cardinality less than \kappa, that is, the set of all sets X such that X, the elements of X, the elements of elements of X, etc. all of them have cardinality less than \kappa. Hence, H(\omega) is the set of all hereditarily finite sets, and, therefore, H(\omega) = V_\omega. H(\omega_1) is the set of all hereditarily countable sets. We always have H(\kappa) \subseteq V_\kappa, but not the other way round. If \kappa is inaccessible, then H(\kappa) = V_\kappa. The H(\kappa) also form a cumulative hierarchy, in the same way as the V_\kappa do, and the union of all the H(\kappa) is V.

There always is a certain degree of absoluteness between V^\mathbb{P} and V. For example, by the Levy-Shoenfield Absoluteness Theorem we have that given an existential statement talking about sets in H(\omega_1), if it could be made valid in a generic extension, then it would also be valid in V. In other words, if by forcing we could make valid a statement that says that there is a set X with certain properties which depend exclusively of sets in H(\omega_1), then such a set X already exists.

Recall that the axiom of existence of an inaccessible cardinal says that if the construction of V could be continued yet one more step beyond all ordinals, then the cardinal that would be obtained, which would be an inaccessible, exists. Generalizing in a similar fashion the Levy-Shoenfield absoluteness theorem, we would say that every existential statement talking about sets in H(\omega_2) and valid in V^\mathbb{P}, is also valid in V. That is, if by means of forcing we could make valid a statement that says that there is a set X with certain properties which depend exclusively of sets in H(\omega_2), then such a set X already exists. Let us represent this assertion by

H(\omega_2) \prec_\exists V^\mathbb{P}

This generalization would then be a new natural axiom of set theory, if it was consistent. Unfortunately, it is not, unless we restrict the kind of generic extensions.

Now, it turns out that MA is equivalent to H(\omega_2) \prec_\exists V^\mathbb{P} for every \mathbb{P} which is ccc, that is, for every \mathbb{P} in which every set of pairwise incompatible elements is countable. Therefore, MA is indeed a natural axiom of set theory. If we formulate the axiom corresponding to MM, that is, H(\omega_2) \prec_\exists V^\mathbb{P} for every \mathbb{P} with the stationary set preservation property, then we obtain the so-called Bounded Martin’s Maximum Axiom (BMM).

BMM is consistent with ZFC, assuming that the existence of a supercompact cardinal is consistent. In contrast to MM, BMM is a natural axiom of set theory, of a similar kind to large cardinal axioms, but which, unlike those, decides many questions that large cardinal axioms do not resolve. As a culmination of a number of results by W. Hugh Woodin and David Asperó, Stevo Todorčević proved, only some months ago, that BMM implies that the cardinality of \mathbb{R} is \aleph_2. This is a remarkable result, since, for the first time, a natural extension of ZFC decides the cardinality of the continuum.

17. The Continuum Hypothesis II.

To conclude, we will briefly mention, and very informally, some recent results by Woodin.

The first result says that if there is a proper class of Woodin cardinals, that is, if for every ordinal there is a higher Woodin cardinal, then the theory of H(\omega_1) is generically absolute. This means that we cannot change the theory of the real numbers with forcing, and implies, for example, that every definable set of real numbers is Lebesgue measurable.

The second result says that there is an axiom, called (\ast), which is a generalization of BMM, that practically decides every question about H(\omega_2), in such a way that the theory of H(\omega_2) is generically absolute. The (\ast) axiom is the equivalent to PD, but for H(\omega_2). The axiom (\ast) implies that the cardinality of the continuum is \aleph_2.

Finally, Woodin proved in 2000 that if there is there is a proper class of Woodin cardinals, then any axiom which decides practically every question about H(\omega_2) in such a way that the theory of H(\omega_2) is generically absolute, necessarily must imply the negation of the Continuum Hypothesis.

The notion of practically every question about H(\omega_2), to be precisely formulated, requires the introduction of a new logic, the so-called Woodin’s \Omega-logic$. The most important problem pending to be resolved about the Continuum Hypothesis is the so-called \Omega-conjecture. The formulation of this conjecture is of a highly technical character. The conclusion is if the \Omega-conjecture is true, then any set theory compatible with the existence of large cardinals and which makes the theory of H(\omega_2) generically absolute, must necessarily imply the negation of the Continuum Hypothesis.

Hence, if the \Omega-conjecture is true, then the Continuum Hypothesis is, in a very natural sense, false. The theories: ZFC plus the Continuum Hypothesis and ZFC plus its negation are not equally reasonable. Woodin’s results show that, in the presence of large cardinals, if we accept the Continuum hypothesis, then we cannot answer many questions about H(\omega_2), and, henceforth, about sets of real numbers.

18. Concluding remarks.

In this little excursion to Cantor’s paradise, we have seen that set theory has experienced through the twentieth century an extraordinary development. What started as a mathematical theory of transfinite numbers, has evolved to constitute a general theory of infinite sets and a foundation for all mathematics. The techniques developed by set theory, like forcing, infinite combinatorics, the theory of large cardinals, etc. have turned it into a discipline of high complexity and beauty, with fascinating results which stimulate and at the same time defy imagination. 125 years after its formulation by Cantor, and 103 years after Hilber put it in the first place of his list, of problems for the twentieth century, the Continuum Hypothesis will keep on being, undoubtedly, one of the most important problems of the mathematics of the twenty-first century.

The reflection about fundamental mathematical questions has generated the theory of sets as we know it today. The very fact that it has been possible to unify all classical mathematics into one single maximally general theory, is a remarkable milestone. But, beyond any of this, the richness, originality and power of the ideas generated in set theory, make them applicable to almost all the areas of mathematics. With increasing frequency, forcing, large cardinals, axioms of determinacy, descriptive set theory, infinitary combinatorics, etc. find applications in the resolution of problems that had been open for many years in areas such as Topology, Algebra, Real and Complex Analysis, Functional Analysis, Measure Theory, etc. In the twenty-first century, the ideas recently introduced by Shelah, Todorcevic and, most importantly, Woodin, among others, not only will contribute to the resolution of many mathematical problems, already known or new ones that will emerge, but also will enhance substantially, as set theory has already done through the twentieth century, our understanding of the mathematical universe.

“What is Mathematics?”

February 1, 2012 Leave a comment

It might be a little annoying that a description of mathematical activity that satisfies everyone has not yet been found. People have been doing, teaching and learning mathematics for centuries, in schools, colleges and universities throughout the world. Math is something there for all to see. So, what is being asked for? Moreover, what could be learnt from any such description, given that it is of something already known? There are several reasons for this situation.

Usually, when such a kind of question in relation to some activity (medicine, for example) is asked, there is behind it some concern with a potential or actual source of professional intrusion, that is, a concern with some distortion undermining honest professional practice. I think that this is, in part, the motivation of some of Frank Quinn’s writings (links to a couple of them can be found at my previous post). Quinn worries that “a century of difficult adaptation to precision might be eroded by a new wave of heuristic work”. According to this author, heuristics may play their role in “mathematical science”, but they should not be used to justify results in “core mathematics”, because results so obtained cannot, and in fact must not, be considered reliable. In contrast, another author, Carlo Cellucci, defends that mathematics cannot be understood without the ‘analytic method’, which makes essential use of heuristics, or rules of discovery (see the Suggested Readings, below). So there is a demarcation problem here. “What is mathematics?” stands for “Which means are to be considered mathematical and which not? Should some open-endedness be allowed so that as new means are discovered they can be added to the collection of those already known? How would one then decide whether candidate means deserve the right to join the previously accepted ones?”, and so on.

But it is also common that what is sought with this kind of question is to put the discipline in a broader context than the most immediate to its practice, to look at it from a different perspective, so to speak, and there are a host of possible choices of perspective (historical, sociological, philosophical, educational, … and any combinations and ramifications of these). Actually, there could be so many different expectations behind a single formulation of the question that it would be impossible that one answer could satisfy all of them at once. Gian-Carlo Rota described several senses of the question “What is mathematics?” under various circumstances in a chapter of the Lezioni Napoletane, ed. by F. Palombi, La Città del Sole, Naples 1999 (an English version of this chapter, entitled What “Is” Mathematics?, can be found here. I thank Cellucci for the reference). Rota concludes that the feeling of wonder expressed by this question is the start of a philosophical journey that will eventually disclose the ‘conditions of possibility’ of mathematics. Maybe Quinn would make the point that this ought to be a scientific journey, rather than a philosophical one. It might seem strange, however, that to understand or to explain what is mathematics one needs to offer an explanation in terms of ‘conditions of possibility’ (just imagine one had to give an account of carpentry or tailoring in such terms). This approach to mathematics springs from Kant’s transcendental idealism in relation to science and the general problem of knowledge as a response to Descartes’ methodological skepticism. But there is also a modern motivation for it. In connection with Gödel’s realism (I borrow here terms from Penelope Maddy) ‘common features of [mathematics] are taken to derive their justification’ from ‘Robust Realism’ for which ‘the justification — or lack of justification — for mathematical methods is based on a metaphysical account of its subject matter’ (Maddy [2010], p. 87). By calling for metaphysical or epistemologycal accounts, one attempts to leverage the exploration of the frontiers of incompleteness, trying to answer exactly the same questions as above but now with an entirely different purpose in mind.

It is, hence, natural that different answers exist to the question “What is Mathematics?” each of them in relation with a different purpose or throwing light on a different aspect of mathematical activity.

Conditions of Possibility and Objectivity

If one had to look for what makes it possible that human beings do mathematics probably a good place to start would be our ability to establish precise rules and accurately follow them, checking whether each rule has been perfectly observed, and to make plans and carry them through, controlling that they produce the desired results; all of this together with our ability to disregard any rules and change any plans at convenience.

Creating rules is for our minds like creating the images or the sounds of our perception. It happens naturally and possibly it is creating rules what underpins the meaningfulness of our experiences. When we think, both rules learnt in the past and new rules emerging from current experience come into play. Ideas, concepts, etc. are ultimately systems of rules that are frequently involved in our thinking processes. Moreover, what we call reality could be understood as just one of those systems. All of them are systems of rules with an instrumental value. Mathematical ideas, in this sense, are no exception: they are an aid for thinking and understanding. Mathematics could then be seen as a style of reasoning and looking at things. But this, all alone, does not seem to explain its objectivity.

Richard Feynman, through Ralph Leighton, in “Surely You’re Joking, Mr. Feynman!”: Adventures of a Curious Character, Part 2 “The Princeton Years”, under the section “A Different Box of Tools”, shares with us some ideas I personally find very interesting.

“trivial” means “proved” […] mathematicians can prove only trivial theorems, because every theorem that’s proved is trivial […] there are never any surprises […] mathematicians only prove things that are obvious

I can find at least two senses in which what these words suggest could be regarded as true:

1. As soon as a proof of a theorem has been found, the logical connection between the premises and the conclusion, with the aid of the proof, becomes transparent. A proof can then be seen as a chain of platitudes.

2. Axioms fix beforehand which statements are theorems and which are not. What you get is what you put in the axioms. Therefore, there is no room for surprises.

I bet there isn’t a single theorem that you can tell me what the assumptions are — and what the theorem is in terms I can understand — where I can’t tell you right away whether it’s true or false

Feynman is claiming neither that all mathematical theories are decidable nor that he is an oracle. Also, by theorem here he refers to any statement known to be true or false after it, or its negation, has been proved from the assumptions (theorems are always true by definition). Even after these caveats, it is still a bold claim. Without the aid of the proof, in contrast to what has been said just above, most theorems are anything but obvious. He explains the method that justifies this exhibition of self confidence (emphases are mine):

I had a scheme, which I still use today when somebody is explaining something that I’m trying to understand: I keep making up examples. For instance, the mathematicians would come in with a terrific theorem, and they’re all excited. As they’re telling me the conditions of the theorem, I construct something which fits all the conditions. You know, you have a set (one ball) — disjoint (two balls). Then the balls turn colors, grow hairs, or whatever, in my head as they put more conditions on. Finally they state the theorem, which is some dumb thing about the ball which isn’t true for my hairy green ball thing, so I say, “False!”
If it’s true, they get all excited, and I let them go on for a while. Then I point out my counterexample.
“Oh. We forgot to tell you that it’s Class 2 Hausdorff homomorphic.”

I adore this passage. I think it is a really good description of the process of understanding something. Moreover, together with the previous one, it is a perfect introduction to the issue of objectivity in mathematics. In a nutshell: If something is provable for some person, then it is provable for every person (me included!). Actually, knowing that a statement (or its negation) is a theorem, one could program a computer to produce one proof after another and simply wait until the proof of the statement (or its negation) is found. But I think this is certainly not what Feynman did have in mind. Moreover, he clearly manifests that if a counterexample can be found then either assumptions were wrongly understood or the statement is simply not true.

[…] although the mathematicians thought their topology theorems were counterintuitive, they weren’t really as difficult as they looked. You can get used to the funny properties of this ultra-fine cutting business and do a pretty good job of guessing how it will come out.

Again, emphasis is mine. This passage also points to something important: that some theorems might seem counterintuitive is probably only because the assumptions of the theory are not correctly understood.

The objectivity of mathematics would seem, at least in part, to be explained by the objectivity of logic. If B follows from A it is a waste of time to look for a counterexample.

Here go some more passages for reflection, now from QED: The Strange Theory of Light and Matter:

We physicists are always checking to see if there is something the matter with the theory. That’s the game, because if there’s something the matter, it’s interesting!

What could go wrong with a mathematical theory? As I have been saying, there are two possibilities:

1. The theory is inconsistent.

2. The theory is misunderstood.

you might think you do not understand what I am telling you [because] while I am describing to you how Nature works, you won’t understand why Nature works that way. But you see, nobody understands that.

I think that this message is important because in philosophy we make many questions, but science necessarily has to choose very well which battles to fight, its success depends critically on it. The same happens with mathematics. Mathematics has to make decisions as to which questions to answer, and that is only a part of mathematical activity.

It is not a question of whether a theory is philosophically delightful, or easy to understand, or perfectly reasonable from the point of view of common sense.

The question is that a theory makes good predictions. What could it be for a mathematical theory that it made good predictions? It could be revealing the strains of mathematical fruitfulness, as Penelope Maddy would possibly defend.

I’m going to explain to you what the physicists are doing when they are predicting how Nature will behave, but I’m not going to teach you any tricks so you can do it efficiently.

The price we have to pay for doing things efficiently is going technical, but what’s that exactly? This is another important issue to deal with because this is, again in part, what mathematics is about.

I wish I had more time to develop these ideas, and to compare and comment the different author’s opinions, but this is impossible for me right now. I hope to be able to do that soon.

Suggested Readings

Cellucci, Carlo

[2011a] ‘Top-down and Bottom-Up Philosophy of Mathematics’, forthcoming in Foundations of Science.

[2011b] ‘Philosophy of Mathematics: Making a Fresh Start’, paper presented at the 14th Congress of Logic and Philosophy of Science. Nancy. July 19-26, 2011.

[2011c] ‘Indiscrete Variations on Gian-Carlo Rota’s Themes’, in M. Pitici (ed.), The Best Writings on Mathematics 2010 (pp. 311-329). Princeton: Princeton University Press.

[2008a] ‘The nature of mathematical explanation’, Studies in History and Philosophy of Science, vol. 39, pp. 202-210.

[2008b] ‘Why proof? What is a proof?’, in R. Lupacchini and G. Corsi (eds.), Deduction, computation, experiment. Exploring the effectiveness of proof (pp. 1-27). Berlin: Springer.

Maddy, Penelope

[2011] Defending the Axioms. Oxford: Oxford University Press.

Frank Quinn’s A Revolution in Mathematics?

December 24, 2011 Leave a comment

Frank Quinn, A Revolution in Mathematics? What Really Happened a Century Ago and Why It Matters Today, Notices of the AMS 59 (January 2011), 31-37.

On a first read, it seemed to me that despite the flashy title of Quinn’s article, it was closer to a publication of a political nature than of an academic one. It is a defense of what the author calls core mathematics, the kind of mathematics that needs no other justification than the pursue of purely mathematical goals, that is, whose methods are not aimed at the resolution of problems that need to save appearances.

Quinn suggests that “the changes that took place in 1890-1930” were most importantly changes in methodology. “Intuitive formulations” were left behind in favor of “explanation without rule violations”. The reliability of conclusions is guaranteed only by carefully following a system of rules and procedures. It is only by the practice of these rules and procedures that mathematical notions become meaningful, it is only by working with those notions in the way laid down by these rules and procedures that they can be correctly “internalized”.

In her paper Objectivity in Mathematics, Penelope Maddy recognizes that “what guides our concept formation, beyond the logical requirement of consistency, is the way some logically possible concepts track important mathematical strains that the others miss”, where this “importance” is to be identified with mathematical fruitfulness. “Mathematical goals are only proper insofar as satisfying them furthers our grasp of the underlying strains of mathematical fruitfulness”. I believe that Quinn also recognizes this when he talks about “fine-tuning” axioms to optimize their properties and searching for “opportunities where a new definition might organize a body of material”.

Quinn argues that there is real danger of “marginalization” of core mathematics leading to a “scientific analogue of osteoporosis”, since it “provides the skeleton that supports the muscles and sinews of science and technology”. But most importantly, it looks like he thinks the demarcation problem for core mathematics is already solved (“Mathematicians have joined fish and birds in doing something very well without any idea how!”) while it remains open for applied mathematics (“there is no way to ensure that all alternatives have been imagined”). It might also seem that Quinn holds here an extreme position that views applied mathematics as an eclectic subject that builds ad hoc solutions on demand, but I do not want to address this issue right now.

P. M. Cohn, in Algebra. Volume 1. Second Edition, John Wiley & Sons Ltd (Reprinted February 1993), p. 1, wrote “Mathematics is the study of relations between certain ideal objects such as numbers, functions and geometrical figures. These objects are not to be regarded as real, but rather as abstract models of physical situations. […] If our mathematical system is to serve as a model of reality we must know how to recognize true assertions, at least in principle (even though in practice some may be hard to prove). When the object of discussion is intuitively familiar to us -as in the case of natural numbers- we take certain assertions recognized to be true as our axioms and try to derive all other assertions from them. Once that is done, we can forget the intuitive interpretation and regard our objects as abstract entities subject to the given axioms. When we come to apply our system to a concrete case, we need to find an interpretation for each notion introduced and verify that each axiom holds in the interpretation; we are then able to conclude that all the assertions derived from the axioms also hold”.

I think it is this description of mathematics, natural as it seems, which is, if not vehemently contested, at least highly nuanced by Frank Quinn in his writings. I suggest, particularly, reading his Contributions to a Science of Contemporary Mathematics to see what I mean. I will address the issue of the demarcation problem for mathematics in my next post.

Understanding and Knowledge

October 10, 2011 Leave a comment

This is a continuation of my earlier posts where I have been trying to explain a possible reaction to Platonism. I am sorry for the chaotic exposition.

We all humans must arrive at any ideas by ourselves, even when other humans try to guide us by means of a common language or by whatever other means. To speak of transmission of ideas is simply that, a way of speaking; but a misleading one. Ideas are actually personal and nontransferable. From this standpoint, two interesting problems arise, the problem of creativity, or how we arrive at any ideas, and the problem of communication, or how we are able to share any ideas with other people. Here by creativity I do not mean only inventiveness. There seems to be little arbitrariness in how we arrive at certain ideas. See, for example, What Infants Know: The New Cognitive Science of Early Development by Mehler and Dupoux.

There is an obvious interconnection between both problems. Communication without creativity would be impossible. Moreover, issues related to meaning, understanding and education may benefit from the solutions to them both.

We arrive at ideas as a result of the way our brains are able to interact with the rest of world (which includes other brains). Our brains process and organize all the information available through that interaction. We learn that there is order in the world because our brains are able to give order to it with more or less success. An object of some kind is an object of that kind because it conforms to our experiential ideas of what being an object of that kind is. But note that this process of identifying what is the case bears no relation to truth or falsity, only to meaning or understanding.

So, I do not think there is arbitrariness in the ideas our brains arrive at, but I do think it is important to make a clear distinction between understanding and knowledge. When does knowledge appear?

I think that in general we value the degree of reality of ideas according to our experiences (which includes exchange of opinions with other humans) but also according to our personal or cultural tastes and preferences. The issue of what is true or false then is just a matter of agreement on “what best describes what is the case”.

Categories: Philosophy

Exploring the Frontiers of Incompleteness

September 7, 2011 Leave a comment

I suggest that you do not miss this great project of Peter Koellner: Exploring the Frontiers of Incompleteness.

It aims to address these two all important problems:

(1) Do the questions that are independent of the standard axioms admit of determinate answers?

(2) If so then what are those answers and how might we go about determining them?

by investigating the positions of twelve different authors, each in one workshop, and all of them in a final master-workshop. Materials are going to be available for reading before each event. It is a great opportunity to learn what these outstanding researchers with different points of view about the same subject matter think of each other’s work.

My reaction against Platonism

August 14, 2011 7 comments

In my previous post I started to sketch a brief account of what could become my personal reaction against Platonism. Here I continue to define a bit more that position.

I think of the work of the philosopher-scientist as one of exploration of reason. By exploration I understand here an active empirical activity (not a passive observational one) consisting in searching for ideas and testing them. By ideas I understand here such diverse things as intuitions, notions, concepts, stories, intentions, plans, conceptions, arguments or theories. By reason I understand the human faculty of reasoning, including the ability to arrive at ideas (by whatever means, be them perceptual or upon reflection).

For an idea to stand the test of time it must have proved useful for a particular purpose, a currently relevant purpose for which no other idea, to date, has proved as effective as the successful one.

So, among the many difficulties the philosopher-scientist faces is: how is the potential of an idea to be evaluated?

I remember having read from Tim Gowers that for him the best proofs are those that are memorable. To be useful and effective it certainly seems desirable that ideas are memorable. Indeed it is a tempting idea to regard well developed theories or conceptions as networks of ideas that work especially well as a sort of mnemonic guides leading naturally to the desired results. But clearly memorability cannot be enough. Gödel, in a famous, widely quoted, passage, identifies success with fruitfulness in consequences, in particular verifiable consequences.

It might be tempting to regard mathematical objects, or structures, as real. But this is just a natural outcome of the fact that mathematical ideas have a working fluency of their own, that once they are properly understood everyone is going to get to the very same conclusions.

On Pure Reason

July 5, 2011 Leave a comment

I expose here a personal, tentative, certainly naive, and possibly not original position about reason. I ask readers to be as benevolent as possible.

In the introduction to his The Provenance of Pure Reason: Essays in the Philosophy of Mathematics and Its History, William W. Tait aligns with Plato to hold that “mathematics is the domain of pure reason”.

He says, “Wherever reason is in action in our encounters with the natural world, there lies behind it, if only implicitly, an idealized world, the world of mathematical structures, where concepts have hard edges — in Plato’s terms, behind the world of things that both are and are not, there lies the world of things that are absolutely. […] even in its application to empirical science, the value of mathematics lies precisely in its separation from empirical matters, in its being entirely a creature of reason. For only with this separation can we distinguish between the question of the adequacy and coherence of the mathematical theory, in itself, in terms of which we wish to understand the phenomena and the question, given its coherence, of whether or not the rational structure it describes is the right one for the phenomena at issue. […] Our sense perceptions, for example, involve concepts: we perceive colors, sounds, trees, and people as colors, sounds, trees, and people — in familiar terms, our ordinary sense experience is ‘theory laden.’ […] on the other side, there is conceptual knowledge which involves no empirical element. […] we must reject […] that concepts without intuition are empty”.

One thing is that there are concepts or notions to which we arrive at by different means and another thing is that there is absolutely no empirical element in the kind of conceptual knowledge present in mathematical practice. How could we talk about the coherence of a theory or the sharpness of a notion without ever having tested them?

Tait later in the same introduction says that he remains “convinced by Wittgenstein that basic rules of computation and proof – for adding numbers, for example, and making inferences – cannot be regarded as private: the norms of correctness are in the public domain, not in our heads”.

I think that behind this conviction lies the fact that one of the most basic tests a notion has to pass is that it must be communicable.

Let me use this funny story about Vervet monkeys to illustrate what I have in mind. I reproduce it here:

“Vervet monkeys give different alarm calls to different predators. Recordings of the alarms played back when predators were absent caused the monkeys to run into trees for leopard alarms, look up for eagle alarms, and look down for snake alarms. Adults call primarily to leopards, martial eagles, and pythons, but infants give leopard alarms to various mammals, eagle alarms to many birds, and snake alarms to various snakelike objects. Predator classification improves with age and experience.”

The hearing of an alarm call drives Vervet monkeys to a state in which they are aware of the possible presence of a specific kind of predator. We know that because of their reaction to different alarm calls but also because we are able to arrive at that very same notion too. In the case of human reason we similarly experiment with ourselves and our own notions.

%d bloggers like this: