An introductory guide to modern logic

Welcome! We are about to start a journey through modern logic, in which we will visit two central results of this branch of mathematics, which are often misunderstood and sometimes even misquoted: Löb’s theorem and Gödel’s second incompleteness theorem.

Modern logic can be said to have been born in the ideas of Gödel regarding the amazing property of self-reference that plagues arithmetic. Badly explained, this means that we can force an interpretation over natural numbers which relates how deductions are made in arithmetic to statements about arithmetical concepts such as divisibility.

This guide is targets people who are already comfortable with mathematical thinking and mathematical notation. It requires however no specific background knowledge about logic or math to be read. The guide starts lightweight and gets progressively more technical.

Formal proofs

What is logic anyway? A short explanation refers to the fact that we humans happen to be able to draw conclusions about things we have not experienced by using facts about how the world is. For example, if you see somebody entering the room and is dripping water, then you infer that it is raining outside, even though you have not seen the rain yourself.

This may seem trivial, but what if we are trying to program that into a computer? How can we capture this intuition of what reasoning is and write down the rules for proper reasoning in a way precise enough to be turned into a program?

To accomplish that is to do logic. The ultimate goal is to have an efficient procedure that if followed will allow to deduce every true consequence of a set of premises, so simple that it could be taught to a dumb machine which only follows mechanical operations.

So let’s try to formalize reasoning!

We are going to draw inspiration from mathematicians and see what they are doing when proving a result.

Though the method is often convoluted, non linear and hard to understand, in essence what they are doing is writing down some hypothesis and facts that they assume to be true. Then they apply some manipulation rules which they assume that when applied on true premises allow to derive true facts. After iterating this process, they arrive to the wanted result.

Getting formal

Well, let’s wrap together this intuitive process in a formal definition.

A proof of a sentence \(\phi\) is a sequence of well formed sentences such that every sentence in the sequence is either an axiom or can be derived of the previous sentences using a derivation rule, and the final sentence in the sequence is \(\phi\). A sentence which has a proof is called a theorem.

Don’t be scared by all the new terms! Let’s go briefly over them.

A well formed sentence is just a combination of letters which satisfies a certain criteria which makes them easy to interpret. The sentences we will be dealing with are logical formulas, which use a particular set of symbols such as \(=, \wedge, \implies\). Our goal is to assign truth values to sentences. That is, to say if particular sentences are true or false.

An axiom is something that we assume to be true without requiring a proof. For example, a common axiom in arithmetic is assuming that \(0\) is different than \(n+1\) for all natural numbers \(n\). Or expressed as a well formed sentence, \(\forall n. 0 \not = n+1\). noteThe symbol \(\forall\) is an upside-down A and means “for all”.

A derivation rule is a valid basic step in a proof. Perhaps the simplest example is modus ponens . Modus ponens tells you that if you have deduced somehow that \(A\implies B\) and also \(A\), then you can deduce from both facts \(B\).

Those three components constitute the basis of a logical system for reasoning: a criteria for constructing well formed sentences, a set of axioms which we know to be true noteThe set of axioms does not have to be finite! In particular, we can specify a criteria to identify infinitely many axioms. This is called an axiom schema and some derivation rules which allow us to deduce new conclusions.

For a rather silly example, let’s consider the system \(A\) with the following components:

  • Well formed sentences: the set of words in the Webster dictionary.

  • Axioms: The word ‘break’.

  • Deduction rules: If a word \(w\) is a theorem of \(A\), then so it is a word which only differs in one letter of \(w\), either by adding, substracting or replacing it anywhere in \(w\).

Thus the word ‘trend’ is a theorem of \(A\), as shown by the 4 steps long proof ‘break—bread—tread—trend’. The first sentence in the proof is an axiom. Every sentence after is deduced from our deduction rule using the previous sentence as a premise. <div><div>


Given how we defined a logical system, there is a certain degree of freedom we have, as we can choose the underlying sets of well formed sentences, the axioms and the deduction rules. However, as shown in the example above, if we want to make meaningful deductions we should be careful in what components do we choose for our logical system.

The criteria we use to choose these components is intertwined with the model we are trying to reason about. Models can be anything that captures an aspect of the world, or just a mathematical concept. To put it simply, they are the thing we are trying to reason about.

Depending on what kind of entities the model has, we will lean towards a particular choice of sentences. For example, if our model consists in just facts which are subject to some known logical relations we could use the language of propositional logic. But if our model contains different objects which are related in different ways, we will prefer to use the language of first order logic.

Each model has an interpretation associated, which relates the terms in the sentences to aspects of the model. For all practical purposes, a model and its interpretation are the same once we have fixed the language, and thus we will use the terms as synonyms.

Then, depending on how the model evolves and what things can we infer from what facts we should choose some deduction rules. Together, the sentences and the deduction rules specify a class of possible models. If the deduction rules correctly capture the properties of the class of models we had in mind, in the sense that from true premises we really derive true consequences, then we will say that they are sound.

Intuitively, we say that the choice of language and deduction rules decides the “shape” that our models will have. Every model with such a shape will be part of the universe of models specified by those components of a logical system.

With this imagery of a class of models, we can think of axioms as an attempt to reduce the class of models and pin down the concrete model we are interested in, by choosing as axioms sentences whose interpretation is true in only the model we are interested in and in no other model notesadly, there are occasions in which this is just not possible.

A property tightly related to soundness is completeness. A system is complete if every sentence which is true under all interpretations which satisfy our axioms has a proof.

It is important to realize that logical systems can be talking about many models at once. In particular, if two interpretations satisfy the axioms but disagree on the truth of a particular sentence, then that sentence will be undecidable in our model.

This suggest a technique for proving independence of logical statements. To show that a certain sentence is independent from some axioms, construct two models which satisfy those axioms but contradict each other on the value of the statement. A nice example of this is how Lobachevsky showed that Euclides Fifth postulate is independent from the other four postulates.

add more examples everywhere

Introducing PA

For our purposes, we will stick to a particular logical system called Peano arithmetic. This particular choice of axioms and deduction rules is interesting because it reflects a lot of our intuitions about how numbers work, which in turn can be used to talk about many phenomena in the real world. In particular, they can talk about themselves.

Before we move on to this “talking about themselves” business I am going to introduce more notation. We will refer to Peano Arithmetic as \(PA\). If a sentence \(\phi\) follows from the axioms of \(PA\) and its deduction rules noteie, there is a proof of \(\phi\) using only axioms and deduction rules from \(PA\), we will say that \(PA\vdash \phi\), read as “$PA$ proves \(\phi\)”.

Self reference and the provability predicate

Let’s try to get an intuition of what I mean when I say that numbers can talk about themselves.

The first key intuition is that we can refer to arbitrary sequences of characters using numbers. For example, I can declare that from now onwards the number \(1\) is going to refer to the sequence of letters “I love Peano Arithmetic”. Furthermore, using a clever rule to relate numbers and sentences I can make sure that every possible finite sentence gets assigned a number noteThis is called encoding.

A simple encoding goes by the name of Gödel encoding, and consists of assigning to every symbol we are going to allow to be used in sentences a number. For example, \(=\) could be \(1\), and \(a\) could be \(0\), and so on and so forth. Then we could encode a sentence consisting of \(n\) symbols as the number \(2^{a_1}3^{a_2}5^{a_3}\cdots p(n)^{a_n}\). That is, we take the number which is the product of the \(n\)th first primes with exponents equal to the assigned numbers.noteThis is possible because every number can be decomposed as a unique product of primes.

The process can be repeated to encode sequences of sentences as single numbers. Thus, we can encode whole proofs as single numbers. We could also encode sequences of sequences of sentences, but that is going too far for our purposes.

whole example of Gödel encoding, define carefully notation of godelquotes

So the point is, we can exchange sentences by numbers and vice versa. Can we also talk about deduction using numbers?

It turns out we can! With the encoding we have chosen it is cumbersome to show, but we can write a predicate \(Axiom(x)\) noteA predicate is a well formed sentence in which we have left one or more “holes” in the form of variables which we can substitute for literal numbers or quantify over. For example, we can have the predicate \(IsEqualTo42(x)\) of the form \(x = 42\). Then \(PA\vdash IsEqualTo42(42)\) and \(PA\vdash \exists x IsEqualTo42(x)\), but \(PA\not\vdash IsEqualTo42(7)\) in the language of Peano Arithmetic such that \(PA\vdash Axiom(\textbf{n})\) if and only if \(n\) is a number which encodes an axiom of \(PA\).

Furthermore, deduction rules which require \(n\) premises can be represented by \(n+1\) predicates \(Rule(p_1, p_2,..., p_n, r)\) which is provable in \(PA\) if and only if \(p_1, ...., p_n\) are numbers encoding valid premises for the rule and \(r\) is the corresponding deduced fact.

A bit more of work and we can put together a predicate \(Proof(x,y)\), which is provable in \(PA\) if and only if \(x\) encodes the valid proof of the sentence \(y\). Isn’t that neat!?

Since we are not too interested in the proof itself, and more in the fact that there is a proof at all, we are going to construct the provability predicate \(\exists x. Proof(x,y)\), which we will call \(\square_{PA}(y)\).note\(\exists\) is a backwards E and means “there exists”. So then, \(\square_{PA}(x)\) literally means “There is a proof in \(PA\) of \(x\)”.

Thus we can say that if the number \(\ulcorner 1+1=2 \urcorner\) corresponds to the sentence \(1+1=2\)noteThis notation remains in effect for the rest of the article., then \(PA\vdash \square_{PA}(\ulcorner 1+1=2 \urcorner)\). That is, \(PA\) proves that there is a proof in \(PA\) of \(1+1=2\).

Now, the predicates we have seen so far are quite intuitive and nicely behaved, in the sense that their deducibility from \(PA\) matches quite well what we would expect from our intuition. However, adding the existential quantifier in front of \(Proof(x,y)\) we get some nasty side effects.

Warning! Really bad analogy incoming!

The thing is that \(PA\) “hallucinates” numbers which it is not sure whether they exist or not. Those are no mere natural numbers, but infinite numbers further in the horizon of math. While \(PA\) cannot prove their existence, it neither cannot prove its non existence. So \(PA\) becomes wary of asserting that proofs do not exist for a certain false sentence. After all, one of those non standard numbers may encode the proof of the false sentence! Who is to prove otherwise?

Can we patch this somehow? Maybe by adding more axioms or deduction rules to \(PA\) so that it can prove that those numbers do not exists? The answer is yes but not really. While it is doable in principle, the resulting theory becomes too difficult to manage, and we can no longer use it for effective deduction noteTechnically, \(PA\) loses its semidecidability that way..

We will now proceed to prove two technical results which better formalize this idea: Löb’s theorem and Gödel’s Second Incompleteness Theorem.

Löb’s theorem

Löb’s result follows from the intuitive properties of deduction that the provability predicate actually manages to capture. This points to the fact that it is not our definition what is wrong, but rather to a fundamental impossibility in logic.

The intuitive, good properties of \(\square_{PA}\) are known as the Hilbert-Bernais derivability conditions, and are as follows:

  1. If \(PA\vdash A\), then \(PA\vdash \square_{PA}(\ulcorner A\urcorner)\).

  2. \(PA\vdash \square_{PA}(\ulcorner A\rightarrow B\urcorner) \rightarrow [\square_{PA}(\ulcorner A \urcorner)\rightarrow \square_{PA}(\ulcorner B \urcorner)]\)

  3. \(PA\vdash \square_{PA}(\ulcorner A\urcorner) \rightarrow \square_{PA} \square_{PA} (\ulcorner A\urcorner)\).

Let’s go over each of them in turn.

1) says reasonably that if \(PA\) proves a sentence \(A\), then it also proves that there is a proof of \(A\).

2) affirms that if you can prove that \(A\) implies \(B\) then the existence of a proof of \(A\) implies the existence of a proof of \(B\). This is quite intuitive, as we can concatenate a proof of \(A\) with a proof of \(A\rightarrow B\) and deduce \(B\) from an application of modus ponens.

3) is a technical result which states that the formalization of 1) is provable when we are dealing with sentences of the form \(\square_{PA}(\ulcorner A \urcorner)\).

One more ingredient is needed to derive Löb’s: the diagonal lemma, which states that for all predicates \(\phi(x)\) there is a formula \(\psi\) such that \(PA\vdash \psi \leftrightarrow \phi(\ulcorner \psi \urcorner)\).

The details of the proof can be found in Proof of Löb’s theorem.

The details are not essential to the main idea, but it can be illustrative to work through the formal proof. Plus the intuition behind related to the non-standard numbers we talked about before.

What is really interesting is that now we are in position to enunciate and understand Löb’s theorem!

Löb’s theorem

If \(PA\vdash \square_{PA}(\ulcorner A\urcorner) \rightarrow A\), then \(PA\vdash A\).

(or equivalently, if \(PA\not\vdash A\) then \(PA\not\vdash \square_{PA}(\ulcorner A\urcorner) \rightarrow A\)).

Talk about an unintuitive result! Let’s take a moment to ponder about its meaning.

Intuitively, we should be able to derive from the existence of a proof of \(A\) that \(A\) is true. After all, proofs are guarantees that something really follows from the axioms, so if we got those right and our derivation rules are correct then \(A\) should be true. However, \(PA\) does not trust that being able to find the existence of a proof is enough to make \(A\) true!

Indeed, he needs to be able to see by himself the proof. In other words, there must be a \(n\) such that \(PA\vdash Proof(\textbf n, \ulcorner A\urcorner)\). Then he will trust that it is indeed the case that \(A\).

I will repeat that again because it looks like a tongue twister. Suppose somebody came and assured \(PA\) that from the axioms of \(PA\) it follows that there exists a number \(n\) satisfying \(Proof(\textbf n,\ulcorner A\urcorner)\). Then \(PA\) would say, show me the proof or I am going to assume that that \(n\) is actually a non standard number and you a friggin’ liar.

If somebody just comes saying he has a proof, but produces none, \(PA\) becomes suspicious. After all, the proof in question could be a non-standard proof encoded by one of the dreaded non standard numbers! Who is going to trust that!

Gödel II

And finally we arrive to Gödel’s Second Incompleteness Theorem, perhaps the most widely misunderstood theorem of all mathematics.

We first need to introduce the notion of consistency. Simply enough, a logical system is consistent if it does not prove a contradiction, where a contradiction is something impossible. For example, it cannot ever be the case that \(P\wedge \neg P\), so \(P\wedge \neg P\) is a contradiction no matter what \(P\) is. We use the symbol \(\bot\) to represent a contradiction.

The statement of GII is as follows:

Gödel Second Incompleteness Theorem (concrete form) If \(PA\) is consistent, then \(PA\not \vdash \neg \square_{PA}(\bot)\)

Notice that GII follows quite directly from Löb’s theorem. Actually, it is the case that Löb’s theorem also follows from GII, so both results are equivalent.

This result can be interpreted as follows: you cannot make a system as complex as \(PA\) in which you can talk about deduction with complete certainty, and thus about consistency. In particular, such a system cannot prove that he himself is consistent.

The result is startling, but in the light of our previous exposition is clear what is going on! There is always the shadow of non standard numbers menacing our deductions.


And that concludes our introduction to formal logic!

To recall some important things we have learned:

  • Logical systems capture the intuition behind deductive reasoning. They are composed of axioms and deductive rules that are chained to compose proofs.

  • Simple logical systems that are used to talk about numbers, such as \(PA\), can be interpreted as talking about many things through encodings, and in particular they can talk about themselves.

  • There are expressions in logic that capture the concepts involved in deductions. However, the most important of them, the provability predicate \(\square_{PA}\), fails to satisfy some intuitive properties due to the inability of \(PA\) to prove that non standard numbers do not exists.

  • Löb’s theorem says that if \(PA\) cannot prove \(A\), then it can neither prove that from \(\square_{PA}(\ulcorner A\urcorner\) follows \(A\).

  • \(PA\) cannot prove its own consistency, in the sense that it cannot prove that the standard predicate is never satisfied by a contradiction.

If you want to get deeper in the rabbit hole, read about model theory and semantics or modal logic.


  • Mathematics

    Mathematics is the study of numbers and other ideal objects that can be described by axioms.