
2. Background: Intuitionistic Logic, Constructive Mathematics, and Type Theory

To use Lean, you don't need to subscribe to any particular approach to the foundations of mathematics. However, the design of the system becomes a lot easier to understand if you have had some exposure to the ideas of intuitionistic logic, constructive mathematics, and type theory. So let's touch briefly on these points.

2.1. Models of arithmetic

In arithmetic, we seek to determine the truth values of statements like

def one_add_zero := 1 + 0 = 1

-- The claim that for every natural number `x`,
-- there is a `y` that is larger.
def no_largest_nat := ∀ x : ℕ, ∃ y, x < y

-- Using the predicate that expresses that `x` is a prime...
def is_prime x := (x ≠ 1) ∧ ∀ y z, x = y*z → (y = 1) ∨ (z = 1)

-- ...we can state Goldbach's conjecture: For every `x`,
-- if `x>1` then there are natural numbers `y` and `z`
-- s.t. both are prime and `2x` is the sum of them.
def goldbach := ∀ x, x > 1 → ∃ y z, (is_prime y) ∧ (is_prime z) ∧ 2*x = y + z

Technically, these are first-order formulas involving natural numbers, addition, and multiplication. We're already using Lean notation that will be introduced systematically only later, but the claims should be parsable. You may have noted that we use → for implications, rather than the more common ⇒. That's no coincidence, as will become clear soon.

One would think that the first step of engaging with arithmetic is to agree on what the natural numbers and the operations + and * actually are. (In logical terms: to select a model). But this isn't the case! Instead, it is common practice to select a small number of axioms that any sensible model of the naturals will certainly fulfill; to then select some truth-preserving system for making logical inferences; and to work out the theorems that are provable by this logical system from the axioms.

The best-known choice is, of course, Peano arithmetic (PA). Its axioms include such sleeping pills as

  • (Symb1) zero is a natural number.

  • (Symb2) There is a "successor function" succ such that if x is a natural number, so is succ x.

  • (Eq1) zero ≠ succ x ("zero is not a successor")

  • (Eq2) succ x = succ y → x = y ("succ is injective")

(We'll meet the rest later).

PA is expressive enough to treat basically all of discrete math. However, it does not uniquely characterize the natural numbers! (That's necessarily so, as a consequence of Gödel's incompleteness theorem, together with -- you guessed it -- Gödel's completeness theorem. I know...) There are also fairly explicit constructions of monstrous non-standard models of PA, well worth some time on Wikipedia.

We will benefit from identifying a model of arithmetic that plays well with computerized proof assistants. So let's ask: what are the natural numbers?

A formal answer is "the von Neumann ordinals constructible in ZF set theory". But that's not very satisfactory for our purposes. It shifts the problem from "what are the natural numbers" to "what are sets"? Out of the frying pan, right into the fire.

Instead, let's adopt the philosophical stance of realism and posit that the statements of arithmetic express truths that are independent of any formal mathematical activity.

To test the level of acceptance of this mindset in the general population, I've asked my kid whether 2 + 1 = 3 was true before humans existed. "Sure. Two dinosaurs meet one dinosaur. Now you got three.", she said, thereby having rediscovered the Aristotelian realist point of view that arithmetic reflects the properties of discrete physical objects. (Also, she still can't believe that my job pays the bills).

That's a fine answer as far as it goes, but for computerized mathematics, we'll find it more advantageous to work with an "algorithmic model", introduced next.

2.1.1. Arithmetic from recursive functions

Let's say we have a system capable of simple symbolic calculations: Assume the system can represent the symbol zero, and can recursively apply a function succ to it. For example, succ applied to zero will be the symbol succ zero, and succ applied once more gives succ succ zero (we've dispensed with parentheses for function application).

Let's provisionally identify the natural number n with the symbolic expression where succ has been applied n times to zero.
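In Lean, such a symbolic system is precisely what an inductive type provides. As a sketch (this declaration is mine, anticipating machinery that will be introduced systematically later):

inductive ℕ where
  | zero : ℕ          -- the symbol zero is a natural number
  | succ : ℕ → ℕ      -- applying succ to a natural number gives a natural number
open ℕ                -- so we can write zero and succ without the ℕ. prefix

Note how the two constructors mirror (Symb1) and (Symb2). Lean even throws in (Eq1) and (Eq2) for free: the constructors of an inductive type are automatically distinct and injective.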

To see that this is a fruitful thing to do, add (huh!) the function plus defined by the two recursive rules

def plus : (n m : ℕ) → ℕ
  | n, zero => n
  | n, succ k => succ (plus n k)

You should read this as an algorithmic prescription, like so: To compute plus n m, distinguish the two cases.

  • Either m = zero. Then, return n.

  • Or m = succ k is a successor. Then, call plus with parameters n k and return the successor of the result.

Clearly, these two rules have full coverage, in that they tell the system exactly what to do in every situation. What is more, the recursion actually terminates, because the second argument gets decremented in every call, and will thus eventually reach zero.

Let's see the algorithm in action:

   plus (succ succ zero) (succ zero)
=> succ (plus (succ succ zero) zero) -- Rule "succ"
=> succ (succ succ zero)             -- Rule "zero"
=> succ succ succ zero

Two dinosaurs, meet one dinosaur!

Multiplication can be similarly realized by a recursively defined function.
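For concreteness, here is how that could look, following the same pattern (this sketch is mine; it encodes the rules n*0 = 0 and n*(k+1) = n*k + n):

def times : (n m : ℕ) → ℕ
  | n, zero   => zero                 -- n * 0 = 0
  | n, succ k => plus (times n k) n   -- n * (k+1) = n*k + n

Again, the two rules have full coverage, and the recursion terminates because the second argument shrinks with every call.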

The charm of this model is that it is defined in terms of finite objects: Namely by a few instructions that specify the algorithms for computing successors, addition, and multiplication. The assumption that these make sense for inputs of unbounded size is only a very mild idealization of what is physically implementable.

From now on, we will identify arithmetic objects with the recursive functions model just described.

2.1.2. Notes

  • You might object that doing arithmetic in what amounts to a unary representation of natural numbers is horribly inefficient. But we'll soon see that for the purpose of proving theorems, this is largely irrelevant.

  • If you think all this is banal, be advised that Lean's Mathlib models a significant portion of mathematics using not much more than the ingredients sketched here.


2.2. Constructive logic

We now have an algorithmic model of numbers, +, and ×. Let's see whether we can likewise find an algorithmic way to think about logic.

2.2.1. Fights

It all started with a big fight. Around the turn of the 20th century, the success of mathematical methods for making practical predictions was undeniable: They correctly described the motions of the planets, the trajectory of artillery shells, and the odds in gambling (among other worthy applications). But more esoteric ideas started to become popular. Hilbert gave an existence proof of a finite set of generators for certain rings of polynomials, without so much as hinting at how to find them. Cantor started cataloging different kinds of infinities in set theory. Contemporaries like Kronecker and Brouwer doubted that any of that was meaningful. These weren't polite disagreements. Careers, friendships, and people's sanity were on the line.

2.2.2. Brouwer, Heyting, and Kolmogorov

As a result of this controversy, Brouwer, Heyting, and Kolmogorov developed an approach to logic in which any existence proof of a mathematical object automatically comes with an algorithm for constructing it.

At least Brouwer, a vocal partisan in the fight, was motivated by a fundamental rejection of non-constructive methods. Today, emotions have cooled. Mainstream mathematics accepts non-constructive arguments as valid. However, the work of BHK is still of interest for the pragmatic reason that it lends itself well to the design of computerized proof verifiers, and indeed, Lean is built on these ideas.

Let's have a quick look, before treating Lean's implementation in greater detail.

  • To prove a conjunction p ∧ q of two propositions p, q, one has to provide a proof of p and a proof of q.

Computationally, one can think of a proof of p ∧ q as a data structure that holds the two individual proofs. OK, that's pretty unremarkable.

  • To prove a disjunction p ∨ q of two propositions p, q, one has to provide a proof of p or a proof of q.

In particular, one has to know which of the two cases is realized!

Contrast this with classical logic, where one can always add the disjunction p ∨ ¬p to the list of proven propositions. This deduction rule is known as the law of the excluded middle (LEM). (Brouwer would not have admitted that p ∨ ¬p is true or even meaningful in general. Mainstream mathematics disagrees with this fundamentalist position, but does concede that the LEM is not a constructive deduction rule).

  • To prove an implication p → q, one has to supply an algorithm that turns any proof of p into a proof of q.

Let me repeat: A proof of a theorem is an algorithm that turns its assumptions into its conclusions. I find this a pretty eye-opening perspective! In fact, proofs of implications and functions are essentially identical concepts in Lean. (Hence the use of "→" for implications, a notation more commonly associated with functions).

  • To establish a universal quantification ∀ x, p x for some predicate p, one has to provide an algorithm that takes x as its argument and returns a proof of p x.

From this point of view, a proof of an implication is just a special case of a proof of a universal quantification. Both are computable functions that map their arguments to proofs -- just that in the former case, the argument is required to be a proof of the antecedent, while in the latter case, it can range over an arbitrary domain. (It will then come as little surprise that in Lean the symbol "∀" is just fancy notation for declaring a function type).

  • To prove a claim of existence ∃ x, p x for some predicate p, one has to provide a concrete element x together with a proof of p x.

No surprise here.

  • Proofs of negations ¬p are quite subtle. First, one chooses some proposition that manifestly has no proof. Call it False. Then, to prove ¬p, one has to show the implication p → False.

In other words: To prove ¬p, one has to supply a computable function that turns any proof of p into a proof of an unprovable proposition. (In Lean, it is syntactically impossible to express a proof of False, George Orwell-style).

Under this interpretation of "negation", a proof of ¬¬p cannot in general be turned into a proof of p. In classical logic, ¬¬p → p is a tautology, called double negation elimination (DNE). In fact, DNE is equivalent to LEM (a cute exercise in Lean).

One can define a proof by contradiction as a deduction that relies on DNE. In this narrow sense, proofs by contradiction are not constructively valid. However, with a bit of practice, you'll find that many arguments that are billed as being "by contradiction" do not actually involve DNE. For example, the classic proof that √2 is not rational can be read as proving ∀ r s, ¬(√2=r/s). (Which, if you paid attention, you'll recognize as an algorithm that turns any pair r, s of natural numbers and a proof of the proposition √2=r/s into a proof of False).
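To preview how these readings look in Lean syntax, here are minimal examples for each clause (the statements are mine, chosen purely for simplicity, using Lean's built-in Nat; all of them type-check):

example (p q : Prop) (hp : p) (hq : q) : p ∧ q := And.intro hp hq  -- a pair of proofs
example (p q : Prop) (hp : p) : p ∨ q := Or.inl hp                 -- records which case holds
example (p q : Prop) (hpq : p → q) (hp : p) : q := hpq hp          -- an implication is a function
example : ∀ n : Nat, n + 0 = n := fun n => rfl                     -- a function from n to a proof
example : ∃ n : Nat, n + 1 = 3 := ⟨2, rfl⟩                         -- a witness plus a proof
example (p : Prop) (hp : p) : ¬¬p := fun hnp => hnp hp             -- p → ¬¬p is constructive...
-- ...but the converse ¬¬p → p (DNE) is not derivable without classical axioms.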

2.2.3. Notes

  • The deductive system described by the BHK interpretation is known as intuitionistic logic. That's not a very fitting name. The intuitionists weren't a bunch of hippies who advocated proof-by-vibes. Quite the opposite: the intuitionistically valid inference rules are a conservative subset of the classical ones.

  • Strictly speaking, intuitionistic logic is one particular type of constructive logic. But my impression is that the term "intuitionism" carries philosophical connotations, while "constructivism" is more associated with a pragmatic point of view. (Maybe that's just me).

2.2.4. Case study: Induction

Let's see how we can use these concepts to explain the proof method of mathematical induction from a computational point of view.

Take a predicate p on the natural numbers, i.e. a function that associates with every n a proposition p n. Assume that someone gives us

  • a proof hb of p 0 ("base case"),

  • for every natural number k, a proof hi k of the implication p k → p (succ k) ("induction step").

(It's a Lean convention to use hp for an externally supplied proof of a proposition p. Think "hypothesis".)

I claim that we can now compute a proof p_for_all of the proposition ∀ n, p n. By the definition of what it means to prove a universal quantification, we need to construct for every n a proof of p n.

This is achieved by the following recursive function:

def p_for_all : ∀ n, p n := fun n =>
  match n with
  | zero => hb
  | succ k => (hi k) (p_for_all k)

Let's deconstruct this: p_for_all takes a natural number as an argument. By definition, it is either zero or a successor.

  • In the first case, return the proof hb of the base case.

  • For the second case, the system will first invoke p_for_all k, which returns a proof of p k. Now recall that by the definition of implication, hi k is a function converting proofs of p k to proofs of p (succ k). Thus applying (hi k) to (p_for_all k) gives a proof of p (succ k), i.e. of p n, as claimed. The recursion terminates, because the argument decreases with every call to p_for_all.
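Mirroring the earlier trace for plus, here is how p_for_all unwinds for n = succ succ zero (my own illustration):

   p_for_all (succ succ zero)
=> (hi (succ zero)) (p_for_all (succ zero))       -- Case "succ"
=> (hi (succ zero)) ((hi zero) (p_for_all zero))  -- Case "succ"
=> (hi (succ zero)) ((hi zero) hb)                -- Case "zero"

The last line is a proof of p (succ succ zero), assembled from the base case and two applications of the induction step.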

I dare you to tell me that isn't cool!!

In this sense, there isn't a fundamental difference between the "principle of mathematical induction" and the way addition and multiplication have been defined. In the algorithmic model, both reduce to explicit recursive computations. The only difference is that the functions return natural numbers in the latter case, but proofs in the former case.
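To see this in actual Lean, here is a concrete instance with Lean's built-in naturals, taking p n to be 0 + n = n (the name zero_add' is mine; this equation is a genuine theorem, since Lean's addition recurses on its second argument):

def zero_add' : ∀ n : Nat, 0 + n = n
  | 0     => rfl                              -- base case: 0 + 0 = 0 holds by computation
  | k + 1 => congrArg Nat.succ (zero_add' k)  -- step: a proof of 0 + k = k yields
                                              -- one of 0 + (k+1) = k+1

The shape is exactly that of p_for_all: the first arm plays the role of hb, and congrArg Nat.succ plays the role of hi k.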

Because mathematical induction is backed by executable code, it is not treated as an "axiom" in the Lean system, the same way as the existence of the successor function or addition isn't an axiom. (Though I guess it is an axiom of Lean's deductive system in the terminology of mathematical logic).

2.3. Type theory

The computational interpretation of proofs is cool and all, but also has some demerits.

  • To build on a result, we clearly need to know whether a proposition has a proof, but the details of the proof are typically irrelevant.

  • The intuitionists were distrustful of deductive principles like the LEM. However, in practice, it makes some proofs simpler, and seems to never have caused serious trouble.

  • The same goes for non-logical non-constructive axioms, like the axiom of choice.

Proof assistants solve all these problems in one fell swoop. By using type theory.

2.3.1. Wait, what?

If, like me, you know the term only from the occasional use of low-level programming languages, this sounds like a bad joke.

Many languages require you to specify the type of a variable when declaring it. Like char for a character, or float for a floating point number. The compiler then knows how many bytes of memory to put aside for its values, and how to interpret the stored bit patterns. Against this background, the term "type theory" itself sounds overly pompous. "Float takes more space than char"--What else is there to say?

Now in computing, there is the well-known phenomenon of universality, which says that all systems of a certain complexity tend to be computationally equivalent. This has spawned an entire genre of nerdy humor: search for "accidentally Turing complete" on reddit if you don't care for your productivity. Like, there's people running first-person shooters inside of PDF documents, that kinda thing. So I do not doubt that one can misdesign a type system in such a horrible way that type judgments entail the Riemann hypothesis (C++ asks for its beer to be held). But of course, these stunts are funny precisely because it is a very bad idea to actually do that!!

Little did I know.


2.3.2. Propositions as types

(If my initial confusion is any guide, this will take some time to get used to).

The first step is to associate with every mathematical proposition a type: The type of its proofs. (Right out of the gate, that's a ridiculously more fine-grained type system than the "int, char, float, I-guess-that's-it lol" I remembered about the C language, partly explaining my confusion).

In fact, Lean does not distinguish between a proposition and the type of its proofs. So 1+0=1 and 1+1=1 are types (remember that a proposition doesn't have to be true, just well-formed). Call a type inhabited if one can construct objects of that type, and empty otherwise. Thus, in type-theory jargon, we don't say that "1+0=1 is provable", we say that "the type 1+0=1 is inhabited", and a proof would be a computation that produces an object of that type.
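In Lean syntax (rfl is the standard proof of an equation that holds by computation):

example : 1 + 0 = 1 := rfl    -- the type 1 + 0 = 1 is inhabited: rfl constructs an object
-- example : 1 + 1 = 1 := rfl -- rejected by the type checker: this type is empty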

2.3.3. Static analysis

The reason for all these confusing re-definitions of existing terms becomes clear if one considers the most basic benefit of a type system: static analysis.

Even in plain old primitive C, if one writes int x = 3.4, the type checker will complain (at least with warnings enabled) that one is trying to assign a floating-point number to an integer. "Static" refers to the fact that this inconsistency is caught at compile time: There is no need to actually run the program. Call a program well-typed if it survives static type checks.

Now consider the induction example discussed above. You can re-read our analysis to conclude that the function p_for_all is well-typed (just replace the words "proof of" with "object of type").

Now comes a very important point:

The code being well-typed means the following: if you have an object of type p 0 (i.e. a proof of the base case), and objects of type p k → p (k+1) (i.e. proofs of the induction steps), then you know from the static analysis alone that if you ran p_for_all n, you'd get an object of type p n (i.e., for arbitrary n, a proof of the predicate). If that existential statement is enough for you (as is the case in classical mathematics), you don't actually have to ever invoke p_for_all, with its computationally expensive recursions!
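In Lean terms, reusing the zero_add' sketch from the induction case study: the #check command reports the type (i.e. the theorem) from static analysis alone, without evaluating anything:

#check @zero_add'   -- @zero_add' : ∀ (n : Nat), 0 + n = n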

This observation lies at the heart of a surprising and elegant dictionary linking propositions and types. It is known as the Curry-Howard correspondence.

2.3.4. Axioms

As a side effect, we can now incorporate non-constructive deduction rules and axioms.

Imagine you want to prove deep_theorem. Along the way, you need the statement p∨¬p for some auxiliary proposition p. Let's say you don't know how to prove either p or ¬p (maybe you're David Hilbert or something... or maybe it isn't even your fault, because p might be neither provable nor refutable in the logic you're using). But you're no Brouwer either, so you feel pretty confident that the LEM is a sound assumption to make, at least for p.

In this case, you can write the proof of deep_theorem as a function that expects an input hlem of type p∨¬p. If that function survives static type analysis, then you have proven deep_theorem according to the rules of classical logic!
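As a sketch of the mechanics (the statement stands in for deep_theorem; hlem is the explicitly assumed instance of the LEM):

example (p : Prop) (hlem : p ∨ ¬p) (hnn : ¬¬p) : p :=
  match hlem with
  | Or.inl hp  => hp              -- LEM handed us a proof of p: done
  | Or.inr hnp => absurd hnp hnn  -- ¬p contradicts the assumption ¬¬p

This type-checks, so ¬¬p → p has been established for p according to the rules of classical logic.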

Of course, what you don't have is a constructive proof: There's no way to execute the function, because it depends on an argument that you don't know how to construct.

In summary, "trust me, bro" cannot replace executable code, but it can work as a type judgment. And sometimes, you want the system to trust you.

2.3.5. Notes

  • We can now understand why the inefficient unary coding of natural numbers is of no big concern. If all we want to do is prove theorems, no program actually ever gets executed.