blog-static/content/blog/00_aoc_coq.md

16 KiB

title date tags series favorite
Advent of Code in Coq - Day 1 2020-12-02T18:44:56-08:00
Advent of Code
Coq
Advent of Code in Coq true

The first puzzle of this year's Advent of Code was quite simple, which gave me a thought: "Hey, this feels within reach for me to formally verify!" At first, I wanted to formalize and prove the correctness of the two-pointer solution. However, I didn't have the time to mess around with the various properties of sorted lists and their traversals. So, I settled for the brute force solution. Despite the simplicity of its implementation, there is plenty to talk about when proving its correctness using Coq. Let's get right into it!

Before we start, in the interest of keeping the post self-contained, here's the (paraphrased) problem statement:

Given an unsorted list of numbers, find two distinct numbers that add up to 2020.

With this in mind, we can move on to writing some Coq!

Defining the Functions

The first step to proving our code correct is to actually write the code! To start with, let's write a helper function that, given a number x, tries to find another number y such that x + y = 2020. In fact, rather than hardcoding the desired sum to 2020, let's just use another argument called total. The code is quite simple:

{{< codelines "Coq" "aoc-2020/day1.v" 11 18 >}}

Here, is is the list of numbers that we want to search. We proceed by case analysis: if the list is empty, we can't find a match, so we return None (the Coq equivalent of Haskell's Nothing). On the other hand, if the list has at least one element y, we see if it adds up to total, and return Some y (equivalent to Just y in Haskell) if it does. If it doesn't, we continue our search into the rest of the list.

It's somewhat unusual, in my experience, to put the list argument first when writing functions in a language with currying. However, it seems as though Coq's simpl tactic, which we will use later, works better for our purposes when the argument being case analyzed is given first.

We can now use find_matching to define our find_sum function, which solves part 1. Here's the code:

{{< codelines "Coq" "aoc-2020/day1.v" 20 28 >}}

For every x that we encounter in our input list is, we want to check if there's a matching number in the rest of the list. We only search the remainder of the list because we can't use x twice: the x and y we return that add up to total must be different elements. We use find_matching to try find a complementary number for x. If we don't find it, this x isn't it, so we recursively move on to xs. On the other hand, if we do find a matching y, we're done! We return (x,y), wrapped in Some to indicate that we found something useful.

What about that (* Was buggy! *) line? Well, it so happens that my initial implementation had a bug on this line, one that came up as I was proving the correctness of my function. When I wasn't able to prove a particular behavior in one of the cases, I realized something was wrong. In short, my proof actually helped me find and fix a bug!

This is all the code we'll need to get our solution. Next, let's talk about some properties of our two functions.

Our First Lemma

When we call find_matching, we want to be sure that if we get a number, it does indeed add up to our expected total. We can state it a little bit more formally as follows:

For any numbers k and x, and for any list of number is, if find_matching is k x returns a number y, then x + y = k.

And this is how we write it in Coq:

{{< codelines "Coq" "aoc-2020/day1.v" 30 31 >}}

The arrow, ->, reads "implies". Other than that, I think this property reads pretty well. The proof, unfortunately, is a little bit more involved. Here are the first few lines:

{{< codelines "Coq" "aoc-2020/day1.v" 32 35 >}}

We start with the intros is tactic, which is akin to saying "consider a particular list of integers is". We do this without losing generality: by simply examining a concrete list, we've said nothing about what that list is like. We then proceed by induction on is.

To prove something by induction for a list, we need to prove two things:

  • The base case. Whatever property we want to hold, it must hold for the empty list, which is the simplest possible list. In our case, this means find_matching searching an empty list.
  • The inductive case. Assuming that a property holds for any list [b, c, ...], we want to show that the property also holds for the list [a, b, c, ...]. That is, the property must remain true if we prepend an element to a list for which this property holds.

These two things combined give us a proof for all lists, which is exactly what we want! If you don't belive me, here's how it works. Suppose you want to prove that some property P holds for [1,2,3,4]. Given the base case, we know that P [] holds. Next, by the inductive case, since P [] holds, we can prepend 4 to the list, and the property will still hold. Thus, P [4]. Now that P [4] holds, we can again prepend an element to the list, this time a 3, and conclude that P [3,4]. Repeating this twice more, we arrive at our desired fact: P [1,2,3,4].

When we write induction is, Coq will generate two proof goals for us, one for the base case, and one for the inductive case. We will have to prove each of them separately. Since we have not yet introduced the variables k, x, and y, they remain inside a forall quantifier at that time. To be able to refer to them, we want to use intros. We want to do this in both the base and the inductive case. To quickly do this, we use Coq's ; operator. When we write a; b, Coq runs the tactic a, and then runs the tactic b in every proof goal generated by a. This is exactly what we want.

There's one more variable inside our second intros: Hev. This variable refers to the hypothesis of our statement: that is, the part on the left of the ->. To prove that A implies B, we assume that A holds, and try to argue B from there. Here is no different: when we use intros Hev, we say, "suppose that you have a proof that find_matching evaluates to Some y, called Hev". The thing on the right of -> becomes our proof goal.

Now, it's time to look at the cases. To focus on one case at a time, we use -. The first case is our base case. Here's what Coq prints out at this time:

k, x, y : nat
Hev : find_matching nil k x = Some y

========================= (1 / 1)

x + y = k

All the stuff above the === line are our hypotheses. We know that we have some k, x, and y, all of which are numbers. We also have the assumption that find_matching returns Some y. In the base case, is is just [], and this is reflected in the type for Hev. To make this more clear, we'll simplify the call to find_matching in Hev, using simpl in Hev. Now, here's what Coq has to say about Hev:

Hev : None = Some y

Well, this doesn't make any sense. How can something be equal to nothing? We ask Coq this question using inversion Hev. Effectively, the question that inversion asks is: what are the possible ways we could have acquired Hev? Coq generates a proof goal for each of these possible ways. Alas, there are no ways to arrive at this contradictory assumption: the number of proof sub-goals is zero. This means we're done with the base case!

The inductive case is the meat of this proof. Here's the corresponding part of the Coq source file:

{{< codelines "Coq" "aoc-2020/day1.v" 36 40 >}}

This time, the proof state is more complicated:

a : nat
is : list nat
IHis : forall k x y : nat, find_matching is k x = Some y -> x + y = k
k, x, y : nat
Hev : find_matching (a :: is) k x = Some y

========================= (1 / 1)

x + y = k

Following the footsteps of our informal description of the inductive case, Coq has us prove our property for (a :: is), or the list is to which a is being prepended. Like before, we assume that our property holds for is. This is represented in the induction hypothesis IHis. It states that if find_matching finds a y in is, it must add up to k. However, IHis doesn't tell us anything about a :: is: that's our job. We also still have Hev, which is our assumption that find_matching finds a y in (a :: is). Running simpl in Hev gives us the following:

Hev : (if x + a =? k then Some a else find_matching is k x) = Some y

The result of find_matching now depends on whether or not the new element a adds up to k. If it does, then find_matching will return a, which means that y is the same as a. If not, it must be that find_matching finds the y in the rest of the list, is. We're not sure which of the possibilities is the case. Fortunately, we don't need to be! If we can prove that the y that find_matching finds is correct regardless of whether a adds up to k or not, we're good to go! To do this, we perform case analysis using destruct.

Our particular use of destruct says: check any possible value for x + a ?= k, and create an equation Heq that tells us what that value is. ?= returns a boolean value, and so destruct generates two new goals: one where the function returns true, and one where it returns false. We start with the former. Here's the proof state:

a : nat
is : list nat
IHis : forall k x y : nat, find_matching is k x = Some y -> x + y = k
k, x, y : nat
Heq : (x + a =? k) = true
Hev : Some a = Some y

========================= (1 / 1)

x + y = k

There is a new hypothesis: Heq. It tells us that we're currently considering the case where ?= evaluates to true. Also, Hev has been considerably simplified: now that we know the condition of the if expression, we can just replace it with the then branch.

Looking at Hev, we can see that our prediction was right: a is equal to y. After all, if they weren't, Some a wouldn't equal to Some y. To make Coq take this information into account, we use injection. This will create a new hypothesis, a = y. But if one is equal to the other, why don't we just use only one of these variables everywhere? We do exactly that by using subst, which replaces a with y everywhere in our proof.

The proof state is now:

is : list nat
IHis : forall k x y : nat, find_matching is k x = Some y -> x + y = k
k, x, y : nat
Heq : (x + y =? k) = true

========================= (1 / 1)

x + y = k

We're close, but there's one more detail to keep in mind. Our goal, x + y = k, is the proposition that x + y is equal to k. However, Heq tells us that the function ?= evaluates to true. These are fundamentally different. One talks about mathematical equality, while the other about some function ?= defined somewhere in Coq's standard library. Who knows - maybe there's a bug in Coq's implementation! Fortunately, Coq comes with a proof that if two numbers are equal according to ?=, they are mathematically equal. This proof is called eqb_nat_eq. We tell Coq to use this with apply. Our proof goal changes to:

true = (x + y =? k)

This is almost like Heq, but flipped. Instead of manually flipping it and using apply with Heq, I let Coq do the rest of the work using auto.

Phew! All this for the true case of ?=. Next, what happens if x + a does not equal k? Here's the proof state at this time:

a : nat
is : list nat
IHis : forall k x y : nat, find_matching is k x = Some y -> x + y = k
k, x, y : nat
Heq : (x + a =? k) = false
Hev : find_matching is k x = Some y

========================= (1 / 1)

x + y = k

Since a was not what it was looking for, find_matching moved on to is. But hey, we're in the inductive case! We are assuming that find_matching will work properly with the list is. Since find_matching found its y in is, this should be all we need! We use our induction hypothesis IHis with apply. IHis itself does not know that find_matching moved on to is, so it asks us to prove it. Fortunately, Hev tells us exactly that, so we use assumption, and the proof is complete! Quod erat demonstrandum, QED!

The Rest of the Owl

Here are a couple of other properties of find_matching. For brevity's sake, I will not go through their proofs step-by-step. I find that the best way to understand Coq proofs is to actually step through them in the IDE!

First on the list is find_matching_skip. Here's the type:

{{< codelines "Coq" "aoc-2020/day1.v" 42 43 >}}

It reads: if we correctly find a number in a small list is, we can find that same number even if another number is prepended to is. That makes sense: adding a number to a list doesn't remove whatever we found in it! I used this lemma to prove another, find_matching_works:

{{< codelines "Coq" "aoc-2020/day1.v" 53 54 >}}

This reads, if there is an element y in is that adds up to k with x, then find_matching will find it. This is an important property. After all, if it didn't hold, it would mean that find_matching would occasionally fail to find a matching number, even though it's there! We can't have that.

Finally, we want to specify what it means for find_sum, our solution function, to actually work. The naive definition would be:

Given a list of integers, find_sum always finds a pair of numbers that add up to k.

Unfortunately, this is not true. What if, for instance, we give find_sum an empty list? There are no numbers from that list to find and add together. Even a non-empty list may not include such a pair! We need a way to characterize valid input lists. I claim that all lists from this Advent of Code puzzle are guaranteed to have two numbers that add up to our goal, and that these numbers are not equal to each other. In Coq, we state this as follows:

{{< codelines "Coq" "aoc-2020/day1.v" 8 9 >}}

This defines a new property, has_pair t is (read "is has a pair of numbers that add to t"), which means:

There are two numbers n1 and n2 such that, they are not equal to each other (n1<>n2) and the number n1 is an element of is (In n1 is) and the number n2 is an element of is (In n2 is) and the two numbers add up to t (n1 + n2 = t).

When making claims about the correctness of our algorithm, we will assume that this property holds. Finally, here's the theorem we want to prove:

{{< codelines "Coq" "aoc-2020/day1.v" 68 70 >}}

It reads, "for any total k and list is, if is has a pair of numbers that add to k, then find_sum will return a pair of numbers x and y that add to k". There's some nuance here. We hardly reference the has_pair property in this definition, and for good reason. Our has_pair hypothesis only says that there is at least one pair of numbers in is that meets our criteria. However, this pair need not be the only one, nor does it need to be the one returned by find_sum! However, if we have many pairs, we want to confirm that find_sum will find one of them. Finally, here is the proof. I will not be able to go through it in detail in this post, but I did comment it to make it easier to read:

{{< codelines "Coq" "aoc-2020/day1.v" 71 106 >}}

Coq seems happy with it, and so am I! The bug I mentioned earlier popped up on line 96. I had accidentally made find_sum return None if it couldn't find a complement for the x it encountered. This meant that it never recursed into the remaining list xs, and thus, the pair was never found at all! It this became impossible to prove that find_some will return Some y, and I had to double back and check my definitions.

I hope you enjoyed this post! If you're interested to learn more about Coq, I strongly recommend checking out Software Foundations, a series of books on Coq written as comments in a Coq source file! In particular, check out Logical Foundations for an introduction to using Coq. Thanks for reading!