blog-static/content/blog/00_aoc_coq.md

352 lines
16 KiB
Markdown
Raw Normal View History

2020-12-02 01:14:32 -08:00
---
title: "Advent of Code in Coq - Day 1"
2020-12-02 18:45:28 -08:00
date: 2020-12-02T18:44:56-08:00
2020-12-02 01:14:32 -08:00
tags: ["Advent of Code", "Coq"]
2020-12-30 14:27:22 -08:00
favorite: true
2020-12-02 01:14:32 -08:00
---
The first puzzle of this year's [Advent of Code](https://adventofcode.com) was quite
simple, which gave me a thought: "Hey, this feels within reach for me to formally verify!"
At first, I wanted to formalize and prove the correctness of the [two-pointer solution](https://www.geeksforgeeks.org/two-pointers-technique/).
However, I didn't have the time to mess around with the various properties of sorted
lists and their traversals. So, I settled for the brute force solution. Despite
the simplicity of its implementation, there is plenty to talk about when proving
its correctness using Coq. Let's get right into it!
Before we start, in the interest of keeping the post self-contained, here's the (paraphrased)
problem statement:
> Given an unsorted list of numbers, find two distinct numbers that add up to 2020.
With this in mind, we can move on to writing some Coq!
### Defining the Functions
2020-12-02 18:45:28 -08:00
The first step to proving our code correct is to actually write the code! To start with,
let's write a helper function that, given a number `x`, tries to find another number
`y` such that `x + y = 2020`. In fact, rather than hardcoding the desired
sum to `2020`, let's just use another argument called `total`. The code is quite simple:
2020-12-02 01:14:32 -08:00
{{< codelines "Coq" "aoc-2020/day1.v" 11 18 >}}
2020-12-02 01:14:32 -08:00
2020-12-02 18:45:28 -08:00
Here, `is` is the list of numbers that we want to search.
We proceed by case analysis: if the list is empty, we can't
2020-12-02 01:14:32 -08:00
find a match, so we return `None` (the Coq equivalent of Haskell's `Nothing`).
2020-12-02 18:45:28 -08:00
On the other hand, if the list has at least one element `y`, we see if it adds
up to `total`, and return `Some y` (equivalent to `Just y` in Haskell) if it does.
If it doesn't, we continue our search into the rest of the list.
2020-12-02 01:14:32 -08:00
It's somewhat unusual, in my experience, to put the list argument first when writing
functions in a language with [currying](https://wiki.haskell.org/Currying). However,
it seems as though Coq's `simpl` tactic, which we will use later, works better
for our purposes when the argument being case analyzed is given first.
We can now use `find_matching` to define our `find_sum` function, which solves part 1.
Here's the code:
{{< codelines "Coq" "aoc-2020/day1.v" 20 28 >}}
2020-12-02 01:14:32 -08:00
For every `x` that we encounter in our input list `is`, we want to check if there's
a matching number in the rest of the list. We only search the remainder of the list
because we can't use `x` twice: the `x` and `y` we return that add up to `total`
must be different elements. We use `find_matching` to try find a complementary number
for `x`. If we don't find it, this `x` isn't it, so we recursively move on to `xs`.
On the other hand, if we _do_ find a matching `y`, we're done! We return `(x,y)`,
wrapped in `Some` to indicate that we found something useful.
What about that `(* Was buggy! *)` line? Well, it so happens that my initial
implementation had a bug on this line, one that came up as I was proving
the correctness of my function. When I wasn't able to prove a particular
behavior in one of the cases, I realized something was wrong. In short,
my proof actually helped me find and fix a bug!
This is all the code we'll need to get our solution. Next, let's talk about some
properties of our two functions.
### Our First Lemma
When we call `find_matching`, we want to be sure that if we get a number,
it does indeed add up to our expected total. We can state it a little bit more
formally as follows:
> For any numbers `k` and `x`, and for any list of number `is`,
> if `find_matching is k x` returns a number `y`, then `x + y = k`.
And this is how we write it in Coq:
{{< codelines "Coq" "aoc-2020/day1.v" 30 31 >}}
2020-12-02 01:14:32 -08:00
The arrow, `->`, reads "implies". Other than that, I think this
property reads pretty well. The proof, unfortunately, is a little bit more involved.
Here are the first few lines:
{{< codelines "Coq" "aoc-2020/day1.v" 32 35 >}}
2020-12-02 01:14:32 -08:00
We start with the `intros is` tactic, which is akin to saying
"consider a particular list of integers `is`". We do this without losing
generality: by simply examining a concrete list, we've said nothing about
what that list is like. We then proceed by induction on `is`.
To prove something by induction for a list, we need to prove two things:
* The __base case__. Whatever property we want to hold, it must
hold for the empty list, which is the simplest possible list.
In our case, this means `find_matching` searching an empty list.
* The __inductive case__. Assuming that a property holds for any list
`[b, c, ...]`, we want to show that the property also holds for
2020-12-02 18:45:28 -08:00
the list `[a, b, c, ...]`. That is, the property must remain true if we
2020-12-02 01:14:32 -08:00
prepend an element to a list for which this property holds.
These two things combined give us a proof for _all_ lists, which is exactly
what we want! If you don't belive me, here's how it works. Suppose you want
to prove that some property `P` holds for `[1,2,3,4]`. Given the base
case, we know that `P []` holds. Next, by the inductive case, since
`P []` holds, we can prepend `4` to the list, and the property will
still hold. Thus, `P [4]`. Now that `P [4]` holds, we can again prepend
an element to the list, this time a `3`, and conclude that `P [3,4]`.
Repeating this twice more, we arrive at our desired fact: `P [1,2,3,4]`.
When we write `induction is`, Coq will generate two proof goals for us,
2020-12-02 18:45:28 -08:00
one for the base case, and one for the inductive case. We will have to prove
each of them separately. Since we have
2020-12-02 01:14:32 -08:00
not yet introduced the variables `k`, `x`, and `y`, they remain
inside a `forall` quantifier at that time. To be able to refer
to them, we want to use `intros`. We want to do this in both the
base and the inductive case. To quickly do this, we use Coq's `;`
operator. When we write `a; b`, Coq runs the tactic `a`, and then
runs the tactic `b` in every proof goal generated by `a`. This is
exactly what we want.
There's one more variable inside our second `intros`: `Hev`.
This variable refers to the hypothesis of our statement:
that is, the part on the left of the `->`. To prove that `A`
implies `B`, we assume that `A` holds, and try to argue `B` from there.
Here is no different: when we use `intros Hev`, we say, "suppose that you have
a proof that `find_matching` evaluates to `Some y`, called `Hev`". The thing
on the right of `->` becomes our proof goal.
Now, it's time to look at the cases. To focus on one case at a time,
we use `-`. The first case is our base case. Here's what Coq prints
out at this time:
```
k, x, y : nat
Hev : find_matching nil k x = Some y
========================= (1 / 1)
x + y = k
```
All the stuff above the `===` line are our hypotheses. We know
that we have some `k`, `x`, and `y`, all of which are numbers.
We also have the assumption that `find_matching` returns `Some y`.
In the base case, `is` is just `[]`, and this is reflected in the
type for `Hev`. To make this more clear, we'll simplify the call to `find_matching`
in `Hev`, using `simpl in Hev`. Now, here's what Coq has to say about `Hev`:
```
Hev : None = Some y
```
Well, this doesn't make any sense. How can something be equal to nothing?
2020-12-02 18:45:28 -08:00
We ask Coq this question using `inversion Hev`. Effectively, the question
that `inversion` asks is: what are the possible ways we could have acquired `Hev`?
2020-12-02 01:14:32 -08:00
Coq generates a proof goal for each of these possible ways. Alas, there are
no ways to arrive at this contradictory assumption: the number of proof sub-goals
is zero. This means we're done with the base case!
The inductive case is the meat of this proof. Here's the corresponding part
2020-12-02 18:45:28 -08:00
of the Coq source file:
2020-12-02 01:14:32 -08:00
{{< codelines "Coq" "aoc-2020/day1.v" 36 40 >}}
2020-12-02 01:14:32 -08:00
This time, the proof state is more complicated:
```
a : nat
is : list nat
IHis : forall k x y : nat, find_matching is k x = Some y -> x + y = k
k, x, y : nat
Hev : find_matching (a :: is) k x = Some y
========================= (1 / 1)
x + y = k
```
Following the footsteps of our informal description of the inductive case,
Coq has us prove our property for `(a :: is)`, or the list `is` to which
`a` is being prepended. Like before, we assume that our property holds for `is`.
2020-12-02 18:45:28 -08:00
This is represented in the __induction hypothesis__ `IHis`. It states that if
2020-12-02 01:14:32 -08:00
`find_matching` finds a `y` in `is`, it must add up to `k`. However, `IHis`
doesn't tell us anything about `a :: is`: that's our job. We also still have
`Hev`, which is our assumption that `find_matching` finds a `y` in `(a :: is)`.
Running `simpl in Hev` gives us the following:
```
Hev : (if x + a =? k then Some a else find_matching is k x) = Some y
```
The result of `find_matching` now depends on whether or not the new element `a`
2020-12-02 18:45:28 -08:00
adds up to `k`. If it does, then `find_matching` will return `a`, which means
that `y` is the same as `a`. If not, it must be that `find_matching` finds
the `y` in the rest of the list, `is`. We're not sure which of the possibilities
is the case. Fortunately, we don't need to be!
2020-12-02 01:14:32 -08:00
If we can prove that the `y` that `find_matching` finds is correct regardless
of whether `a` adds up to `k` or not, we're good to go! To do this,
we perform case analysis using `destruct`.
2020-12-02 18:45:28 -08:00
Our particular use of `destruct` says: check any possible value for `x + a ?= k`,
and create an equation `Heq` that tells us what that value is. `?=` returns a boolean
2020-12-02 01:14:32 -08:00
value, and so `destruct` generates two new goals: one where the function returns `true`,
and one where it returns `false`. We start with the former. Here's the proof state:
```
a : nat
is : list nat
IHis : forall k x y : nat, find_matching is k x = Some y -> x + y = k
k, x, y : nat
Heq : (x + a =? k) = true
Hev : Some a = Some y
========================= (1 / 1)
x + y = k
```
There is a new hypothesis: `Heq`. It tells us that we're currently
2020-12-02 18:45:28 -08:00
considering the case where `?=` evaluates to `true`. Also,
2020-12-02 01:14:32 -08:00
`Hev` has been considerably simplified: now that we know the condition
of the `if` expression, we can just replace it with the `then` branch.
2020-12-02 18:45:28 -08:00
Looking at `Hev`, we can see that our prediction was right: `a` is equal to `y`. After all,
2020-12-02 01:14:32 -08:00
if they weren't, `Some a` wouldn't equal to `Some y`. To make Coq
take this information into account, we use `injection`. This will create
a new hypothesis, `a = y`. But if one is equal to the other, why don't we
just use only one of these variables everywhere? We do exactly that by using
`subst`, which replaces `a` with `y` everywhere in our proof.
The proof state is now:
```
is : list nat
IHis : forall k x y : nat, find_matching is k x = Some y -> x + y = k
k, x, y : nat
Heq : (x + y =? k) = true
========================= (1 / 1)
x + y = k
```
We're close, but there's one more detail to keep in mind. Our goal, `x + y = k`,
is the __proposition__ that `x + y` is equal to `k`. However, `Heq` tells us
2020-12-02 18:45:28 -08:00
that the __function__ `?=` evaluates to `true`. These are fundamentally different.
One talks about mathematical equality, while the other about some function `?=`
2020-12-02 01:14:32 -08:00
defined somewhere in Coq's standard library. Who knows - maybe there's a bug in
2020-12-02 18:45:28 -08:00
Coq's implementation! Fortunately, Coq comes with a proof that if two numbers
are equal according to `?=`, they are mathematically equal. This proof is
2020-12-02 01:14:32 -08:00
called `eqb_nat_eq`. We tell Coq to use this with `apply`. Our proof goal changes to:
```
true = (x + y =? k)
```
This is _almost_ like `Heq`, but flipped. Instead of manually flipping it and using `apply`
2020-12-02 18:45:28 -08:00
with `Heq`, I let Coq do the rest of the work using `auto`.
2020-12-02 01:14:32 -08:00
2020-12-02 18:45:28 -08:00
Phew! All this for the `true` case of `?=`. Next, what happens if `x + a` does not equal `k`?
2020-12-02 01:14:32 -08:00
Here's the proof state at this time:
```
a : nat
is : list nat
IHis : forall k x y : nat, find_matching is k x = Some y -> x + y = k
k, x, y : nat
Heq : (x + a =? k) = false
Hev : find_matching is k x = Some y
========================= (1 / 1)
x + y = k
```
Since `a` was not what it was looking for, `find_matching` moved on to `is`. But hey,
we're in the inductive case! We are assuming that `find_matching` will work properly
with the list `is`. Since `find_matching` found its `y` in `is`, this should be all we need!
We use our induction hypothesis `IHis` with `apply`. `IHis` itself does not know that
`find_matching` moved on to `is`, so it asks us to prove it. Fortunately, `Hev` tells us
exactly that, so we use `assumption`, and the proof is complete! Quod erat demonstrandum, QED!
### The Rest of the Owl
Here are a couple of other properties of `find_matching`. For brevity's sake, I will
not go through their proofs step-by-step. I find that the best way to understand
Coq proofs is to actually step through them in the IDE!
First on the list is `find_matching_skip`. Here's the type:
{{< codelines "Coq" "aoc-2020/day1.v" 42 43 >}}
2020-12-02 01:14:32 -08:00
It reads: if we correctly find a number in a small list `is`, we can find that same number
even if another number is prepended to `is`. That makes sense: _adding_ a number to
a list doesn't remove whatever we found in it! I used this lemma to prove another,
`find_matching_works`:
{{< codelines "Coq" "aoc-2020/day1.v" 53 54 >}}
2020-12-02 01:14:32 -08:00
This reads, if there _is_ an element `y` in `is` that adds up to `k` with `x`, then
`find_matching` will find it. This is an important property. After all, if it didn't
hold, it would mean that `find_matching` would occasionally fail to find a matching
number, even though it's there! We can't have that.
2020-12-02 18:45:28 -08:00
Finally, we want to specify what it means for `find_sum`, our solution function, to actually
2020-12-02 01:14:32 -08:00
work. The naive definition would be:
> Given a list of integers, `find_sum` always finds a pair of numbers that add up to `k`.
2020-12-02 18:45:28 -08:00
Unfortunately, this is not true. What if, for instance, we give `find_sum` an empty list?
2020-12-02 01:14:32 -08:00
There are no numbers from that list to find and add together. Even a non-empty list
may not include such a pair! We need a way to characterize valid input lists. I claim
that all lists from this Advent of Code puzzle are guaranteed to have two numbers that
add up to our goal, and that these numbers are not equal to each other. In Coq,
we state this as follows:
{{< codelines "Coq" "aoc-2020/day1.v" 8 9 >}}
2020-12-02 01:14:32 -08:00
This defines a new property, `has_pair t is` (read "`is` has a pair of numbers that add to `t`"),
which means:
2020-12-02 18:45:28 -08:00
> There are two numbers `n1` and `n2` such that, they are not equal to each other (`n1<>n2`) __and__
> the number `n1` is an element of `is` (`In n1 is`) __and__
> the number `n2` is an element of `is` (`In n2 is`) __and__
> the two numbers add up to `t` (`n1 + n2 = t`).
2020-12-02 01:14:32 -08:00
When making claims about the correctness of our algorithm, we will assume that this
property holds. Finally, here's the theorem we want to prove:
{{< codelines "Coq" "aoc-2020/day1.v" 68 70 >}}
2020-12-02 01:14:32 -08:00
It reads, "for any total `k` and list `is`, if `is` has a pair of numbers that add to `k`,
then `find_sum` will return a pair of numbers `x` and `y` that add to `k`".
There's some nuance here. We hardly reference the `has_pair` property in this definition,
and for good reason. Our `has_pair` hypothesis only says that there is _at least one_
pair of numbers in `is` that meets our criteria. However, this pair need not be the only
one, nor does it need to be the one returned by `find_sum`! However, if we have many pairs,
we want to confirm that `find_sum` will find one of them. Finally, here is the proof.
I will not be able to go through it in detail in this post, but I did comment it to
make it easier to read:
{{< codelines "Coq" "aoc-2020/day1.v" 71 106 >}}
2020-12-02 01:14:32 -08:00
Coq seems happy with it, and so am I! The bug I mentioned earlier popped up on line 96.
I had accidentally made `find_sum` return `None` if it couldn't find a complement
for the `x` it encountered. This meant that it never recursed into the remaining
list `xs`, and thus, the pair was never found at all! It this became impossible
to prove that `find_some` will return `Some y`, and I had to double back
and check my definitions.
I hope you enjoyed this post! If you're interested to learn more about Coq, I strongly recommend
checking out [Software Foundations](https://softwarefoundations.cis.upenn.edu/), a series
of books on Coq written as comments in a Coq source file! In particular, check out
[Logical Foundations](https://softwarefoundations.cis.upenn.edu/lf-current/index.html)
for an introduction to using Coq. Thanks for reading!