874 lines
41 KiB
Markdown
874 lines
41 KiB
Markdown
---
|
|
title: "Advent of Code in Coq - Day 8"
|
|
date: 2020-12-31T17:55:25-08:00
|
|
tags: ["Advent of Code", "Coq"]
|
|
draft: true
|
|
---
|
|
|
|
Huh? We're on day 8? What happened to days 2 through 7?
|
|
|
|
Well, for the most part, I didn't think they were that interesting from the Coq point of view.
|
|
Day 7 got close, but not close enough to inspire me to create a formalization. Day 8, on the other
|
|
hand, is
|
|
{{< sidenote "right" "pl-note" "quite interesting," >}}
|
|
Especially to someone like me who's interested in programming languages!
|
|
{{< /sidenote >}} and took quite some time to formalize.
|
|
|
|
As before, here's an (abridged) description of the problem:
|
|
|
|
> Given a tiny assembly-like language, determine the state of its accumulator
|
|
> when the same instruction is executed twice.
|
|
|
|
Before we start on the Coq formalization, let's talk about an idea from
|
|
Programming Language Theory (PLT), _big step operational semantics_.
|
|
|
|
### Big Step Operational Semantics
|
|
What we have in Advent of Code's Day 8 is, undeniably, a small programming language.
|
|
We are tasked with executing this language, or, in PLT lingo, defining its _semantics_.
|
|
There are many ways of doing this - at university, I've been taught of [denotational](https://en.wikipedia.org/wiki/Denotational_semantics), [axiomatic](https://en.wikipedia.org/wiki/Axiomatic_semantics),
|
|
and [operational](https://en.wikipedia.org/wiki/Operational_semantics) semantics.
|
|
I believe that Coq's mechanism of inductive definitions lends itself very well
|
|
to operational semantics, so we'll take that route. But even "operational semantics"
|
|
doesn't refer to a concrete technique - we have a choice between small-step (structural) and
|
|
big-step (natural) operational semantics. The former describe the minimal "steps" a program
|
|
takes as it's being evaluated, while the latter define the final results of evaluating a program.
|
|
I decided to go with big-step operational semantics, since they're more intutive (natural!).
|
|
|
|
So, how does one go about "[defining] the final results of evaluating a program?" Most commonly,
|
|
we go about using _inference rules_. Let's talk about those next.
|
|
|
|
#### Inference Rules
|
|
Inference rules are a very general notion. The describe how we can determine (infer) a conclusion
|
|
from a set of assumption. It helps to look at an example. Here's a silly little inference rule:
|
|
|
|
{{< latex >}}
|
|
\frac
|
|
{\text{I'm allergic to cats} \quad \text{My friend has a cat}}
|
|
{\text{I will not visit my friend very much}}
|
|
{{< /latex >}}
|
|
|
|
It reads, "if I'm allergic to cats, and if my friend has a cat, then I will not visit my friend very much".
|
|
Here, "I'm allergic to cats" and "my friend has a cat" are _premises_, and "I will not visit my friend very much" is
|
|
a _conclusion_. An inference rule states that if all its premises are true, then its conclusion must be true.
|
|
Here's another inference rule, this time with some mathematical notation instead of words:
|
|
|
|
{{< latex >}}
|
|
\frac
|
|
{n < m}
|
|
{n + 1 < m + 1}
|
|
{{< /latex >}}
|
|
|
|
This one reads, "if \\(n\\) is less than \\(m\\), then \\(n+1\\) is less than \\(m+1\\)". We can use inference
|
|
rules to define various constructs. As an example, let's define what it means for a natural number to be even.
|
|
It takes two rules:
|
|
|
|
{{< latex >}}
|
|
\frac
|
|
{}
|
|
{0 \; \text{is even}}
|
|
\quad
|
|
\frac
|
|
{n \; \text{is even}}
|
|
{n+2 \; \text{is even}}
|
|
{{< /latex >}}
|
|
|
|
First of all, zero is even. We take this as fact - there are no premises for the first rule, so they
|
|
are all trivially true. Next, if a number is even, then adding 2 to that number results in another
|
|
even number. Using the two of these rules together, we can correctly determine whether any number
|
|
is or isn't even. We start knowing that 0 is even. Adding 2 we learn that 2 is even, and adding 2
|
|
again we see that 4 is even, as well. We can continue this to determine that 6, 8, 10, and so on
|
|
are even too. Never in this process will we visit the numbers 1 or 3 or 5, and that's good - they're not even!
|
|
|
|
Let's now extend this notion to programming languages, starting with a simple arithmetic language.
|
|
This language is made up of natural numbers and the \\(\square\\) operation, which represents the addition
|
|
of two numbers. Again, we need two rules:
|
|
|
|
{{< latex >}}
|
|
\frac
|
|
{n \in \mathbb{N}}
|
|
{n \; \text{evaluates to} \; n}
|
|
\quad
|
|
\frac
|
|
{e_1 \; \text{evaluates to} \; n_1 \quad e_2 \; \text{evaluates to} \; n_2}
|
|
{e_1 \square e_2 \; \text{evaluates to} \; n_1 + n_2}
|
|
{{< /latex >}}
|
|
|
|
First, let me explain myself. I used \\(\square\\) to demonstrate two important points. First, languages can be made of
|
|
any kind of characters we want; it's the rules that we define that give these languages meaning.
|
|
Second, while \\(\square\\) is the addition operation _in our language_, \\(+\\) is the _mathematical addition operator_.
|
|
They are not the same - we use the latter to define how the former works.
|
|
|
|
Finally, writing "evaluates to" gets quite tedious, especially for complex languages. Instead,
|
|
PLT people use notation to make their semantics more concise. The symbol \\(\Downarrow\\) is commonly
|
|
used to mean "evaluates to"; thus, \\(e \Downarrow v\\) reads "the expression \\(e\\) evaluates to the value \\(v\\).
|
|
Using this notation, our rules start to look like the following:
|
|
|
|
{{< latex >}}
|
|
\frac
|
|
{n \in \mathbb{N}}
|
|
{n \Downarrow n}
|
|
\quad
|
|
\frac
|
|
{e_1 \Downarrow n_1 \quad e_2 \Downarrow n_2}
|
|
{e_1 \square e_2 \Downarrow n_1 + n_2}
|
|
{{< /latex >}}
|
|
|
|
If nothing else, these are way more compact! Though these may look intimidating at first, it helps to
|
|
simply read each symbol as its English meaning.
|
|
|
|
#### Encoding Inference Rules in Coq
|
|
Now that we've seen what inference rules are, we can take a look at how they can be represented in Coq.
|
|
We can use Coq's `Inductive` mechanism to define the rules. Let's start with our "is even" property.
|
|
|
|
```Coq
|
|
Inductive is_even : nat -> Prop :=
|
|
| zero_even : is_even 0
|
|
| plustwo_even : is_even n -> is_even (n+2).
|
|
```
|
|
|
|
The first line declares the property `is_even`, which, given a natural number, returns proposition.
|
|
This means that `is_even` is not a proposition itself, but `is_even 0`, `is_even 1`, and `is_even 2`
|
|
are all propositions.
|
|
|
|
The following two lines each encode one of our aforementioned inference rules. The first rule, `zero_even`,
|
|
is of type `is_even 0`. The `zero_even` rule doesn't require any arguments, and we can use it to create
|
|
a proof that 0 is even. On the other hand, the `plustwo_even` rule _does_ require an argument, `is_even n`.
|
|
To construct a proof that a number `n+2` is even using `plustwo_even`, we need to provide a proof
|
|
that `n` itself is even. From this definition we can see a general principle: we encode each inference
|
|
rule as constructor of an inductive Coq type. Each rule encoded in this manner takes as arguments
|
|
the proofs of its premises, and returns a proof of its conclusion.
|
|
|
|
For another example, let's encode our simple addition language. First, we have to define the language
|
|
itself:
|
|
|
|
```Coq
|
|
Inductive tinylang : Type :=
|
|
| number (n : nat) : tinylang
|
|
| box (e1 e2 : tinylang) : tinylang.
|
|
```
|
|
|
|
This defines the two elements of our example language: `number n` corresponds to \\(n\\), and `box e1 e2` corresponds
|
|
to \\(e_1 \square e_2\\). Finally, we define the inference rules:
|
|
|
|
```Coq {linenos=true}
|
|
Inductive tinylang_sem : tinylang -> nat -> Prop :=
|
|
| number_sem : forall (n : nat), tinylang_sem (number n) n
|
|
| box_sem : forall (e1 e2 : tinylang) (n1 n2 : nat),
|
|
tinylang_sem e1 n1 -> tinylang_sem e2 n2 ->
|
|
tinylang_sem (box e1 e2) (n1 + n2).
|
|
```
|
|
|
|
When we wrote our rules earlier, by using arbitrary variables like \\(e_1\\) and \\(n_1\\), we implicitly meant
|
|
that our rules work for _any_ number or expression. When writing Coq we have to make this assumption explicit
|
|
by using `forall`. For instance, the rule on line 2 reads, "for any number `n`, the expression `n` evaluates to `n`".
|
|
|
|
#### Semantics of Our Language
|
|
|
|
We've now written some example big-step operational semantics, both "on paper" and in Coq. Now, it's time to take a look at
|
|
the specific semantics of the language from Day 8! Our language consists of a few parts.
|
|
|
|
First, there are three opcodes: \\(\texttt{jmp}\\), \\(\\texttt{nop}\\), and \\(\\texttt{add}\\). Opcodes, combined
|
|
with an integer, make up an instruction. For example, the instruction \\(\\texttt{add} \\; 3\\) will increase the
|
|
content of the accumulator by three. Finally, a program consists of a sequence of instructions; They're separated
|
|
by newlines in the puzzle input, but we'll instead separate them by semicolons. For example, here's a complete program.
|
|
|
|
{{< latex >}}
|
|
\texttt{add} \; 0; \; \texttt{nop} \; 2; \; \texttt{jmp} \; -2
|
|
{{< /latex >}}
|
|
|
|
Now, let's try evaluating this program. Starting at the beginning and with 0 in the accumulator,
|
|
it will add 0 to the accumulator (keeping it the same),
|
|
do nothing, and finally jump back to the beginning. At this point, it will try to run the addition instruction again,
|
|
which is not allowed; thus, the program will terminate.
|
|
|
|
Did you catch that? The semantics of this language will require more information than just our program itself (which we'll denote \\(p\\)).
|
|
* First, to evaluate the program we will need a program counter, \\(\\textit{c}\\). This program counter
|
|
will tell us the position of the instruction to be executed next. It can also point past the last instruction,
|
|
which means our program terminated successfully.
|
|
* Next, we'll need the accumulator \\(a\\). Addition instructions can change the accumulator, and we will be interested
|
|
in the number that ends up in the accumulator when our program finishes executing.
|
|
* Finally, and more subtly, we'll need to keep track of the states we visited. For instance,
|
|
in the course of evaluating our program above, we encounter the \\((c, a)\\) pair of \\((0, 0)\\) twice: once
|
|
at the beginning, and once at the end. However, whereas at the beginning we have not yet encountered the addition
|
|
instruction, at the end we have, so the evaluation behaves differently. To make the proofs work better in Coq,
|
|
we'll use a set \\(v\\) of
|
|
{{< sidenote "right" "allowed-note" "allowed (valid) program counters (as opposed to visited program counters)." >}}
|
|
Whereas the set of "visited" program counters keeps growing as our evaluation continues,
|
|
the set of "allowed" program counters keeps shrinking. Because the "allowed" set never stops shrinking,
|
|
assuming we're starting with a finite set, our execution will eventually terminate.
|
|
{{< /sidenote >}}
|
|
|
|
Now we have all the elements of our evaluation. Let's define some notation. A program starts at some state,
|
|
and terminates in another, possibly different state. In the course of a regular evaluation, the program
|
|
never changes; only the state does. So I propose this (rather unorthodox) notation:
|
|
|
|
{{< latex >}}
|
|
(c, a, v) \Rightarrow_p (c', a', v')
|
|
{{< /latex >}}
|
|
|
|
This reads, "after starting at program counter \\(c\\), accumulator \\(a\\), and set of valid addresses \\(v\\),
|
|
the program \\(p\\) terminates with program counter \\(c'\\), accumulator \\(a'\\), and set of valid addresses \\(v'\\)".
|
|
Before creating the inference rules for this evaluation relation, let's define the effect of evaluating a single
|
|
instruction, using notation \\((c, a) \rightarrow_i (c', a')\\). An addition instruction changes the accumulator,
|
|
and increases the program counter by 1.
|
|
|
|
{{< latex >}}
|
|
\frac{}
|
|
{(c, a) \rightarrow_{\texttt{add} \; n} (c+1, a+n)}
|
|
{{< /latex >}}
|
|
|
|
A no-op instruction does even less. All it does is increment the program counter.
|
|
|
|
{{< latex >}}
|
|
\frac{}
|
|
{(c, a) \rightarrow_{\texttt{nop} \; n} (c+1, a)}
|
|
{{< /latex >}}
|
|
|
|
Finally, a jump instruction leaves the accumulator intact, but adds a number to the program counter itself!
|
|
|
|
{{< latex >}}
|
|
\frac{}
|
|
{(c, a) \rightarrow_{\texttt{jmp} \; n} (c+n, a)}
|
|
{{< /latex >}}
|
|
|
|
None of these rules have any premises, and they really are quite simple. Now, let's define the rules
|
|
for evaluating a program. First of all, a program starting in a state that is not considered "valid"
|
|
is done evaluating, and is in a "failed" state.
|
|
|
|
{{< latex >}}
|
|
\frac{c \not \in v \quad c \not= \text{length}(p)}
|
|
{(c, a, v) \Rightarrow_{p} (c, a, v)}
|
|
{{< /latex >}}
|
|
|
|
We use \\(\\text{length}(p)\\) to represent the number of instructions in \\(p\\). Note the second premise:
|
|
even if our program counter \\(c\\) is not included in the valid set, if it's "past the end of the program",
|
|
the program terminates in an "ok" state. Here's a rule for terminating in the "ok" state:
|
|
|
|
{{< latex >}}
|
|
\frac{c = \text{length}(p)}
|
|
{(c, a, v) \Rightarrow_{p} (c, a, v)}
|
|
{{< /latex >}}
|
|
|
|
{{< todo >}}
|
|
We can make this closer to the Coq version.
|
|
{{< /todo >}}
|
|
When our program counter reaches the end of the program, we are also done evaluating it. Even though
|
|
both rules {{< sidenote "right" "redundant-note" "lead to the same conclusion," >}}
|
|
In fact, if the end of the program is never included in the valid set, the second rule is completely redundant.
|
|
{{< /sidenote >}}
|
|
it helps to distinguish the two possible outcomes. Finally, if neither of the termination conditions are met,
|
|
our program can take a step, and continue evaluating from there.
|
|
|
|
{{< latex >}}
|
|
\frac{c \in v \quad p[c] = i \quad (c, a) \rightarrow_i (c', a') \quad (c', a', v - \{c\}) \Rightarrow_p (c'', a'', v'')}
|
|
{(c, a, v) \Rightarrow_{p} (c'', a'', v'')}
|
|
{{< /latex >}}
|
|
|
|
This is quite a rule. A lot of things need to work out for a program to evauate from a state that isn't
|
|
currently the final state:
|
|
|
|
* The current program counter \\(c\\) must be valid. That is, it must be an element of \\(v\\).
|
|
* This program counter must correspond to an instruction \\(i\\) in \\(p\\), which we write as \\(p[c] = i\\).
|
|
* This instruction must be executed, changing our program counter from \\(c\\) to \\(c'\\) and our
|
|
accumulator from \\(a\\) to \\(a'\\). The set of valid instructions will no longer include \\(c\\),
|
|
and will become \\(v - \\{c\\}\\).
|
|
* Our program must then finish executing, starting at state
|
|
\\((c', a', v - \\{c\\})\\), and ending in some (unknown) state \\((c'', a'', v'')\\).
|
|
|
|
If all of these conditions are met, our program, starting at \\((c, a, v)\\), will terminate in the state \\((c'', a'', v'')\\). This third rule completes our semantics; a program being executed will keep running instructions using the third rule, until it finally
|
|
hits an invalid program counter (terminating with the first rule) or gets to the end of the program (terminating with the second rule).
|
|
|
|
#### Aside: Vectors and Finite \\(\mathbb{N}\\)
|
|
We'll be getting to the Coq implementation of our semantics soon, but before we do:
|
|
what type should \\(c\\) be? It's entirely possible for an instruction like \\(\\texttt{jmp} \\; -10000\\)
|
|
to throw our program counter way before the first instruction of our program, so at first, it seems
|
|
as though we should use an integer. But the prompt doesn't even specify what should happen in this
|
|
case - it only says an instruction shouldn't be run twice. The "valid set", although it may help resolve
|
|
this debate, is our invention, and isn't part of the original specification.
|
|
|
|
There is, however, something we can infer from this problem. Since the problem of jumping "too far behind" or
|
|
"too far ahead" is never mentioned, we can assume that _all jumps will lead either to an instruction,
|
|
or right to the end of a program_. This means that \\(c\\) is a natural number, with
|
|
|
|
{{< latex >}}
|
|
0 \leq c \leq \text{length}(p)
|
|
{{< /latex >}}
|
|
|
|
In a language like Coq, it's possible to represent such a number. Since we've gotten familliar with
|
|
inference rules, let's present two rules that define such a number:
|
|
|
|
{{< latex >}}
|
|
\frac
|
|
{n \in \mathbb{N}^+}
|
|
{Z : \text{Fin} \; n}
|
|
\quad
|
|
\frac
|
|
{f : \text{Fin} \; n}
|
|
{S f : \text{Fin} \; (n+1)}
|
|
{{< /latex >}}
|
|
|
|
This is a variation of the [Peano encoding](https://wiki.haskell.org/Peano_numbers) of natural numbers.
|
|
It reads as follows: zero (\\(Z\\)) is a finite natural number less than any positive natural number \\(n\\). Then, if a finite natural number
|
|
\\(f\\) is less than \\(n\\), then adding one to that number (using the successor function \\(S\\))
|
|
will create a natural number less than \\(n+1\\). We encode this in Coq as follows
|
|
([originally from here](https://coq.inria.fr/library/Coq.Vectors.Fin.html#t)):
|
|
|
|
```Coq
|
|
Inductive t : nat -> Set :=
|
|
| F1 : forall {n}, t (S n)
|
|
| FS : forall {n}, t n -> t (S n).
|
|
```
|
|
|
|
The `F1` constructor here is equivalent to our \\(Z\\), and `FS` is equivalent to our \\(S\\).
|
|
To represent positive natural numbers \\(\\mathbb{N}^+\\), we simply take a regular natural
|
|
number from \\(\mathbb{N}\\) and find its successor using `S` (simply adding 1). Again, we have
|
|
to explicitly use `forall` in our type signatures.
|
|
|
|
We can use a similar technique to represent a list with a known number of elements, known
|
|
in the Idris and Coq world as a vector. Again, we only need two inference rules to define such
|
|
a vector:
|
|
|
|
{{< latex >}}
|
|
\frac
|
|
{t : \text{Type}}
|
|
{[] : \text{Vec} \; t \; 0}
|
|
\quad
|
|
\frac
|
|
{x : \text{t} \quad \textit{xs} : \text{Vec} \; t \; n}
|
|
{(x::\textit{xs}) : \text{Vec} \; t \; (n+1)}
|
|
{{< /latex >}}
|
|
|
|
These rules read: the empty list \\([]\\) is zero-length vector of any type \\(t\\). Then,
|
|
if we take an element \\(x\\) of type \\(t\\), and an \\(n\\)-long vector \\(\textit{xs}\\) of \\(t\\),
|
|
then we can prepend \\(x\\) to \\(\textit{xs}\\) and get an \\((n+1)\\)-long vector of \\(t\\).
|
|
In Coq, we write this as follows ([originally from here](https://coq.inria.fr/library/Coq.Vectors.VectorDef.html#t)):
|
|
|
|
```Coq
|
|
Inductive t A : nat -> Type :=
|
|
| nil : t A 0
|
|
| cons : forall (h:A) (n:nat), t A n -> t A (S n).
|
|
```
|
|
|
|
The `nil` constructor represents the empty list \\([]\\), and `cons` represents
|
|
the operation of prepending an element (called `h` in the code in \\(x\\) in our inference rules)
|
|
to another vector of length \\(n\\), which is remains unnamed in the code but is called \\(\\textit{xs}\\) in our rules.
|
|
|
|
These two definitions work together quite well. For instance, suppose we have a vector of length \\(n\\).
|
|
If we were to access its elements by indices starting at 0, we'd be allowed to access indices 0 through \\(n-1\\).
|
|
These are precisely the values of the finite natural numbers less than \\(n\\), \\(\\text{Fin} \\; n \\).
|
|
Thus, given such an index \\(\\text{Fin} \\; n\\) and a vector \\(\\text{Vec} \\; t \\; n\\), we are guaranteed
|
|
to be able to retrieve the element at the given index! In our code, we will not have to worry about bounds checking.
|
|
|
|
Of course, if our program has \\(n\\) elements, our program counter will be a finite number less than \\(n+1\\),
|
|
since there's always the possibility of it pointing past the instructions, indicating that we've finished
|
|
running the program. This leads to some minor complications: we can't safely access the program instruction
|
|
at index \\(\\text{Fin} \\; (n+1)\\). We can solve this problem by considering two cases:
|
|
either our index points one past the end of the program (in which case its value is exactly the finite
|
|
representation of \\(n\\)), or it's less than \\(n\\), in which case we can "tighten" the upper bound,
|
|
and convert that index into a \\(\\text{Fin} \\; n\\). We formalize it in a lemma:
|
|
|
|
{{< codelines "Coq" "aoc-2020/day8.v" 80 82 >}}
|
|
|
|
{{< todo >}}Prove this (at least informally) {{< /todo >}}
|
|
|
|
There's a little bit of a gotcha here. Instead of translating our above statement literally,
|
|
and returning a value that's the result of "tightening" our input `f`, we return a value
|
|
`f'` that can be "weakened" to `f`. This is because "tightening" is not a total function -
|
|
it's not always possible to convert a \\(\\text{Fin} \\; (n+1)\\) into a \\(\\text{Fin} \\; n\\).
|
|
However, "weakening" \\(\\text{Fin} \\; n\\) _is_ a total function, since a number less than \\(n\\)
|
|
is, by the transitive property of a total order, also less than \\(n+1\\).
|
|
|
|
Next, let's talk about addition, specifically the kind of addition done by the \\(\\texttt{jmp}\\) instruction.
|
|
We can always add an integer to a natural number, but we can at best guarantee that the result
|
|
will be an integer. For instance, we can add `-1000` to `1`, and get `-999`, which is _not_ a natural
|
|
number. We implement this kind of addition in a function called `jump_t`:
|
|
|
|
{{< codelines "Coq" "aoc-2020/day8.v" 56 56 >}}
|
|
|
|
At the moment, its definition is not particularly important. What is important, though,
|
|
is that it takes a bounded natural number `pc` (our program counter), an integer `off`
|
|
(the offset provided by the jump instruction) and returns another integer representing
|
|
the final offset. Why are integers of type `t`? Well, it so happens
|
|
that Coq provides facilities for working with arbitrary implementations of integers,
|
|
without relying on how they are implemented under the hood. This can be seen in its
|
|
[`Coq.ZArith.Int`](https://coq.inria.fr/library/Coq.ZArith.Int.html) module,
|
|
which describes what functions and types an implementation of integers should provide.
|
|
Among those is `t`, the type an integer in such an arbitrary implementation. We too
|
|
will not make an assumption about how the integers are implemented, and simply
|
|
use this generic `t` from now on.
|
|
|
|
Now, suppose we wanted to write a function that _does_ return a valid program
|
|
counter after adding the offset to it. Since it's possible for this function to fail
|
|
(for instance, if the offset is very negative), it has to return `option (fin (S n))`.
|
|
That is, this function may either fail (returning `None`) or succeed, returning
|
|
`Some f`, where `f` is of type `fin (S n)`, aka \\(\\text{Fin} \\; (n + 1)\\). Here's
|
|
the function in Coq (again, don't worry too much about the definition):
|
|
|
|
{{< codelines "Coq" "aoc-2020/day8.v" 61 61 >}}
|
|
|
|
We will make use of this function when we define and verify our semantics.
|
|
Let's take a look at that next.
|
|
|
|
#### Semantics in Coq
|
|
|
|
Now that we've seen finite sets and vectors, it's time to use them to
|
|
encode our semantics in Coq. Before we do anything else, we need
|
|
to provide Coq definitions for the various components of our
|
|
language, much like what we did with `tinylang`. We can start with opcodes:
|
|
|
|
{{< codelines "Coq" "aoc-2020/day8.v" 20 23 >}}
|
|
|
|
Now we can define a few other parts of our language and semantics, namely
|
|
states, instructions and programs (which I called "inputs" since, we'll, they're
|
|
our puzzle input). A state is simply the 3-tuple of the program counter, the set
|
|
of valid program counters, and the accumulator. We write it as follows:
|
|
|
|
{{< codelines "Coq" "aoc-2020/day8.v" 33 33 >}}
|
|
|
|
The star `*` is used here to represent a [product type](https://en.wikipedia.org/wiki/Product_type)
|
|
rather than arithmetic multiplication. Our state type accepts an argument,
|
|
`n`, much like a finite natural number or a vector. In fact, this `n` is passed on
|
|
to the state's program counter and set types. Rightly, a state for a program
|
|
of length \\(n\\) will not be of the same type as a state for a program of length \\(n+1\\).
|
|
|
|
An instruction is also a tuple, but this time containing only two elements: the opcode and
|
|
the number. We write this as follows:
|
|
|
|
{{< codelines "Coq" "aoc-2020/day8.v" 36 36 >}}
|
|
|
|
Finally, we have to define the type of a program. This type will also be
|
|
indexed by `n`, the program's length. A program of length `n` is simply a
|
|
vector of instructions `inst` of length `n`. This leads to the following
|
|
definition:
|
|
|
|
{{< codelines "Coq" "aoc-2020/day8.v" 38 38 >}}
|
|
|
|
So far, so good! Finally, it's time to get started on the semantics themselves.
|
|
We begin with the inductive definition of \\((\\rightarrow_i)\\).
|
|
I think this is fairly straightforward. However, we do use
|
|
`t` instead of \\(n\\) from the rules, and we use `FS`
|
|
instead of \\(+1\\). Also, we make the formerly implicit
|
|
assumption that \\(c+n\\) is valid explicit, by
|
|
providing a proof that `valid_jump_t pc t = Some pc'`.
|
|
|
|
{{< codelines "Coq" "aoc-2020/day8.v" 103 110 >}}
|
|
|
|
Next, it will help us to combine the premises for a
|
|
"failed" and "ok" terminations into Coq data types.
|
|
This will make it easier for us to formulate a lemma later on.
|
|
Here are the definitions:
|
|
|
|
{{< codelines "Coq" "aoc-2020/day8.v" 112 117 >}}
|
|
|
|
Since all of out "termination" rules start and
|
|
end in the same state, there's no reason to
|
|
write that state twice. Thus, both `done`
|
|
and `stuck` only take the input `inp`,
|
|
and the state, which includes the accumulator
|
|
`acc`, set of allowed program counters `v`, and
|
|
the program counter at which the program came to an end.
|
|
When the program terminates successfully, this program
|
|
counter will be equal to the length of the program `n`,
|
|
so we use `nat_to_fin n`. On the other hand, if the program
|
|
terminates in as stuck state, it must be that it terminated
|
|
at a program counter that points to an instruction. Thus, this
|
|
program counter is actually a \\(\\text{Fin} \\; n\\), and not
|
|
a \\(\\text{Fin} \\ (n+1)\\), and is not in the set of allowed program counters.
|
|
We use the same "weakening" trick we saw earlier to represent
|
|
this.
|
|
|
|
Finally, we encode the three inference rules we came up with:
|
|
|
|
{{< codelines "Coq" "aoc-2020/day8.v" 119 126 >}}
|
|
|
|
Notice that we fused two of the premises in the last rule.
|
|
Instead of naming the instruction at the current program
|
|
counter and using it in another premise, we simply use
|
|
`nth inp pc`, which corresponds to \\(p[c]\\) in our
|
|
"paper" semantics.
|
|
|
|
Before we go on writing some actual proofs, we have
|
|
one more thing we have to address. Earlier, we said:
|
|
|
|
> All jumps will lead either to an instruction, or right to the end of a program.
|
|
|
|
To make Coq aware of this constraint, we'll have to formalize it. To
|
|
start off, we'll define the notion of a "valid instruction", which is guaranteed
|
|
to keep the program counter in the correct range.
|
|
There are a couple of ways to do this, but we'll use yet another definition based
|
|
on inference rules. First, though, observe that the same instruction may be valid
|
|
for one program, and invalid for another. For instance, \\(\\texttt{jmp} \\; 100\\)
|
|
is perfectly valid for a program with thousands of instructions, but if it occurs
|
|
in a program with only 3 instructions, it will certainly lead to disaster. Specifically,
|
|
the validity of an instruction depends on the length of the program in which it resides,
|
|
and the program counter at which it's encountered.
|
|
Thus, we refine our idea of validity to "being valid for a program of length \\(n\\) at program counter \\(f\\)".
|
|
For this, we can use the following two inference rules:
|
|
|
|
{{< latex >}}
|
|
\frac
|
|
{c : \text{Fin} \; n}
|
|
{\texttt{acc} \; t \; \text{valid for} \; n, c }
|
|
\quad
|
|
\frac
|
|
{c : \text{Fin} \; n \quad o \in \{\texttt{nop}, \texttt{jmp}\} \quad J_v(c, t) = \text{Some} \; c' }
|
|
{o \; t \; \text{valid for} \; n, c }
|
|
{{< /latex >}}
|
|
|
|
The first rule states that if a program has length \\(n\\), then it's valid
|
|
at any program counter whose value is less than \\(n\\). This is because running
|
|
\\(\\texttt{add}\\) will increment the program counter \\(c\\) by 1,
|
|
and thus, create a new program counter that's less than \\(n+1\\),
|
|
which, as we discussed above, is perfectly valid.
|
|
|
|
The second rule works for the other two instructions. It has an extra premise:
|
|
the result of `jump_valid_t` (written as \\(J_v\\)) has to be \\(\\text{Some} \\; c'\\),
|
|
that is, `jump_valid_t` must succeed. Note that we require this even for no-ops,
|
|
since it later turns out of the them may be a jump after all.
|
|
|
|
We now have our validity rules. If an instruction satisfies them for a given program
|
|
and at a given program counter, evaluating it will always result in a program counter that has a proper value.
|
|
We encode the rules in Coq as follows:
|
|
|
|
{{< codelines "Coq" "aoc-2020/day8.v" 152 157 >}}
|
|
|
|
Note that we have three rules instead of two. This is because we "unfolded"
|
|
\\(o\\) from our second rule: rather than using set notation (or "or"), we
|
|
just generated two rules that vary in nothing but the operation involved.
|
|
|
|
Of course, we must have that every instruction in a program is valid.
|
|
We don't really need inference rules for this, as much as a "forall" quantifier.
|
|
A program \\(p\\) of length \\(n\\) is valid if the following holds:
|
|
|
|
{{< latex >}}
|
|
\forall (c : \text{Fin} \; n). p[c] \; \text{valid for} \; n, c
|
|
{{< /latex >}}
|
|
|
|
That is, for every possible in-bounds program counter \\(c\\),
|
|
the instruction at the program counter is valid. We can now
|
|
encode this in Coq, too:
|
|
|
|
{{< codelines "Coq" "aoc-2020/day8.v" 160 161 >}}
|
|
|
|
In the above, `n` is made implicit where possible.
|
|
Since \\(c\\) (called `pc` in the code) is of type \\(\\text{Fin} \\; n\\), there's no
|
|
need to write \\(n\\) _again_. The curly braces tell Coq to infer that
|
|
argument where possible.
|
|
|
|
### Proving Termination
|
|
Here we go! It's finally time to make some claims about our
|
|
definitions. Who knows - maybe we wrote down total garbage!
|
|
We will be creating several related lemmas and theorems.
|
|
All of them share two common assumptions:
|
|
|
|
* We have some valid program `inp` of length `n`.
|
|
* This program is a valid input, that is, `valid_input` holds for it.
|
|
There's no sense in arguing based on an invalid input program.
|
|
|
|
We represent these grouped assumptions by opening a Coq
|
|
`Section`, which we call `ValidInput`, and listing our assumptions:
|
|
|
|
{{< codelines "Coq" "aoc-2020/day8.v" 163 166 >}}
|
|
|
|
We had to also explicitly mention the length `n` of our program.
|
|
From now on, the variables `n`, `inp`, and `Hv` will be
|
|
available to all of the proofs we write in this section.
|
|
The first proof is rather simple. The claim is:
|
|
|
|
> For our valid program, at any program counter `pc`
|
|
and accumulator `acc`, there must exists another program
|
|
counter `pc'` and accumulator `acc'` such that the
|
|
instruction evaluation relation \\((\rightarrow_i)\\)
|
|
connects the two. That is, valid addresses aside,
|
|
we can always make a step.
|
|
|
|
Here is this claim encoded in Coq:
|
|
|
|
{{< codelines "Coq" "aoc-2020/day8.v" 168 169 >}}
|
|
|
|
We start our proof by introducing all the relevant variables into
|
|
the global context. I've mentioned this when I wrote about
|
|
day 1, but here's the gist: the `intros` keyword takes
|
|
variables from a `forall`, and makes them concrete.
|
|
In short, `intros x` is very much like saying "suppose
|
|
we have an `x`", and going on with the proof.
|
|
|
|
{{< codelines "Coq" "aoc-2020/day8.v" 170 171 >}}
|
|
|
|
Here, we said "take any program counter `pc` and any
|
|
accumulator `acc`". Now what? Well, first of all,
|
|
we want to take a look at the instruction at the current
|
|
`pc`. We know that this instruction is a combination
|
|
of an opcode and a number, so we use `destruct` to get
|
|
access to both of these parts:
|
|
|
|
{{< codelines "Coq" "aoc-2020/day8.v" 172 172 >}}
|
|
|
|
Now, Coq reports the following proof state:
|
|
|
|
```
|
|
1 subgoal
|
|
|
|
n : nat
|
|
inp : input n
|
|
Hv : valid_input inp
|
|
pc : Fin.t n
|
|
acc : t
|
|
o : opcode
|
|
t0 : t
|
|
Hop : nth inp pc = (o, t0)
|
|
|
|
========================= (1 / 1)
|
|
|
|
exists (pc' : fin (S n)) (acc' : t),
|
|
step_noswap (o, t0) (pc, acc) (pc', acc')
|
|
```
|
|
|
|
We have some opcode `o`, and some associated number
|
|
`t0`, and we must show that there exist a `pc'`
|
|
and `acc'` to which we can move on. To prove
|
|
that something exists in Coq, we must provide
|
|
an instance of that "something". If we claim
|
|
that there exists a dog that's not a good boy,
|
|
we better have this elusive creature in hand.
|
|
In other words, proofs in Coq are [constructive](https://en.wikipedia.org/wiki/Constructive_proof).
|
|
Without knowing the kind of operation we're dealing with, we can't
|
|
say for sure how the step will proceed. Thus, we proceed by
|
|
case analysis on `o`.
|
|
|
|
{{< codelines "Coq" "aoc-2020/day8.v" 173 173 >}}
|
|
|
|
There are three possible cases we have to consider,
|
|
one for each type of instruction.
|
|
|
|
* If the instruction is \\(\\texttt{add}\\), we know
|
|
that `pc' = pc + 1` and `acc' = acc + t0`. That is,
|
|
the program counter is simply incremented, and the accumulator
|
|
is modified with the number part of the instruction.
|
|
* If the instruction is \\(\\texttt{nop}\\), the program
|
|
coutner will again be incremented (`pc' = pc + 1`),
|
|
but the accumulator will stay the same, so `acc' = acc`.
|
|
* If the instruction is \\(\\texttt{jmp}\\), things are
|
|
more complicated. We must rely on the assumption
|
|
that our input is valid, which tells us that adding
|
|
`t0` to our `pc` will result in `Some f`, and not `None`.
|
|
Given this, we have `pc' = f`, and `acc' = acc`.
|
|
|
|
This is how these three cases are translated to Coq:
|
|
|
|
{{< codelines "Coq" "aoc-2020/day8.v" 174 177 >}}
|
|
|
|
For the first two cases, we simply provide the
|
|
values we expect for `pc'` and `acc'`, and
|
|
apply the corresponding inference rule that
|
|
is satisfied by these values. For the third case, we have
|
|
to invoke `Hv`, the hypothesis that our input is valid.
|
|
In particular, we care about the instruction at `pc`,
|
|
so we use `specialize` to plug `pc` into the more general
|
|
hypothesis. We then replace `nth inp pc` with its known
|
|
value, `(jmp, t0)`. This tells us the following, in Coq's words:
|
|
|
|
```
|
|
Hv : valid_inst (jmp, t0) pc
|
|
```
|
|
|
|
That is, `(jmp, t0)` is a valid instruction at `pc`. Then, using
|
|
Coq's `inversion` tactic, we ask: how is this possible? There is
|
|
only one inference rule that gives us such a conclusion, and it is named `valid_inst_jmp`
|
|
in our Coq code. Since we have a proof that our `jmp` is valid,
|
|
it must mean that this rule was used. Furthermore, sicne this
|
|
rule requires that `valid_jump_t` evaluates to `Some f'`, we know
|
|
that this must be the case here! Coq now has adds the following
|
|
two lines to our proof state:
|
|
|
|
```
|
|
f' : fin (S n)
|
|
H0 : valid_jump_t pc t0 = Some f'
|
|
```
|
|
|
|
Finally, we specify, as mentioned earlier, that `pc' = f'` and `acc' = acc`.
|
|
As before, we apply the corresponding step rule for `jmp`. When it asks
|
|
for a proof that `valid_jump_t` produces a valid program counter,
|
|
we hand it `H0` using `apply H0`. And with that, Coq is happy!
|
|
|
|
Next, we prove a claim that a valid program can always do _something_,
|
|
and that something is one of three things:
|
|
|
|
* It can terminate in the "ok" state if the program counter
|
|
reaches the programs' end.
|
|
* It can terminate with an error if it's currently at a program
|
|
counter that is not included in the valid set.
|
|
* Otherwise, it can run the current instruction and advance
|
|
to a "next" state.
|
|
|
|
Alternatively, we could say that one of the inference rules
|
|
for \\((\\Rightarrow_p)\\) must apply. This is not the case if the input
|
|
is not valid, since, as I said
|
|
before, an arbitrary input program can lead us to jump
|
|
to a negative address (or to an address _way_ past the end of the program).
|
|
Here's the claim, translated to Coq:
|
|
|
|
{{< codelines "Coq" "aoc-2020/day8.v" 181 186 >}}
|
|
|
|
Informally, we can prove this as follows:
|
|
|
|
* If the current program counter is equal to the length
|
|
of the program, we've reached the end. Thus, the program
|
|
can terminate in the "ok" state.
|
|
* Otherwise, the current program counter must be
|
|
less than the length of the program.
|
|
* If we've already encountered this program counter (that is,
|
|
if it's gone from the set of valid program counters),
|
|
then the program will terminate in the "error" state.
|
|
* Otherwise, the program counter is in the set of
|
|
valid instructions. By our earlier theorem, in a valid
|
|
program, the instruction at any program counter can be correctly
|
|
executed, taking us to the next state. Now too
|
|
our program can move to this next state.
|
|
|
|
Below is the Coq translation of the above.
|
|
|
|
{{< codelines "Coq" "aoc-2020/day8.v" 187 203 >}}
|
|
|
|
It doesn't seem like we're that far from being done now.
|
|
A program can always take a step, and each time it does,
|
|
the set of valid program counters decreases in size. Eventually,
|
|
this set will become empty, so if nothing else, our program will
|
|
eventually terminate in an "error" state. Thus, it will stop
|
|
running no matter what.
|
|
|
|
This seems like a task for induction, in this case on the size
|
|
of the valid set. In particular, strong mathematical induction
|
|
{{< sidenote "right" "strong-induction-note" "seem to work best." >}}
|
|
Why strong induction? If we remove a single element from a set,
|
|
its size should decrease strictly by 1. Thus, why would we need
|
|
to care about sets of <em>all</em> sizes less than the current
|
|
set's size?<br>
|
|
<br>
|
|
Unfortunately, we're not working with purely mathematical sets.
|
|
Coq's default facility for sets is simply a layer on top
|
|
of good old lists, and makes no effort to be "correct by construction".
|
|
It is thus perfectly possible to have a "set" which inlcudes an element
|
|
twice. Depending on the implementation of <code>set_remove</code>,
|
|
we may end up removing the repeated element multiple times, thereby
|
|
shrinking the length of our list by more than 1. I'd rather
|
|
not worry about implementation details like that.
|
|
{{< /sidenote >}}
|
|
Someone on StackOverflow [implemented this](https://stackoverflow.com/questions/45872719/how-to-do-induction-on-the-length-of-a-list-in-coq),
|
|
so I'll just use it. The Coq theorem corresonding to strong induction
|
|
on the length of a list is as follows:
|
|
|
|
{{< codelines "Coq" "aoc-2020/day8.v" 205 207 >}}
|
|
|
|
It reads,
|
|
|
|
> If for some list `l`, the property `P` holding for all lists
|
|
shorter than `l` means that it also holds for `l` itself, then
|
|
`P` holds for all lists.
|
|
|
|
This is perhaps not particularly elucidating. We can alternatively
|
|
think of this as trying to prove some property for all lists `l`.
|
|
We start with all empty lists. Here, we have nothing else to rely
|
|
on; there are no lists shorter than the empty list, and our property
|
|
must hold for all empty lists. Then, we move on to proving
|
|
the property for all lists of length 1, already knowing that it holds
|
|
for all empty lists. Once we're done there, we move on to proving
|
|
that `P` holds for all lists of length 2, now knowing that it holds
|
|
for all empty lists _and_ all lists of length 1. We continue
|
|
doing this, eventually covering lists of any length.
|
|
|
|
Before proving termination, there's one last thing we have to
|
|
take care off. Coq's standard library does not come with
|
|
a proof that removing an element from a set makes it smaller;
|
|
we have to provide it ourselves. Here's the claim encoded
|
|
in Coq:
|
|
|
|
{{< codelines "Coq" "aoc-2020/day8.v" 217 219 >}}
|
|
|
|
This reads, "if a set `s` contains a finite natural
|
|
number `f`, removing `f` from `s` reduces the set's size".
|
|
The details of the proof are not particularly interesting,
|
|
and I hope that you understand intuitively why this is true.
|
|
Finally, we make our termination claim.
|
|
|
|
{{< codelines "Coq" "aoc-2020/day8.v" 230 231 >}}
|
|
|
|
It's quite a strong claim - given _any_ program counter,
|
|
set of valid addresses, and accumulator, a valid input program
|
|
will terminate. Let's take a look at the proof.
|
|
|
|
{{< codelines "Coq" "aoc-2020/day8.v" 232 234 >}}
|
|
|
|
We use `intros` again. However, it brings in variables
|
|
in order, and we really only care about the _second_ variable.
|
|
We thus `intros` the first two, and then "put back" the first
|
|
one using `generalize dependent`. Then, we proceed by
|
|
induction on length, as seen above.
|
|
|
|
{{< codelines "Coq" "aoc-2020/day8.v" 235 236>}}
|
|
|
|
Now we're in the "inductive step". Our inductive hypothesis
|
|
is that any set of valid addresses smaller than the current one will
|
|
guarantee that the program will terminate. We must show
|
|
that using our set, too, will guarantee termination. We already
|
|
know that a valid input, given a state, can have one of three
|
|
possible outcomes: "ok" termination, "failed" termination,
|
|
or a "step". We use `destruct` to take a look at each of these
|
|
in turn. The first two cases ("ok" termination and "failed" termination)
|
|
are fairly trivial:
|
|
|
|
{{< codelines "Coq" "aoc-2020/day8.v" 237 240 >}}
|
|
|
|
We basically connect the dots between the premises (in a form like `done`)
|
|
and the corresponding inference rule (`run_noswap_done`). The more
|
|
interesting case is when we can take a step.
|
|
|
|
{{< codelines "Coq" "aoc-2020/day8.v" 241 253 >}}
|
|
|
|
Since we know we can take a step, we know that we'll be removing
|
|
the current program counter from the set of valid addresses. This
|
|
set must currently contain the present program counter (since otherwise
|
|
we'd have "failed"), and thus will shrink when we remove it. This,
|
|
in turn, lets us use the inductive hypothesis: it tells us that no matter the
|
|
program counter or accumulator, if we start with this new "shrunk"
|
|
set, we will terminate in some state. Coq's constructive
|
|
nature helps us here: it doesn't just tells us that there is some state
|
|
in which we terminate - it gives us that state! We use `edestruct` to get
|
|
a handle on this final state, which Coq automatically names `x`. At this
|
|
time Coq still isn't convinced that our new set is smaller, so we invoke
|
|
our earlier `set_remove_length` theorem to placate it.
|
|
|
|
We now have all the pieces: we know that we can take a step, removing
|
|
the current program counter from our current set. We also know that
|
|
with that newly shrunken set, we'll terminate in some final state `x`.
|
|
Thus, all that's left to say is to apply our "step" rule. It asks
|
|
us for three things:
|
|
|
|
1. That the current program counter is in the set. We've long since
|
|
established this, and `auto` takes care of that.
|
|
2. That a step is possible. We've already established this, too,
|
|
since we're in the "can take a step" case. We apply `Hst`,
|
|
the hypothesis that confirms that we can, indeed, step.
|
|
3. That we terminate after this. The `x` we got
|
|
from our induction hypothesis came with a proof that
|
|
running with the "next" program counter and accumulator
|
|
will result in termination. We apply this proof, automatically
|
|
named `H0` by Coq.
|
|
|
|
And that's it! We've proved that a program terminates no matter what.
|
|
This has also (almost!) given us a solution to part 1. Consider the case
|
|
in which we start with program counter 0, accumulator 0, and the "full"
|
|
set of allowed program counters. Since our proof works for _all_ configurations,
|
|
it will also work for this one. Furthermore, since Coq proofs are constructive,
|
|
this proof will __return to us the final program counter and accumulator!__
|
|
This is precisely what we'd need to solve part 1.
|
|
|
|
But wait, almost? What's missing? We're missing a few implementation details:
|
|
* We've not provided a concrete impelmentation of integers.
|
|
* We assumed (reasonably, I would say) that it's possible to convert a natural
|
|
number to an integer.
|
|
* We also assumed (still reasonably) that we can try convert an integer
|
|
back to a finite natural number, failing if it's too small or too large.
|
|
|
|
{{< todo >}}Finish up{{< /todo >}}
|