Compare commits
4 Commits
60eb50737d
...
43dfee56cc
Author | SHA1 | Date | |
---|---|---|---|
43dfee56cc | |||
6f9a2ce092 | |||
06014eade9 | |||
6f92a50c83 |
3
.gitmodules
vendored
Normal file
3
.gitmodules
vendored
Normal file
@ -0,0 +1,3 @@
|
||||
[submodule "code/aoc-2020"]
|
||||
path = code/aoc-2020
|
||||
url = https://dev.danilafe.com/Advent-of-Code/AdventOfCode-2020.git
|
1
code/aoc-2020
Submodule
1
code/aoc-2020
Submodule
@ -0,0 +1 @@
|
||||
Subproject commit 661bcbb557bfd7c228341b2430efd5aa21de3878
|
@ -26,7 +26,7 @@ let's write a helper function that, given a number `x`, tries to find another nu
|
||||
`y` such that `x + y = 2020`. In fact, rather than hardcoding the desired
|
||||
sum to `2020`, let's just use another argument called `total`. The code is quite simple:
|
||||
|
||||
{{< codelines "Coq" "aoc-coq/day1.v" 7 14 >}}
|
||||
{{< codelines "Coq" "aoc-2020/day1.v" 11 18 >}}
|
||||
|
||||
Here, `is` is the list of numbers that we want to search.
|
||||
We proceed by case analysis: if the list is empty, we can't
|
||||
@ -43,7 +43,7 @@ for our purposes when the argument being case analyzed is given first.
|
||||
We can now use `find_matching` to define our `find_sum` function, which solves part 1.
|
||||
Here's the code:
|
||||
|
||||
{{< codelines "Coq" "aoc-coq/day1.v" 16 24 >}}
|
||||
{{< codelines "Coq" "aoc-2020/day1.v" 20 28 >}}
|
||||
|
||||
For every `x` that we encounter in our input list `is`, we want to check if there's
|
||||
a matching number in the rest of the list. We only search the remainder of the list
|
||||
@ -72,13 +72,13 @@ formally as follows:
|
||||
|
||||
And this is how we write it in Coq:
|
||||
|
||||
{{< codelines "Coq" "aoc-coq/day1.v" 26 27 >}}
|
||||
{{< codelines "Coq" "aoc-2020/day1.v" 30 31 >}}
|
||||
|
||||
The arrow, `->`, reads "implies". Other than that, I think this
|
||||
property reads pretty well. The proof, unfortunately, is a little bit more involved.
|
||||
Here are the first few lines:
|
||||
|
||||
{{< codelines "Coq" "aoc-coq/day1.v" 28 31 >}}
|
||||
{{< codelines "Coq" "aoc-2020/day1.v" 32 35 >}}
|
||||
|
||||
We start with the `intros is` tactic, which is akin to saying
|
||||
"consider a particular list of integers `is`". We do this without losing
|
||||
@ -157,7 +157,7 @@ is zero. This means we're done with the base case!
|
||||
The inductive case is the meat of this proof. Here's the corresponding part
|
||||
of the Coq source file:
|
||||
|
||||
{{< codelines "Coq" "aoc-coq/day1.v" 32 36 >}}
|
||||
{{< codelines "Coq" "aoc-2020/day1.v" 36 40 >}}
|
||||
|
||||
This time, the proof state is more complicated:
|
||||
|
||||
@ -284,14 +284,14 @@ Coq proofs is to actually step through them in the IDE!
|
||||
|
||||
First on the list is `find_matching_skip`. Here's the type:
|
||||
|
||||
{{< codelines "Coq" "aoc-coq/day1.v" 38 39 >}}
|
||||
{{< codelines "Coq" "aoc-2020/day1.v" 42 43 >}}
|
||||
|
||||
It reads: if we correctly find a number in a small list `is`, we can find that same number
|
||||
even if another number is prepended to `is`. That makes sense: _adding_ a number to
|
||||
a list doesn't remove whatever we found in it! I used this lemma to prove another,
|
||||
`find_matching_works`:
|
||||
|
||||
{{< codelines "Coq" "aoc-coq/day1.v" 49 50 >}}
|
||||
{{< codelines "Coq" "aoc-2020/day1.v" 53 54 >}}
|
||||
|
||||
This reads, if there _is_ an element `y` in `is` that adds up to `k` with `x`, then
|
||||
`find_matching` will find it. This is an important property. After all, if it didn't
|
||||
@ -310,7 +310,7 @@ that all lists from this Advent of Code puzzle are guaranteed to have two number
|
||||
add up to our goal, and that these numbers are not equal to each other. In Coq,
|
||||
we state this as follows:
|
||||
|
||||
{{< codelines "Coq" "aoc-coq/day1.v" 4 5 >}}
|
||||
{{< codelines "Coq" "aoc-2020/day1.v" 8 9 >}}
|
||||
|
||||
This defines a new property, `has_pair t is` (read "`is` has a pair of numbers that add to `t`"),
|
||||
which means:
|
||||
@ -323,7 +323,7 @@ which means:
|
||||
When making claims about the correctness of our algorithm, we will assume that this
|
||||
property holds. Finally, here's the theorem we want to prove:
|
||||
|
||||
{{< codelines "Coq" "aoc-coq/day1.v" 64 66 >}}
|
||||
{{< codelines "Coq" "aoc-2020/day1.v" 68 70 >}}
|
||||
|
||||
It reads, "for any total `k` and list `is`, if `is` has a pair of numbers that add to `k`,
|
||||
then `find_sum` will return a pair of numbers `x` and `y` that add to `k`".
|
||||
@ -335,7 +335,7 @@ we want to confirm that `find_sum` will find one of them. Finally, here is the p
|
||||
I will not be able to go through it in detail in this post, but I did comment it to
|
||||
make it easier to read:
|
||||
|
||||
{{< codelines "Coq" "aoc-coq/day1.v" 67 102 >}}
|
||||
{{< codelines "Coq" "aoc-2020/day1.v" 71 106 >}}
|
||||
|
||||
Coq seems happy with it, and so am I! The bug I mentioned earlier popped up on line 96.
|
||||
I had accidentally made `find_sum` return `None` if it couldn't find a complement
|
||||
|
@ -249,6 +249,9 @@ the program terminates in an "ok" state. Here's a rule for terminating in the "o
|
||||
{(c, a, v) \Rightarrow_{p} (c, a, v)}
|
||||
{{< /latex >}}
|
||||
|
||||
{{< todo >}}
|
||||
We can make this closer to the Coq version.
|
||||
{{< /todo >}}
|
||||
When our program counter reaches the end of the program, we are also done evaluating it. Even though
|
||||
both rules {{< sidenote "right" "redundant-note" "lead to the same conclusion," >}}
|
||||
In fact, if the end of the program is never included in the valid set, the second rule is completely redundant.
|
||||
@ -274,3 +277,237 @@ and will become \\(v - \\{c\\}\\).
|
||||
|
||||
If all of these conditions are met, our program, starting at \\((c, a, v)\\), will terminate in the state \\((c'', a'', v'')\\). This third rule completes our semantics; a program being executed will keep running instructions using the third rule, until it finally
|
||||
hits an invalid program counter (terminating with the first rule) or gets to the end of the program (terminating with the second rule).
|
||||
|
||||
#### Aside: Vectors and Finite \\(\mathbb{N}\\)
|
||||
We'll be getting to the Coq implementation of our semantics soon, but before we do:
|
||||
what type should \\(c\\) be? It's entirely possible for an instruction like \\(\\texttt{jmp} \\; -10000\\)
|
||||
to throw our program counter way before the first instruction of our program, so at first, it seems
|
||||
as though we should use an integer. But the prompt doesn't even specify what should happen in this
|
||||
case - it only says an instruction shouldn't be run twice. The "valid set", although it may help resolve
|
||||
this debate, is our invention, and isn't part of the original specification.
|
||||
|
||||
There is, however, something we can infer from this problem. Since the problem of jumping "too far behind" or
|
||||
"too far ahead" is never mentioned, we can assume that _all jumps will lead either to an instruction,
|
||||
or right to the end of a program_. This means that \\(c\\) is a natural number, with
|
||||
|
||||
{{< latex >}}
|
||||
0 \leq c \leq \text{length}(p)
|
||||
{{< /latex >}}
|
||||
|
||||
In a language like Coq, it's possible to represent such a number. Since we've gotten familliar with
|
||||
inference rules, let's present two rules that define such a number:
|
||||
|
||||
{{< latex >}}
|
||||
\frac
|
||||
{n \in \mathbb{N}^+}
|
||||
{Z : \text{Fin} \; n}
|
||||
\quad
|
||||
\frac
|
||||
{f : \text{Fin} \; n}
|
||||
{S f : \text{Fin} \; (n+1)}
|
||||
{{< /latex >}}
|
||||
|
||||
This is a variation of the [Peano encoding](https://wiki.haskell.org/Peano_numbers) of natural numbers.
|
||||
It reads as follows: zero (\\(Z\\)) is a finite natural number less than any positive natural number \\(n\\). Then, if a finite natural number
|
||||
\\(f\\) is less than \\(n\\), then adding one to that number (using the successor function \\(S\\))
|
||||
will create a natural number less than \\(n+1\\). We encode this in Coq as follows
|
||||
([originally from here](https://coq.inria.fr/library/Coq.Vectors.Fin.html#t)):
|
||||
|
||||
```Coq
|
||||
Inductive t : nat -> Set :=
|
||||
| F1 : forall {n}, t (S n)
|
||||
| FS : forall {n}, t n -> t (S n).
|
||||
```
|
||||
|
||||
The `F1` constructor here is equivalent to our \\(Z\\), and `FS` is equivalent to our \\(S\\).
|
||||
To represent positive natural numbers \\(\\mathbb{N}^+\\), we simply take a regular natural
|
||||
number from \\(\mathbb{N}\\) and find its successor using `S` (simply adding 1). Again, we have
|
||||
to explicitly use `forall` in our type signatures.
|
||||
|
||||
We can use a similar technique to represent a list with a known number of elements, known
|
||||
in the Idris and Coq world as a vector. Again, we only need two inference rules to define such
|
||||
a vector:
|
||||
|
||||
{{< latex >}}
|
||||
\frac
|
||||
{t : \text{Type}}
|
||||
{[] : \text{Vec} \; t \; 0}
|
||||
\quad
|
||||
\frac
|
||||
{x : \text{t} \quad \textit{xs} : \text{Vec} \; t \; n}
|
||||
{(x::\textit{xs}) : \text{Vec} \; t \; (n+1)}
|
||||
{{< /latex >}}
|
||||
|
||||
These rules read: the empty list \\([]\\) is zero-length vector of any type \\(t\\). Then,
|
||||
if we take an element \\(x\\) of type \\(t\\), and an \\(n\\)-long vector \\(\textit{xs}\\) of \\(t\\),
|
||||
then we can prepend \\(x\\) to \\(\textit{xs}\\) and get an \\((n+1)\\)-long vector of \\(t\\).
|
||||
In Coq, we write this as follows ([originally from here](https://coq.inria.fr/library/Coq.Vectors.VectorDef.html#t)):
|
||||
|
||||
```Coq
|
||||
Inductive t A : nat -> Type :=
|
||||
| nil : t A 0
|
||||
| cons : forall (h:A) (n:nat), t A n -> t A (S n).
|
||||
```
|
||||
|
||||
The `nil` constructor represents the empty list \\([]\\), and `cons` represents
|
||||
the operation of prepending an element (called `h` in the code in \\(x\\) in our inference rules)
|
||||
to another vector of length \\(n\\), which is remains unnamed in the code but is called \\(\\textit{xs}\\) in our rules.
|
||||
|
||||
These two definitions work together quite well. For instance, suppose we have a vector of length \\(n\\).
|
||||
If we were to access its elements by indices starting at 0, we'd be allowed to access indices 0 through \\(n-1\\).
|
||||
These are precisely the values of the finite natural numbers less than \\(n\\), \\(\\text{Fin} \\; n \\).
|
||||
Thus, given such an index \\(\\text{Fin} \\; n\\) and a vector \\(\\text{Vec} \\; t \\; n\\), we are guaranteed
|
||||
to be able to retrieve the element at the given index! In our code, we will not have to worry about bounds checking.
|
||||
|
||||
Of course, if our program has \\(n\\) elements, our program counter will be a finite number less than \\(n+1\\),
|
||||
since there's always the possibility of it pointing past the instructions, indicating that we've finished
|
||||
running the program. This leads to some minor complications: we can't safely access the program instruction
|
||||
at index \\(\\text{Fin} \\; (n+1)\\). We can solve this problem by considering two cases:
|
||||
either our index points one past the end of the program (in which case its value is exactly the finite
|
||||
representation of \\(n\\)), or it's less than \\(n\\), in which case we can "tighten" the upper bound,
|
||||
and convert that index into a \\(\\text{Fin} \\; n\\). We formalize it in a lemma:
|
||||
|
||||
{{< codelines "Coq" "aoc-2020/day8.v" 80 82 >}}
|
||||
|
||||
{{< todo >}}Prove this (at least informally) {{< /todo >}}
|
||||
|
||||
There's a little bit of a gotcha here. Instead of translating our above statement literally,
|
||||
and returning a value that's the result of "tightening" our input `f`, we return a value
|
||||
`f'` that can be "weakened" to `f`. This is because "tightening" is not a total function -
|
||||
it's not always possible to convert a \\(\\text{Fin} \\; (n+1)\\) into a \\(\\text{Fin} \\; n\\).
|
||||
However, "weakening" \\(\\text{Fin} \\; n\\) _is_ a total function, since a number less than \\(n\\)
|
||||
is, by the transitive property of a total order, also less than \\(n+1\\).
|
||||
|
||||
Next, let's talk about addition, specifically the kind of addition done by the \\(\\texttt{jmp}\\) instruction.
|
||||
We can always add an integer to a natural number, but we can at best guarantee that the result
|
||||
will be an integer. For instance, we can add `-1000` to `1`, and get `-999`, which is _not_ a natural
|
||||
number. We implement this kind of addition in a function called `jump_t`:
|
||||
|
||||
{{< codelines "Coq" "aoc-2020/day8.v" 56 56 >}}
|
||||
|
||||
At the moment, its definition is not particularly important. What is important, though,
|
||||
is that it takes a bounded natural number `pc` (our program counter), an integer `off`
|
||||
(the offset provided by the jump instruction) and returns another integer representing
|
||||
the final offset. Why are integers of type `t`? Well, it so happens
|
||||
that Coq provides facilities for working with arbitrary implementations of integers,
|
||||
without relying on how they are implemented under the hood. This can be seen in its
|
||||
[`Coq.ZArith.Int`](https://coq.inria.fr/library/Coq.ZArith.Int.html) module,
|
||||
which describes what functions and types an implementation of integers should provide.
|
||||
Among those is `t`, the type an integer in such an arbitrary implementation. We too
|
||||
will not make an assumption about how the integers are implemented, and simply
|
||||
use this generic `t` from now on.
|
||||
|
||||
#### Semantics in Coq
|
||||
|
||||
Now that we've seen finite sets and vectors, it's time to use them to
|
||||
encode our semantics in Coq. Let's start with jumps. Suppose we wanted to write a function that _does_ return a valid program
|
||||
counter after adding the offset to it. Since it's possible for this function to fail
|
||||
(for instance, if the offset is very negative), it has to return `option (fin (S n))`.
|
||||
That is, this function may either fail (returning `None`) or succeed, returning
|
||||
`Some f`, where `f` is of type `fin (S n)`, aka \\(\\text{Fin} \\; (n + 1)\\). Here's
|
||||
the function in Coq (again, don't worry too much about the definition):
|
||||
|
||||
{{< codelines "Coq" "aoc-2020/day8.v" 61 61 >}}
|
||||
|
||||
But earlier, didn't we say:
|
||||
|
||||
> All jumps will lead either to an instruction, or right to the end of a program.
|
||||
|
||||
To make Coq aware of this constraint, we'll have to formalize this notion. To
|
||||
start off, we'll define the notion of a "valid instruction", which is guaranteed
|
||||
to keep the program counter in the correct range.
|
||||
There are a couple of ways to do this, but we'll use yet another definition based
|
||||
on inference rules. First, though, observe that the same instruction may be valid
|
||||
for one program, and invalid for another. For instance, \\(\\texttt{jmp} \\; 100\\)
|
||||
is perfectly valid for a program with thousands of instructions, but if it occurs
|
||||
in a program with only 3 instructions, it will certainly lead to disaster. Specifically,
|
||||
the validity of an instruction depends on the length of the program in which it resides,
|
||||
and the program counter at which it's encountered.
|
||||
Thus, we refine our idea of validity to "being valid for a program of length n at program counter f".
|
||||
For this, we can use the following two inference rules:
|
||||
|
||||
{{< latex >}}
|
||||
\frac
|
||||
{c : \text{Fin} \; n}
|
||||
{\texttt{acc} \; t \; \text{valid for} \; n, c }
|
||||
\quad
|
||||
\frac
|
||||
{c : \text{Fin} \; n \quad o \in \{\texttt{nop}, \texttt{jmp}\} \quad J_v(c, t) = \text{Some} \; c' }
|
||||
{o \; t \; \text{valid for} \; n, c }
|
||||
{{< /latex >}}
|
||||
|
||||
The first rule states that if a program has length \\(n\\), then it's valid
|
||||
at any program counter whose value is less than \\(n\\). This is because running
|
||||
\\(\\texttt{add}\\) will increment the program counter \\(c\\) by 1,
|
||||
and thus, create a new program counter that's less than \\(n+1\\),
|
||||
which, as we discussed above, is perfectly valid.
|
||||
|
||||
The second rule works for the other two instructions. It has an extra premise:
|
||||
the result of `jump_valid_t` (written as \\(J_v\\)) has to be \\(\\text{Some} \\; c'\\),
|
||||
that is, `jump_valid_t` must succeed. Now, if an instruction satisfies these validity
|
||||
rules for a given program at a given program counter, evaluating it will always
|
||||
result in a program counter that has a proper value.
|
||||
|
||||
We encode this in Coq as follows:
|
||||
|
||||
{{< codelines "Coq" "aoc-2020/day8.v" 152 157 >}}
|
||||
|
||||
Note that we have three rules instead of two. This is because we "unfolded"
|
||||
\\(o\\) from our second rule: rather than using set notation (or "or"), we
|
||||
just generated two rules that vary in nothing but the operation involved.
|
||||
|
||||
Of course, we must have that every instruction in a program is valid.
|
||||
We don't really need inference rules for this, as much as a "forall" quantifier.
|
||||
A program \\(p\\) of length \\(n\\) is valid if the following holds:
|
||||
|
||||
{{< latex >}}
|
||||
\forall (c : \text{Fin} \; n). p[c] \; \text{valid for} \; n, c
|
||||
{{< /latex >}}
|
||||
|
||||
That is, for every possible in-bounds program counter \\(c\\),
|
||||
the instruction at the program counter is valid. We can now
|
||||
encode this in Coq, too:
|
||||
|
||||
{{< codelines "Coq" "aoc-2020/day8.v" 160 161 >}}
|
||||
|
||||
In the above, we use `input n` to mean "a program of length `n`".
|
||||
This is just an alias for `vect inst n`, a vector of instructions
|
||||
of length `n`. In the above, `n` is made implicit where possible.
|
||||
Since \\(c\\) (called `pc` in the code) is of type \\(\\text{Fin} \\; n\\), there's no
|
||||
need to write \\(n\\) _again_.
|
||||
|
||||
Finally, it's time to get started on the semantics themselves.
|
||||
We start with the inductive definition of \\((\\rightarrow_i)\\).
|
||||
I think this is fairly straightforward. We use
|
||||
`t` instead of \\(n\\) from the rules, and we use `FS`
|
||||
instead of \\(+1\\). Also, we make the formerly implicit
|
||||
assumption that \\(c+n\\) is valid explicit, by
|
||||
providing a proof that `valid_jump_t pc t = Some pc'`.
|
||||
|
||||
{{< codelines "Coq" "aoc-2020/day8.v" 103 110 >}}
|
||||
|
||||
Next, it will help us to combine the premises for a
|
||||
"failed" and "ok" terminations into Coq data types.
|
||||
This will help us formulate a lemma later on. Here they are:
|
||||
|
||||
{{< codelines "Coq" "aoc-2020/day8.v" 112 117 >}}
|
||||
|
||||
Since all of out "termination" rules start and
|
||||
end in the same state, there's no reason to
|
||||
write that state twice. Thus, both `done`
|
||||
and `stuck` only take the input `inp`,
|
||||
and the state, which includes the accumulator
|
||||
`acc`, set of allowed program counters `v`, and
|
||||
the program counter at which the program came to an end.
|
||||
When the program terminates successfully, this program
|
||||
counter will be equal to the length of the program `n`,
|
||||
so we use `nat_to_fin n`. On the other hand, if the program
|
||||
terminates in as stuck state, it must be that it terminated
|
||||
at a program counter that points to an instruction. Thus, this
|
||||
program counter is actually a \\(\\text{Fin} \\; n\\), and not
|
||||
a \\(\\text{Fin} \\ (n+1)\\) (we use the same "weakening" trick
|
||||
we saw earlier), and is not in the set of allowed program counters.
|
||||
|
||||
Finally, we encode the three inference rules we came up with:
|
||||
|
||||
{{< codelines "Coq" "aoc-2020/day8.v" 119 126 >}}
|
||||
|
Loading…
Reference in New Issue
Block a user