|
|
@ -249,6 +249,9 @@ the program terminates in an "ok" state. Here's a rule for terminating in the "o |
|
|
|
{(c, a, v) \Rightarrow_{p} (c, a, v)} |
|
|
|
{{< /latex >}} |
|
|
|
|
|
|
|
{{< todo >}} |
|
|
|
We can make this closer to the Coq version. |
|
|
|
{{< /todo >}} |
|
|
|
When our program counter reaches the end of the program, we are also done evaluating it. Even though |
|
|
|
both rules {{< sidenote "right" "redundant-note" "lead to the same conclusion," >}} |
|
|
|
In fact, if the end of the program is never included in the valid set, the second rule is completely redundant. |
|
|
@ -284,7 +287,7 @@ case - it only says an instruction shouldn't be run twice. The "valid set", alth |
|
|
|
this debate, is our invention, and isn't part of the original specification. |
|
|
|
|
|
|
|
There is, however, something we can infer from this problem. Since the problem of jumping "too far behind" or |
|
|
|
"too far ahead" is never mentioned, we can assume that _all jumps will lead either to a valid instruction, |
|
|
|
"too far ahead" is never mentioned, we can assume that _all jumps will lead either to an instruction, |
|
|
|
or right to the end of a program_. This means that \\(c\\) is a natural number, with |
|
|
|
|
|
|
|
{{< latex >}} |
|
|
@ -355,3 +358,156 @@ If we were to access its elements by indices starting at 0, we'd be allowed to a |
|
|
|
These are precisely the values of the finite natural numbers less than \\(n\\), \\(\\text{Fin} \\; n \\). |
|
|
|
Thus, given such an index \\(\\text{Fin} \\; n\\) and a vector \\(\\text{Vec} \\; t \\; n\\), we are guaranteed |
|
|
|
to be able to retrieve the element at the given index! In our code, we will not have to worry about bounds checking. |
|
|
|
|
|
|
|
Of course, if our program has \\(n\\) elements, our program counter will be a finite number less than \\(n+1\\), |
|
|
|
since there's always the possibility of it pointing past the instructions, indicating that we've finished |
|
|
|
running the program. This leads to some minor complications: we can't safely access the program instruction |
|
|
|
at index \\(\\text{Fin} \\; (n+1)\\). We can solve this problem by considering two cases: |
|
|
|
either our index points one past the end of the program (in which case its value is exactly the finite |
|
|
|
representation of \\(n\\)), or it's less than \\(n\\), in which case we can "tighten" the upper bound, |
|
|
|
and convert that index into a \\(\\text{Fin} \\; n\\). We formalize it in a lemma: |
|
|
|
|
|
|
|
{{< codelines "Coq" "aoc-2020/day8.v" 80 82 >}} |
|
|
|
|
|
|
|
{{< todo >}}Prove this (at least informally) {{< /todo >}} |
|
|
|
|
|
|
|
There's a little bit of a gotcha here. Instead of translating our above statement literally, |
|
|
|
and returning a value that's the result of "tightening" our input `f`, we return a value |
|
|
|
`f'` that can be "weakened" to `f`. This is because "tightening" is not a total function - |
|
|
|
it's not always possible to convert a \\(\\text{Fin} \\; (n+1)\\) into a \\(\\text{Fin} \\; n\\). |
|
|
|
However, "weakening" \\(\\text{Fin} \\; n\\) _is_ a total function, since a number less than \\(n\\) |
|
|
|
is, by the transitive property of a total order, also less than \\(n+1\\). |
|
|
|
|
|
|
|
Next, let's talk about addition, specifically the kind of addition done by the \\(\\texttt{jmp}\\) instruction. |
|
|
|
We can always add an integer to a natural number, but we can at best guarantee that the result |
|
|
|
will be an integer. For instance, we can add `-1000` to `1`, and get `-999`, which is _not_ a natural |
|
|
|
number. We implement this kind of addition in a function called `jump_t`: |
|
|
|
|
|
|
|
{{< codelines "Coq" "aoc-2020/day8.v" 56 56 >}} |
|
|
|
|
|
|
|
At the moment, its definition is not particularly important. What is important, though, |
|
|
|
is that it takes a bounded natural number `pc` (our program counter), an integer `off` |
|
|
|
(the offset provided by the jump instruction) and returns another integer representing |
|
|
|
the final offset. Why are integers of type `t`? Well, it so happens |
|
|
|
that Coq provides facilities for working with arbitrary implementations of integers, |
|
|
|
without relying on how they are implemented under the hood. This can be seen in its |
|
|
|
[`Coq.ZArith.Int`](https://coq.inria.fr/library/Coq.ZArith.Int.html) module, |
|
|
|
which describes what functions and types an implementation of integers should provide. |
|
|
|
Among those is `t`, the type an integer in such an arbitrary implementation. We too |
|
|
|
will not make an assumption about how the integers are implemented, and simply |
|
|
|
use this generic `t` from now on. |
|
|
|
|
|
|
|
#### Semantics in Coq |
|
|
|
|
|
|
|
Now that we've seen finite sets and vectors, it's time to use them to |
|
|
|
encode our semantics in Coq. Let's start with jumps. Suppose we wanted to write a function that _does_ return a valid program |
|
|
|
counter after adding the offset to it. Since it's possible for this function to fail |
|
|
|
(for instance, if the offset is very negative), it has to return `option (fin (S n))`. |
|
|
|
That is, this function may either fail (returning `None`) or succeed, returning |
|
|
|
`Some f`, where `f` is of type `fin (S n)`, aka \\(\\text{Fin} \\; (n + 1)\\). Here's |
|
|
|
the function in Coq (again, don't worry too much about the definition): |
|
|
|
|
|
|
|
{{< codelines "Coq" "aoc-2020/day8.v" 61 61 >}} |
|
|
|
|
|
|
|
But earlier, didn't we say: |
|
|
|
|
|
|
|
> All jumps will lead either to an instruction, or right to the end of a program. |
|
|
|
|
|
|
|
To make Coq aware of this constraint, we'll have to formalize this notion. To |
|
|
|
start off, we'll define the notion of a "valid instruction", which is guaranteed |
|
|
|
to keep the program counter in the correct range. |
|
|
|
There are a couple of ways to do this, but we'll use yet another definition based |
|
|
|
on inference rules. First, though, observe that the same instruction may be valid |
|
|
|
for one program, and invalid for another. For instance, \\(\\texttt{jmp} \\; 100\\) |
|
|
|
is perfectly valid for a program with thousands of instructions, but if it occurs |
|
|
|
in a program with only 3 instructions, it will certainly lead to disaster. Specifically, |
|
|
|
the validity of an instruction depends on the length of the program in which it resides, |
|
|
|
and the program counter at which it's encountered. |
|
|
|
Thus, we refine our idea of validity to "being valid for a program of length n at program counter f". |
|
|
|
For this, we can use the following two inference rules: |
|
|
|
|
|
|
|
{{< latex >}} |
|
|
|
\frac |
|
|
|
{c : \text{Fin} \; n} |
|
|
|
{\texttt{acc} \; t \; \text{valid for} \; n, c } |
|
|
|
\quad |
|
|
|
\frac |
|
|
|
{c : \text{Fin} \; n \quad o \in \{\texttt{nop}, \texttt{jmp}\} \quad J_v(c, t) = \text{Some} \; c' } |
|
|
|
{o \; t \; \text{valid for} \; n, c } |
|
|
|
{{< /latex >}} |
|
|
|
|
|
|
|
The first rule states that if a program has length \\(n\\), then it's valid |
|
|
|
at any program counter whose value is less than \\(n\\). This is because running |
|
|
|
\\(\\texttt{add}\\) will increment the program counter \\(c\\) by 1, |
|
|
|
and thus, create a new program counter that's less than \\(n+1\\), |
|
|
|
which, as we discussed above, is perfectly valid. |
|
|
|
|
|
|
|
The second rule works for the other two instructions. It has an extra premise: |
|
|
|
the result of `jump_valid_t` (written as \\(J_v\\)) has to be \\(\\text{Some} \\; c'\\), |
|
|
|
that is, `jump_valid_t` must succeed. Now, if an instruction satisfies these validity |
|
|
|
rules for a given program at a given program counter, evaluating it will always |
|
|
|
result in a program counter that has a proper value. |
|
|
|
|
|
|
|
We encode this in Coq as follows: |
|
|
|
|
|
|
|
{{< codelines "Coq" "aoc-2020/day8.v" 152 157 >}} |
|
|
|
|
|
|
|
Note that we have three rules instead of two. This is because we "unfolded" |
|
|
|
\\(o\\) from our second rule: rather than using set notation (or "or"), we |
|
|
|
just generated two rules that vary in nothing but the operation involved. |
|
|
|
|
|
|
|
Of course, we must have that every instruction in a program is valid. |
|
|
|
We don't really need inference rules for this, as much as a "forall" quantifier. |
|
|
|
A program \\(p\\) of length \\(n\\) is valid if the following holds: |
|
|
|
|
|
|
|
{{< latex >}} |
|
|
|
\forall (c : \text{Fin} \; n). p[c] \; \text{valid for} \; n, c |
|
|
|
{{< /latex >}} |
|
|
|
|
|
|
|
That is, for every possible in-bounds program counter \\(c\\), |
|
|
|
the instruction at the program counter is valid. We can now |
|
|
|
encode this in Coq, too: |
|
|
|
|
|
|
|
{{< codelines "Coq" "aoc-2020/day8.v" 160 161 >}} |
|
|
|
|
|
|
|
In the above, we use `input n` to mean "a program of length `n`". |
|
|
|
This is just an alias for `vect inst n`, a vector of instructions |
|
|
|
of length `n`. In the above, `n` is made implicit where possible. |
|
|
|
Since \\(c\\) (called `pc` in the code) is of type \\(\\text{Fin} \\; n\\), there's no |
|
|
|
need to write \\(n\\) _again_. |
|
|
|
|
|
|
|
Finally, it's time to get started on the semantics themselves. |
|
|
|
We start with the inductive definition of \\((\\rightarrow_i)\\). |
|
|
|
I think this is fairly straightforward. We use |
|
|
|
`t` instead of \\(n\\) from the rules, and we use `FS` |
|
|
|
instead of \\(+1\\). Also, we make the formerly implicit |
|
|
|
assumption that \\(c+n\\) is valid explicit, by |
|
|
|
providing a proof that `valid_jump_t pc t = Some pc'`. |
|
|
|
|
|
|
|
{{< codelines "Coq" "aoc-2020/day8.v" 103 110 >}} |
|
|
|
|
|
|
|
Next, it will help us to combine the premises for a |
|
|
|
"failed" and "ok" terminations into Coq data types. |
|
|
|
This will help us formulate a lemma later on. Here they are: |
|
|
|
|
|
|
|
{{< codelines "Coq" "aoc-2020/day8.v" 112 117 >}} |
|
|
|
|
|
|
|
Since all of out "termination" rules start and |
|
|
|
end in the same state, there's no reason to |
|
|
|
write that state twice. Thus, both `done` |
|
|
|
and `stuck` only take the input `inp`, |
|
|
|
and the state, which includes the accumulator |
|
|
|
`acc`, set of allowed program counters `v`, and |
|
|
|
the program counter at which the program came to an end. |
|
|
|
When the program terminates successfully, this program |
|
|
|
counter will be equal to the length of the program `n`, |
|
|
|
so we use `nat_to_fin n`. On the other hand, if the program |
|
|
|
terminates in as stuck state, it must be that it terminated |
|
|
|
at a program counter that points to an instruction. Thus, this |
|
|
|
program counter is actually a \\(\\text{Fin} \\; n\\), and not |
|
|
|
a \\(\\text{Fin} \\ (n+1)\\) (we use the same "weakening" trick |
|
|
|
we saw earlier), and is not in the set of allowed program counters. |
|
|
|
|
|
|
|
Finally, we encode the three inference rules we came up with: |
|
|
|
|
|
|
|
{{< codelines "Coq" "aoc-2020/day8.v" 119 126 >}} |
|
|
|