Finish up draft of Coq post.

2021-01-10 22:48:22 -08:00 · 2021-01-10 22:48:22 -08:00 · dcb1e9a736
commit dcb1e9a736
parent c8543961af
1 changed files with 91 additions and 23 deletions
--- a/content/blog/01_aoc_coq.md
+++ b/content/blog/01_aoc_coq.md
@ -39,7 +39,7 @@ we go about using _inference rules_. Let's talk about those next.

 #### Inference Rules
 Inference rules are a very general notion. They describe how we can determine (infer) a conclusion
-from a set of assumption. It helps to look at an example. Here's a silly little inference rule:
+from a set of assumptions. It helps to look at an example. Here's a silly little inference rule:

 {{< latex >}}
 \frac
@ -181,7 +181,7 @@ it will add 0 to the accumulator (keeping it the same),
 do nothing, and finally jump back to the beginning. At this point, it will try to run the addition instruction again,
 which is not allowed; thus, the program will terminate.

-Did you catch that? The semantics of this language will require more information than just our program itself (which we'll denote \\(p\\)).
+Did you catch that? The semantics of this language will require more information than just our program itself (which we'll denote by \\(p\\)).
 * First, to evaluate the program we will need a program counter, \\(\\textit{c}\\). This program counter
 will tell us the position of the instruction to be executed next. It can also point past the last instruction,
 which means our program terminated successfully. 
@ -242,16 +242,30 @@ is done evaluating, and is in a "failed" state.

 We use \\(\\text{length}(p)\\) to represent the number of instructions in \\(p\\). Note the second premise:
 even if our program counter \\(c\\) is not included in the valid set, if it's "past the end of the program",
-the program terminates in an "ok" state. Here's a rule for terminating in the "ok" state:
+the program terminates in an "ok" state.
+{{< sidenote "left" "avoid-c-note" "Here's a rule for terminating in the \"ok\" state:" >}}
+In the presented rule, we don't use the variable <code>c</code> all that much, and we know its concrete
+value (from the equality premise). We could thus avoid introducing the name \(c\) by
+replacing it with said known value:
+
+{{< latex >}}
+\frac{}
+{(\text{length}(p), a, v) \Rightarrow_{p} (\text{length}(p), a, v)}
+{{< /latex >}}
+
+This introduces some duplication, but that is really because all "base case" evaluation rules
+start and stop in the same state. To work around this, we could define a separate proposition
+to mean "program \(p\) is done in state \(s\)", then \(s\) will really only need to occur once,
+and so will \(\text{length}(p)\). This is, in fact, what we will do later on,
+since being able to talk about "programs being done" will help us with
+components of our proof.
+{{< /sidenote >}}

 {{< latex >}}
 \frac{c = \text{length}(p)}
 {(c, a, v) \Rightarrow_{p} (c, a, v)}
 {{< /latex >}}

-{{< todo >}}
-We can make this closer to the Coq version.
-{{< /todo >}}
 When our program counter reaches the end of the program, we are also done evaluating it. Even though
 both rules {{< sidenote "right" "redundant-note" "lead to the same conclusion," >}}
 In fact, if the end of the program is never included in the valid set, the second rule is completely redundant.
@ -350,8 +364,8 @@ Inductive t A : nat -> Type :=
 ```

 The `nil` constructor represents the empty list \\([]\\), and `cons` represents
-the operation of prepending an element (called `h` in the code in \\(x\\) in our inference rules)
-to another vector of length \\(n\\), which is remains unnamed in the code but is called \\(\\textit{xs}\\) in our rules.
+the operation of prepending an element (called `h` in the code and \\(x\\) in our inference rules)
+to another vector of length \\(n\\), which remains unnamed in the code but is called \\(\\textit{xs}\\) in our rules.

 These two definitions work together quite well. For instance, suppose we have a vector of length \\(n\\).
 If we were to access its elements by indices starting at 0, we'd be allowed to access indices 0 through \\(n-1\\).
@ -369,8 +383,6 @@ and convert that index into a \\(\\text{Fin} \\; n\\). We formalize it in a lemm

 {{< codelines "Coq" "aoc-2020/day8.v" 80 82 >}}

-{{< todo >}}Prove this (at least informally) {{< /todo >}}
-
 There's a little bit of a gotcha here. Instead of translating our above statement literally,
 and returning a value that's the result of "tightening" our input `f`, we return a value
 `f'` that can be "weakened" to `f`. This is because "tightening" is not a total function - 
@ -378,6 +390,29 @@ it's not always possible to convert a \\(\\text{Fin} \\; (n+1)\\) into a \\(\\te
 However, "weakening" \\(\\text{Fin} \\; n\\) _is_ a total function, since a number less than \\(n\\)
 is, by the transitive property of a total order, also less than \\(n+1\\).
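+
+As a rough sketch of why weakening is total (this is an illustration written against the standard
+library's `Fin.t`, not the code from `day8.v`; the name `weaken` is made up), it can be written as a
+structurally recursive function that has no failure case:
+
+```Coq
+Require Coq.Vectors.Fin.
+
+(* Hypothetical helper: view a number below n as a number below S n. *)
+Fixpoint weaken {n : nat} (f : Fin.t n) : Fin.t (S n) :=
+  match f with
+  | Fin.F1 => Fin.F1
+  | Fin.FS f' => Fin.FS (weaken f')
+  end.
+
+(* Zero below 1 is still zero below 2. *)
+Example weaken_zero : weaken (Fin.F1 : Fin.t 1) = (Fin.F1 : Fin.t 2).
+Proof. reflexivity. Qed.
+```
+
+A function in the other direction would have to return an `option`, since the largest
+\\(\\text{Fin} \\; (n+1)\\) has no counterpart in \\(\\text{Fin} \\; n\\).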

+The Coq proof for this claim is as follows:
+
+{{< codelines "Coq" "aoc-2020/day8.v" 88 97 >}}
+
+The `Fin.rectS` function is a convenient way to perform inductive proofs over
+our finite natural numbers. Informally, our proof proceeds as follows:
+
+* If the current finite natural number is zero, take a look at the "bound" (which
+we assume is nonzero, since there isn't a natural number less than zero).
+    * If this "bounding number" is one, our `f` can't be tightened any further,
+    since doing so would create a number less than zero. Fortunately, in this case,
+    `n` must be `0`, so `f` is the finite representation of `n`.
+    * Otherwise, `f` is most definitely a weakened version of another `f'`,
+    since the tightest possible type for zero has a "bounding number" of one, and
+    our "bounding number" is greater than that. We return a tighter version of our finite zero.
+* If our number is a successor of another finite number, we check if that other number
+can itself be tightened.
+    * If it can't be tightened, then our smaller number is a finite representation of
+    `n-1`. This, in turn, means that adding one to it will be the finite representation
+    of `n` (if \\(x\\) is equal to \\(n-1\\), then \\(x+1\\) is equal to \\(n\\)).
+    * If it _can_ be tightened, then so can the successor (if \\(x\\) is less
+    than \\(n-1\\), then \\(x+1\\) is less than \\(n\\)).
+
 Next, let's talk about addition, specifically the kind of addition done by the \\(\\texttt{jmp}\\) instruction.
 We can always add an integer to a natural number, but we can at best guarantee that the result
 will be an integer. For instance, we can add `-1000` to `1`, and get `-999`, which is _not_ a natural
@ -393,7 +428,7 @@ that Coq provides facilities for working with arbitrary implementations of integ
 without relying on how they are implemented under the hood. This can be seen in its
 [`Coq.ZArith.Int`](https://coq.inria.fr/library/Coq.ZArith.Int.html) module,
 which describes what functions and types an implementation of integers should provide.
-Among those is `t`, the type an integer in such an arbitrary implementation. We too
+Among those is `t`, the type of an integer in such an arbitrary implementation. We too
 will not make an assumption about how the integers are implemented, and simply
 use this generic `t` from now on.

@ -453,7 +488,7 @@ providing a proof that `valid_jump_t pc t = Some pc'`.

 {{< codelines "Coq" "aoc-2020/day8.v" 103 110 >}}

-Next, it will help us to combine the premises for a
+Next, it will help us to combine the premises for
 "failed" and "ok" terminations into Coq data types.
 This will make it easier for us to formulate a lemma later on.
 Here are the definitions:
@ -465,7 +500,7 @@ end in the same state, there's no reason to
 write that state twice. Thus, both `done`
 and `stuck` only take the input `inp`,
 and the state, which includes the accumulator
-`acc`, set of allowed program counters `v`, and
+`acc`, the set of allowed program counters `v`, and
 the program counter at which the program came to an end.
 When the program terminates successfully, this program
 counter will be equal to the length of the program `n`,
@ -483,7 +518,7 @@ Finally, we encode the three inference rules we came up with:

 Notice that we fused two of the premises in the last rule.
 Instead of naming the instruction at the current program
-counter and using it in another premise, we simply use
+counter (by writing \\(p[c] = i\\)) and using it in another premise, we simply use
 `nth inp pc`, which corresponds to \\(p[c]\\) in our
 "paper" semantics.

@ -508,14 +543,14 @@ For this, we can use the following two inference rules:
 {{< latex >}}
 \frac
 {c : \text{Fin} \; n}
-{\texttt{acc} \; t \; \text{valid for} \; n, c }
+{\texttt{add} \; t \; \text{valid for} \; n, c }
 \quad
 \frac
 {c : \text{Fin} \; n \quad o \in \{\texttt{nop}, \texttt{jmp}\} \quad J_v(c, t) = \text{Some} \; c' }
 {o \; t \; \text{valid for} \; n, c }
 {{< /latex >}}

-The first rule states that if a program has length \\(n\\), then it's valid
+The first rule states that if a program has length \\(n\\), then \\(\\texttt{add}\\) is valid
 at any program counter whose value is less than \\(n\\). This is because running
 \\(\\texttt{add}\\) will increment the program counter \\(c\\) by 1,
 and thus, create a new program counter that's less than \\(n+1\\),
@ -524,7 +559,7 @@ which, as we discussed above, is perfectly valid.
 The second rule works for the other two instructions. It has an extra premise:
 the result of `jump_valid_t` (written as \\(J_v\\)) has to be \\(\\text{Some} \\; c'\\),
 that is, `jump_valid_t` must succeed. Note that we require this even for no-ops,
-since it later turns out of the them may be a jump after all.
+since it later turns out that one of them may be a jump after all.

 We now have our validity rules. If an instruction satisfies them for a given program
 and at a given program counter, evaluating it will always result in a program counter that has a proper value.
@ -576,7 +611,7 @@ available to all of the proofs we write in this section.
 The first proof is rather simple. The claim is:

 > For our valid program, at any program counter `pc`
-and accumulator `acc`, there must exists another program
+and accumulator `acc`, there must exist another program
 counter `pc'` and accumulator `acc'` such that the
 instruction evaluation relation \\((\rightarrow_i)\\)
 connects the two. That is, valid addresses aside,
@ -676,7 +711,7 @@ That is, `(jmp, t0)` is a valid instruction at `pc`. Then, using
 Coq's `inversion` tactic, we ask: how is this possible? There is
 only one inference rule that gives us such a conclusion, and it is named `valid_inst_jmp`
 in our Coq code. Since we have a proof that our `jmp` is valid,
-it must mean that this rule was used. Furthermore, sicne this
+it must mean that this rule was used. Furthermore, since this
 rule requires that `valid_jump_t` evaluates to `Some f'`, we know
 that this must be the case here! Coq now adds the following
 two lines to our proof state:
@ -820,7 +855,7 @@ are fairly trivial:
 {{< codelines "Coq" "aoc-2020/day8.v" 237 240 >}}

 We basically connect the dots between the premises (in a form like `done`)
-and the corresponding inference rule (`run_noswap_done`). The more
+and the corresponding inference rule (`run_noswap_ok`). The more
 interesting case is when we can take a step.

 {{< codelines "Coq" "aoc-2020/day8.v" 241 253 >}}
@ -864,10 +899,43 @@ this proof will __return to us the final program counter and accumulator!__
 This is precisely what we'd need to solve part 1.

 But wait, almost? What's missing? We're missing a few implementation details:
-* We've not provided a concrete impelmentation of integers.
+* We've not provided a concrete implementation of integers. The simplest
+thing to do here would be to use [`Coq.ZArith.BinInt`](https://coq.inria.fr/library/Coq.ZArith.BinInt.html),
+for which there is a module [`Z_as_Int`](https://coq.inria.fr/library/Coq.ZArith.Int.html#Z_as_Int)
+that provides `t` and friends.
 * We assumed (reasonably, I would say) that it's possible to convert a natural
-number to an integer.
+number to an integer. If we're using the aforementioned `BinInt` module,
+we can use [`Z.of_nat`](https://coq.inria.fr/library/Coq.ZArith.BinIntDef.html#Z.of_nat).
 * We also assumed (still reasonably) that we can try to convert an integer
 back to a finite natural number, failing if it's too small or too large.
+There's no built-in function for this, but `Z`, for one, distinguishes
+between the "positive", "zero", and "negative" cases, and we have
+`Pos.to_nat` for the positive case.
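+
+A minimal sketch of what such a conversion might look like, assuming the standard library's `Z`,
+`lt_dec`, and `Fin.of_nat_lt` (the names `z_to_nat` and `z_to_fin` are made up for this
+illustration and do not appear in `day8.v`):
+
+```Coq
+Require Import Coq.ZArith.ZArith.
+Require Import Coq.Arith.Compare_dec.
+Require Coq.Vectors.Fin.
+
+(* Hypothetical helper: fails on negative integers. *)
+Definition z_to_nat (z : Z) : option nat :=
+  match z with
+  | Z0 => Some 0%nat
+  | Zpos p => Some (Pos.to_nat p)
+  | Zneg _ => None
+  end.
+
+(* Hypothetical helper: additionally fails if the number is not below n. *)
+Definition z_to_fin (n : nat) (z : Z) : option (Fin.t n) :=
+  match z_to_nat z with
+  | Some m =>
+      match lt_dec m n with
+      | left H => Some (Fin.of_nat_lt H)
+      | right _ => None
+      end
+  | None => None
+  end.
+
+(* A negative integer is "too small" no matter the bound. *)
+Example negative_fails : z_to_fin 3 (-1)%Z = None.
+Proof. reflexivity. Qed.
+```
+
+The "too large" case is handled by `lt_dec`, which decides whether the converted number
+is below the bound and hands us the proof that `Fin.of_nat_lt` needs.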

-{{< todo >}}Finish up{{< /todo >}}
+Well, I seem to have covered all the implementation details. Why not just
+go ahead and solve the problem? I tried, and ran into two issues:
+
+* Although this is "given", we assumed that our input program will be
+valid. For us to use the result of our Coq proof, we need to provide it
+a constructive proof that our program is valid. Creating this proof is tedious
+in theory, and quite difficult in practice: I've run into a
+strange issue trying to pattern match on finite naturals.
+* Even supposing we _do_ have a proof of validity, I'm not certain
+if it's possible to actually extract an answer from it. It seems
+that Coq distinguishes between proofs (things of type `Prop`) and
+values (things of type `Set`). Things of type `Prop` are supposed
+to be _erased_. This means that when you convert Coq code
+to, say, Haskell, you will see no trace of any `Prop`s in that generated
+code. Unfortunately, this also means we
+[can't use our proofs to construct values](https://stackoverflow.com/questions/27322979/why-coq-doesnt-allow-inversion-destruct-etc-when-the-goal-is-a-type),
+even though our proof objects do indeed contain them.
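+
+Here is a small, self-contained illustration of that restriction. It is only a sketch and is
+unrelated to the definitions in `day8.v`; the names `witness_of_ex` and `witness_of_sig`
+are made up. Taking apart an `exists` proof to produce a number is rejected, while the
+analogous `sig` type, which is not a `Prop`, can be projected from freely:
+
+```Coq
+(* An existence proof lives in Prop, so we may not match on it to build a nat.
+   The Fail prefix makes this command succeed only if the definition is rejected. *)
+Fail Definition witness_of_ex (P : nat -> Prop) (H : exists n, P n) : nat :=
+  match H with
+  | ex_intro _ n _ => n
+  end.
+
+(* The type {n : nat | P n} carries its witness as data, so this is accepted. *)
+Definition witness_of_sig (P : nat -> Prop) (H : {n : nat | P n}) : nat :=
+  proj1_sig H.
+```
+
+Under this reading, getting an actual answer out would presumably require restating
+the results with `sig` (or a similar type) instead of `exists`.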
+
+So, we "theoretically" have a solution to part 1, down to the algorithm
+used to compute it and a proof that our algorithm works. In "reality", though, we
+can't actually use this solution to procure an answer. Like we did with day 1, we'll have
+to settle for only a proof.
+
+Let's wrap up for this post. It would be more interesting to devise and
+formally verify an algorithm for part 2, but this post has already gotten
+quite long and contains a lot of information. Perhaps I will revisit this
+at a later time. Thanks for reading!