Get started on a post about a UCC evaluator
This commit is contained in:
parent
30c395151d
commit
6c1940f5d2
274
content/blog/coq_dawn_eval.md
Normal file
274
content/blog/coq_dawn_eval.md
Normal file
|
@ -0,0 +1,274 @@
|
||||||
|
---
|
||||||
|
title: "A Verified Evaluator for the Untyped Concatenative Calculus"
|
||||||
|
date: 2021-11-27T20:24:57-08:00
|
||||||
|
draft: true
|
||||||
|
tags: ["Dawn", "Coq", "Programming Languages"]
|
||||||
|
---
|
||||||
|
|
||||||
|
Earlier, I wrote [an article]({{< relref "./coq_dawn" >}}) in which I used Coq to
|
||||||
|
encode the formal semantics of [Dawn's Untyped Concatenative Calculus](https://www.dawn-lang.org/posts/foundations-ucc/),
|
||||||
|
and to prove a few thing about the mini-language. Though I'm quite happy with how that turned out,
|
||||||
|
my article was missing something that's present on the original Dawn page -- an evaluator. In this article, I'll define
|
||||||
|
an evaluator function in Coq, prove that it's equivalent to Dawn's formal semantics,
|
||||||
|
and extract all of this into usable Haskell code.
|
||||||
|
|
||||||
|
### Changes Since Last Time
|
||||||
|
In trying to write and verify this evaluator, I decided to make changes to the way I formalized
|
||||||
|
the UCC. Remember how we used a predicate, `IsValue`, to tag expressions that were values?
|
||||||
|
It turns out that this is a very cumbersome approach. For one thing, this formalization
|
||||||
|
allows for the case in which the exact same expression is a value for two different
|
||||||
|
reasons. Although `IsValue` has only one constructor (`Val_quote`), it's actually
|
||||||
|
{{< sidenote "right" "hott-note" "not provable" >}}
|
||||||
|
Interestingly, it's also not provable that any two proofs of \(a = b\) are equal,
|
||||||
|
even though equality only has one constructor, \(\text{eq_refl}\ :\ a \rightarrow (a = a) \).
|
||||||
|
Under the <a href="https://homotopytypetheory.org/book/">homotopic interpretation</a>
|
||||||
|
of type theory, this corresponds to the fact that two paths from \(a\) to \(b\) need
|
||||||
|
not be homotopic (continuously deformable) to each other.<br>
|
||||||
|
<br>
|
||||||
|
As an intuitive example, imagine wrapping a string around a pole. Holding the ends of
|
||||||
|
the string in place, there's nothing you can do to "unwrap" the string. This string
|
||||||
|
is thus not deformable into a string that starts and stops at the same points,
|
||||||
|
but does not go around the pole.
|
||||||
|
{{< /sidenote >}}
|
||||||
|
that any two proofs of `IsValue e` are equal. I ended up getting into a lot of losing
|
||||||
|
arguments with the Coq runtime about whether or not two stacks are actually the same.
|
||||||
|
Also, some of the semantic rules expected a value on the stack with a particular proof
|
||||||
|
for `IsValue`, and rejected the exact same stack with a generic value.
|
||||||
|
|
||||||
|
Thus, I switched from our old implementation:
|
||||||
|
|
||||||
|
{{< codelines "Coq" "dawn/Dawn.v" 19 22 >}}
|
||||||
|
|
||||||
|
To the one I originally had in mind:
|
||||||
|
|
||||||
|
{{< codelines "Coq" "dawn/DawnV2.v" 19 19 >}}
|
||||||
|
|
||||||
|
I then had the following function to convert a value back into an equivalent expression:
|
||||||
|
|
||||||
|
{{< codelines "Coq" "dawn/DawnV2.v" 22 25 >}}
|
||||||
|
|
||||||
|
I replaced instances of `projT1` with instances of `value_to_expr`.
|
||||||
|
|
||||||
|
### Where We Are
|
||||||
|
At the end of my previous article, we ended up with a Coq encoding of the big-step
|
||||||
|
[operational semantics](https://en.wikipedia.org/wiki/Operational_semantics)
|
||||||
|
of UCC, as well as some proofs of correctness about the derived forms from
|
||||||
|
the article (like \\(\\text{quote}_3\\) and \\(\\text{rotate}_3\\)). The trouble
|
||||||
|
is, despite having our operational semantics, we can't make our Coq
|
||||||
|
code run anything. This is for several reasons:
|
||||||
|
|
||||||
|
1. Our definitions only let us to _confirm_ that given some
|
||||||
|
initial stack, a program ends up in some other final stack. We even have a
|
||||||
|
little `Ltac2` tactic to help us automate this kind of proof. However, in an evaluator,
|
||||||
|
the final stack is not known until the program finishes running. We can't
|
||||||
|
confirm the result of evaluation, we need to _find_ it.
|
||||||
|
2. To address the first point, we could try write a function that takes a program
|
||||||
|
and an initial stack, and produces a final stack, as well as a proof that
|
||||||
|
the program would evaluate to this stack under our semantics. However,
|
||||||
|
it's quite easy to write a non-terminating UCC program, whereas functions
|
||||||
|
in Coq _have to terminate!_ We can't write a terminating function to
|
||||||
|
run non-terminating code.
|
||||||
|
|
||||||
|
So, is there anything we can do? No, there isn't. Writing an evaluator in Coq
|
||||||
|
is just not possible. The end, thank you for reading.
|
||||||
|
|
||||||
|
Just kidding -- there's definitely a way to get our code evaluating, but it
|
||||||
|
will look a little bit strange.
|
||||||
|
|
||||||
|
### A Step-by-Step Evaluator
|
||||||
|
The trick is to recognize that program evaluation
|
||||||
|
occurs in steps. There may well be an infinite number of steps, if the program
|
||||||
|
is non-terminating, but there's always a step we can take. That is, unless
|
||||||
|
an invalid instruction is run, like trying to clone from an empty stack, or unless
|
||||||
|
the program finished running. You don't need a non-terminating function to just
|
||||||
|
give you a next step, if one exists. We can write such a function in Coq.
|
||||||
|
|
||||||
|
Here's a new data type that encodes the three situations we mentioned in the
|
||||||
|
previous paragraph. Its constructors (one per case) are as follows:
|
||||||
|
|
||||||
|
1. `err` - there are no possible evaluation steps due to an error.
|
||||||
|
3. `middle e s` - the evaluation took a step; the stack changed to `s`, and the rest of the program is `e`.
|
||||||
|
2. `final s` - there are no possible evaluation steps because the evaluation is complete,
|
||||||
|
leaving a final stack `s`.
|
||||||
|
|
||||||
|
{{< codelines "Coq" "dawn/DawnEval.v" 6 9 >}}
|
||||||
|
|
||||||
|
We can now write a function that tries to execute a single step given an expression.
|
||||||
|
|
||||||
|
{{< codelines "Coq" "dawn/DawnEval.v" 11 27 >}}
|
||||||
|
|
||||||
|
Most intrinsics, by themselves, complete after just one step. For instance, a program
|
||||||
|
consisting solely of \\(\\text{swap}\\) will either fail (if the stack doesn't have enough
|
||||||
|
values), or it will swap the top two values and be done. We list only "correct" cases,
|
||||||
|
and resort to a "catch-all" case on line 26 that returns an error. The one multi-step
|
||||||
|
intrinsic is \\(\\text{apply}\\), which can evaluate an arbitrary expression from the stack.
|
||||||
|
In this case, the "one step" consists of popping the quoted value from the stack; the
|
||||||
|
"remaining program" is precisely the expression that was popped.
|
||||||
|
|
||||||
|
Quoting an expression also always completes in one step (it simply places the quoted
|
||||||
|
expression on the stack). The really interesting case is composition. Expressions
|
||||||
|
are evaluated left-to-right, so we first determine what kind of step the left expression (`e1`)
|
||||||
|
can take. We may need more than one step to finish up with `e1`, so there's a good chance it
|
||||||
|
returns a "rest program" `e1'` and a stack `vs'`. If this happens, to complete evaluation of
|
||||||
|
\\(e_1\\ e_2\\), we need to first finish evaluating \\(e_1'\\), and then evaluate \\(e_2\\).
|
||||||
|
Thus, the "rest of the program" is \\(e_1'\\ e_2\\), or `e_comp e1' e2`. On the other hand,
|
||||||
|
if `e1` finished evaluating, we still need to evaluate `e2`, so our "rest of the program"
|
||||||
|
is \\(e_2\\), or `e2`. If evaluating `e1` led to an error, then so did evaluating `e_comp e1 e2`,
|
||||||
|
and we return `err`.
|
||||||
|
|
||||||
|
### Extracting Code
|
||||||
|
Just knowing a single step is not enough to run the code. We need something that repeatedly
|
||||||
|
tries to take a step, as long as it's possible. However, this part is once again
|
||||||
|
not possible in Coq, as it brings back the possibility of non-termination. So if we can't use
|
||||||
|
Coq, why don't we use another language? Coq's extraction mechanism allows us to do just that.
|
||||||
|
|
||||||
|
I added the following code to the end of my file:
|
||||||
|
|
||||||
|
{{< codelines "Coq" "dawn/DawnEval.v" 219 223 >}}
|
||||||
|
|
||||||
|
Coq happily produces a new file, `UccGen.hs` with a lot of code. It's not exactly the most
|
||||||
|
aesthetic; here's a quick peek:
|
||||||
|
|
||||||
|
```Haskell
|
||||||
|
data Intrinsic =
|
||||||
|
Swap
|
||||||
|
| Clone
|
||||||
|
| Drop
|
||||||
|
| Quote
|
||||||
|
| Compose
|
||||||
|
| Apply
|
||||||
|
|
||||||
|
data Expr =
|
||||||
|
E_int Intrinsic
|
||||||
|
| E_quote Expr
|
||||||
|
| E_comp Expr Expr
|
||||||
|
|
||||||
|
data Value =
|
||||||
|
V_quote Expr
|
||||||
|
|
||||||
|
-- ... more
|
||||||
|
```
|
||||||
|
|
||||||
|
All that's left is to make a new file, `Ucc.hs`. I use a different file so that
|
||||||
|
changes I make aren't overridden by Coq each time I run extraction. In this
|
||||||
|
file, we place the "glue code" that tries running one step after another:
|
||||||
|
|
||||||
|
{{< codelines "Coq" "dawn/Ucc.hs" 46 51 >}}
|
||||||
|
|
||||||
|
Finally, loading up GHCi using `ghci Ucc.hs`, I can run the following commands:
|
||||||
|
|
||||||
|
```
|
||||||
|
ghci> fromList = foldl1 E_comp
|
||||||
|
ghci> test = eval [] $ fromList [true, false, UccGen.or]
|
||||||
|
ghci> :f test
|
||||||
|
test = Just [V_quote (E_comp (E_int Swap) (E_int Drop))]
|
||||||
|
```
|
||||||
|
|
||||||
|
That is, applying `or` to `true` and `false` results a stack with only `true` on top.
|
||||||
|
As expected, and proven by our semantics!
|
||||||
|
|
||||||
|
### Proving Equivalence
|
||||||
|
Okay, so `true false or` evaluates to `true`, much like our semantics claims.
|
||||||
|
However, does our evaluator _always_ match the semantics? So far, we have not
|
||||||
|
claimed or verified that it does. Let's try giving it a shot.
|
||||||
|
|
||||||
|
The first thing we can do is show that if we have a proof that `e` takes
|
||||||
|
initial stack `vs` to final stack `vs'`, then each
|
||||||
|
`eval_step` "makes progress" towards `vs'` (it doesn't simply _return_
|
||||||
|
`vs'`, since it only takes a single step and doesn't always complete
|
||||||
|
the evaluation). We also want to show that if the semanics dictates
|
||||||
|
`e` finishes in stack `vs'`, then `eval_step` will never return an error.
|
||||||
|
Thus, we have two possibilities:
|
||||||
|
|
||||||
|
* `eval_step` returns `final`. In this case, for it to match our semantics,
|
||||||
|
the final stack must be the same as `vs'`. Here's the relevant section
|
||||||
|
from the Coq file:
|
||||||
|
|
||||||
|
{{< codelines "Coq" "dawn/DawnEval.v" 30 30 >}}
|
||||||
|
* `eval_step` returns `middle`. In this case, the "rest of the program" needs
|
||||||
|
to evaluate to `vs'` according to our semantics (otherwise, taking a step
|
||||||
|
has changed the program's final outcome, which should not happen).
|
||||||
|
We need to quantify the new variables (specifically, the "rest of the
|
||||||
|
program", which we'll call `ei`, and the "stack after one step", `vsi`),
|
||||||
|
for which we use Coq's `exists` clause. The whole relevant statement is as
|
||||||
|
follows:
|
||||||
|
|
||||||
|
{{< codelines "Coq" "dawn/DawnEval.v" 31 33 >}}
|
||||||
|
|
||||||
|
The whole theorem claim is as follows:
|
||||||
|
|
||||||
|
{{< codelines "Coq" "dawn/DawnEval.v" 29 33 >}}
|
||||||
|
|
||||||
|
I have the Coq proof script for this (in fact, you can click the link
|
||||||
|
at the top of the code block to view the original source file). However,
|
||||||
|
there's something unsatisfying about this statement. In particular,
|
||||||
|
how do we prove that an entire sequence of steps evaluates
|
||||||
|
to something? We'd have to examine the first step, checking if
|
||||||
|
it's a "final" step or a "middle" step; if it's a "middle" step,
|
||||||
|
we'd have to move on to the "rest of the program" and repeat the process.
|
||||||
|
Each time we'd have to "retrieve" `ei` and `vsi` from `eval_step_correct`,
|
||||||
|
and feed it back to `eval_step`.
|
||||||
|
|
||||||
|
I'll do you one better: how do we even _say_ that an expression "evaluates
|
||||||
|
step-by-step to final stack `vs'`"? For one step, we can say:
|
||||||
|
|
||||||
|
```Coq
|
||||||
|
eval_step vs e = final vs'
|
||||||
|
```
|
||||||
|
|
||||||
|
For two steps, we'd have to assert the existence of an intermediate
|
||||||
|
expression (the "rest of the program" after the first step):
|
||||||
|
|
||||||
|
```Coq
|
||||||
|
exists ei vsi,
|
||||||
|
eval_step vs e = middle ei vsi /\ (* First step to intermediate expression. *)
|
||||||
|
eval_step vsi ei = final vs' (* Second step to final state. *)
|
||||||
|
```
|
||||||
|
|
||||||
|
For three steps:
|
||||||
|
|
||||||
|
```Coq
|
||||||
|
exists ei1 ei2 vsi1 vsi2,
|
||||||
|
eval_step vs e = middle ei1 vsi1 /\ (* First step to intermediate expression. *)
|
||||||
|
eval_step vsi1 ei1 = middle ei2 vsi2 /\ (* Second intermediate step *)
|
||||||
|
eval_step vsi2 ei2 = final vs' (* Second step to final state. *)
|
||||||
|
```
|
||||||
|
|
||||||
|
This is awful! Not only is this a lot of writing, but it also makes various
|
||||||
|
sequences of steps have a different "shape". This way, we can't make
|
||||||
|
proofs about evalutions of an _arbitrary_ number of steps. Not all is lost, though: if we squint
|
||||||
|
a little at the last example (three steps), a pattern starts to emerge.
|
||||||
|
First, let's re-arrange the `exists` qualifiers:
|
||||||
|
|
||||||
|
```Coq
|
||||||
|
exists ei1 vsi1, eval_step vs e = middle ei1 vsi1 /\ (* Cons *)
|
||||||
|
exists ei2 vsi2, eval_step vsi1 ei1 = middle ei2 vsi2 /\ (* Cons *)
|
||||||
|
eval_step vsi2 ei2 = final vs' (* Nil *)
|
||||||
|
```
|
||||||
|
|
||||||
|
If you squint at this, it kind of looks like a list! The "empty"
|
||||||
|
part of a list is the final step, while the "cons" part is a middle step. The
|
||||||
|
analogy holds up for another reason: an "empty" sequence has zero intermediate
|
||||||
|
expressions, while each "cons" introduces a single new intermediate
|
||||||
|
program. Perhaps we can define a new data type that matches this? Indeed
|
||||||
|
we can!
|
||||||
|
|
||||||
|
{{< codelines "Coq" "dawn/DawnEval.v" 64 67 >}}
|
||||||
|
|
||||||
|
The new data type is parameterized by the initial and final stacks, as well
|
||||||
|
as the expression that starts in the former and ends in the latter.
|
||||||
|
Then, consider the following _type_:
|
||||||
|
|
||||||
|
```Coq
|
||||||
|
eval_chain nil (e_comp (e_comp true false) or) (true :: nil)
|
||||||
|
```
|
||||||
|
|
||||||
|
This is the type of sequences (or chains) of steps corresponding to the
|
||||||
|
evaluation of the program \\(\\text{true}\\ \\text{false}\\ \\text{or}\\),
|
||||||
|
starting in an empty stack and evaluating to a stack with only true on top.
|
||||||
|
Thus to say that an expression evaluates to some final stack `vs'`, in
|
||||||
|
_some unknown number of steps_, it's sufficient to write:
|
||||||
|
|
||||||
|
```Coq
|
||||||
|
eval_chain vs e vs'
|
||||||
|
```
|
Loading…
Reference in New Issue
Block a user