2021-11-28 01:09:26 -08:00
|
|
|
---
|
|
|
|
title: "A Verified Evaluator for the Untyped Concatenative Calculus"
|
|
|
|
date: 2021-11-27T20:24:57-08:00
|
|
|
|
draft: true
|
|
|
|
tags: ["Dawn", "Coq", "Programming Languages"]
|
|
|
|
---
|
|
|
|
|
|
|
|
Earlier, I wrote [an article]({{< relref "./coq_dawn" >}}) in which I used Coq to
|
|
|
|
encode the formal semantics of [Dawn's Untyped Concatenative Calculus](https://www.dawn-lang.org/posts/foundations-ucc/),
|
|
|
|
and to prove a few thing about the mini-language. Though I'm quite happy with how that turned out,
|
|
|
|
my article was missing something that's present on the original Dawn page -- an evaluator. In this article, I'll define
|
|
|
|
an evaluator function in Coq, prove that it's equivalent to Dawn's formal semantics,
|
|
|
|
and extract all of this into usable Haskell code.
|
|
|
|
|
|
|
|
### Changes Since Last Time
|
|
|
|
In trying to write and verify this evaluator, I decided to make changes to the way I formalized
|
|
|
|
the UCC. Remember how we used a predicate, `IsValue`, to tag expressions that were values?
|
|
|
|
It turns out that this is a very cumbersome approach. For one thing, this formalization
|
|
|
|
allows for the case in which the exact same expression is a value for two different
|
|
|
|
reasons. Although `IsValue` has only one constructor (`Val_quote`), it's actually
|
|
|
|
{{< sidenote "right" "hott-note" "not provable" >}}
|
|
|
|
Interestingly, it's also not provable that any two proofs of \(a = b\) are equal,
|
|
|
|
even though equality only has one constructor, \(\text{eq_refl}\ :\ a \rightarrow (a = a) \).
|
|
|
|
Under the <a href="https://homotopytypetheory.org/book/">homotopic interpretation</a>
|
|
|
|
of type theory, this corresponds to the fact that two paths from \(a\) to \(b\) need
|
|
|
|
not be homotopic (continuously deformable) to each other.<br>
|
|
|
|
<br>
|
|
|
|
As an intuitive example, imagine wrapping a string around a pole. Holding the ends of
|
|
|
|
the string in place, there's nothing you can do to "unwrap" the string. This string
|
|
|
|
is thus not deformable into a string that starts and stops at the same points,
|
|
|
|
but does not go around the pole.
|
|
|
|
{{< /sidenote >}}
|
|
|
|
that any two proofs of `IsValue e` are equal. I ended up getting into a lot of losing
|
|
|
|
arguments with the Coq runtime about whether or not two stacks are actually the same.
|
|
|
|
Also, some of the semantic rules expected a value on the stack with a particular proof
|
|
|
|
for `IsValue`, and rejected the exact same stack with a generic value.
|
|
|
|
|
|
|
|
Thus, I switched from our old implementation:
|
|
|
|
|
|
|
|
{{< codelines "Coq" "dawn/Dawn.v" 19 22 >}}
|
|
|
|
|
|
|
|
To the one I originally had in mind:
|
|
|
|
|
|
|
|
{{< codelines "Coq" "dawn/DawnV2.v" 19 19 >}}
|
|
|
|
|
|
|
|
I then had the following function to convert a value back into an equivalent expression:
|
|
|
|
|
|
|
|
{{< codelines "Coq" "dawn/DawnV2.v" 22 25 >}}
|
|
|
|
|
|
|
|
I replaced instances of `projT1` with instances of `value_to_expr`.
|
|
|
|
|
|
|
|
### Where We Are
|
|
|
|
At the end of my previous article, we ended up with a Coq encoding of the big-step
|
|
|
|
[operational semantics](https://en.wikipedia.org/wiki/Operational_semantics)
|
|
|
|
of UCC, as well as some proofs of correctness about the derived forms from
|
|
|
|
the article (like \\(\\text{quote}_3\\) and \\(\\text{rotate}_3\\)). The trouble
|
|
|
|
is, despite having our operational semantics, we can't make our Coq
|
|
|
|
code run anything. This is for several reasons:
|
|
|
|
|
|
|
|
1. Our definitions only let us to _confirm_ that given some
|
|
|
|
initial stack, a program ends up in some other final stack. We even have a
|
|
|
|
little `Ltac2` tactic to help us automate this kind of proof. However, in an evaluator,
|
|
|
|
the final stack is not known until the program finishes running. We can't
|
|
|
|
confirm the result of evaluation, we need to _find_ it.
|
|
|
|
2. To address the first point, we could try write a function that takes a program
|
|
|
|
and an initial stack, and produces a final stack, as well as a proof that
|
|
|
|
the program would evaluate to this stack under our semantics. However,
|
|
|
|
it's quite easy to write a non-terminating UCC program, whereas functions
|
|
|
|
in Coq _have to terminate!_ We can't write a terminating function to
|
|
|
|
run non-terminating code.
|
|
|
|
|
|
|
|
So, is there anything we can do? No, there isn't. Writing an evaluator in Coq
|
|
|
|
is just not possible. The end, thank you for reading.
|
|
|
|
|
|
|
|
Just kidding -- there's definitely a way to get our code evaluating, but it
|
|
|
|
will look a little bit strange.
|
|
|
|
|
|
|
|
### A Step-by-Step Evaluator
|
|
|
|
The trick is to recognize that program evaluation
|
|
|
|
occurs in steps. There may well be an infinite number of steps, if the program
|
|
|
|
is non-terminating, but there's always a step we can take. That is, unless
|
|
|
|
an invalid instruction is run, like trying to clone from an empty stack, or unless
|
|
|
|
the program finished running. You don't need a non-terminating function to just
|
|
|
|
give you a next step, if one exists. We can write such a function in Coq.
|
|
|
|
|
|
|
|
Here's a new data type that encodes the three situations we mentioned in the
|
|
|
|
previous paragraph. Its constructors (one per case) are as follows:
|
|
|
|
|
|
|
|
1. `err` - there are no possible evaluation steps due to an error.
|
|
|
|
3. `middle e s` - the evaluation took a step; the stack changed to `s`, and the rest of the program is `e`.
|
|
|
|
2. `final s` - there are no possible evaluation steps because the evaluation is complete,
|
|
|
|
leaving a final stack `s`.
|
|
|
|
|
|
|
|
{{< codelines "Coq" "dawn/DawnEval.v" 6 9 >}}
|
|
|
|
|
|
|
|
We can now write a function that tries to execute a single step given an expression.
|
|
|
|
|
|
|
|
{{< codelines "Coq" "dawn/DawnEval.v" 11 27 >}}
|
|
|
|
|
|
|
|
Most intrinsics, by themselves, complete after just one step. For instance, a program
|
|
|
|
consisting solely of \\(\\text{swap}\\) will either fail (if the stack doesn't have enough
|
|
|
|
values), or it will swap the top two values and be done. We list only "correct" cases,
|
|
|
|
and resort to a "catch-all" case on line 26 that returns an error. The one multi-step
|
|
|
|
intrinsic is \\(\\text{apply}\\), which can evaluate an arbitrary expression from the stack.
|
|
|
|
In this case, the "one step" consists of popping the quoted value from the stack; the
|
|
|
|
"remaining program" is precisely the expression that was popped.
|
|
|
|
|
|
|
|
Quoting an expression also always completes in one step (it simply places the quoted
|
|
|
|
expression on the stack). The really interesting case is composition. Expressions
|
|
|
|
are evaluated left-to-right, so we first determine what kind of step the left expression (`e1`)
|
|
|
|
can take. We may need more than one step to finish up with `e1`, so there's a good chance it
|
|
|
|
returns a "rest program" `e1'` and a stack `vs'`. If this happens, to complete evaluation of
|
|
|
|
\\(e_1\\ e_2\\), we need to first finish evaluating \\(e_1'\\), and then evaluate \\(e_2\\).
|
|
|
|
Thus, the "rest of the program" is \\(e_1'\\ e_2\\), or `e_comp e1' e2`. On the other hand,
|
|
|
|
if `e1` finished evaluating, we still need to evaluate `e2`, so our "rest of the program"
|
|
|
|
is \\(e_2\\), or `e2`. If evaluating `e1` led to an error, then so did evaluating `e_comp e1 e2`,
|
|
|
|
and we return `err`.
|
|
|
|
|
|
|
|
### Extracting Code
|
|
|
|
Just knowing a single step is not enough to run the code. We need something that repeatedly
|
|
|
|
tries to take a step, as long as it's possible. However, this part is once again
|
|
|
|
not possible in Coq, as it brings back the possibility of non-termination. So if we can't use
|
|
|
|
Coq, why don't we use another language? Coq's extraction mechanism allows us to do just that.
|
|
|
|
|
|
|
|
I added the following code to the end of my file:
|
|
|
|
|
|
|
|
{{< codelines "Coq" "dawn/DawnEval.v" 219 223 >}}
|
|
|
|
|
|
|
|
Coq happily produces a new file, `UccGen.hs` with a lot of code. It's not exactly the most
|
|
|
|
aesthetic; here's a quick peek:
|
|
|
|
|
|
|
|
```Haskell
|
|
|
|
data Intrinsic =
|
|
|
|
Swap
|
|
|
|
| Clone
|
|
|
|
| Drop
|
|
|
|
| Quote
|
|
|
|
| Compose
|
|
|
|
| Apply
|
|
|
|
|
|
|
|
data Expr =
|
|
|
|
E_int Intrinsic
|
|
|
|
| E_quote Expr
|
|
|
|
| E_comp Expr Expr
|
|
|
|
|
|
|
|
data Value =
|
|
|
|
V_quote Expr
|
|
|
|
|
|
|
|
-- ... more
|
|
|
|
```
|
|
|
|
|
|
|
|
All that's left is to make a new file, `Ucc.hs`. I use a different file so that
|
|
|
|
changes I make aren't overridden by Coq each time I run extraction. In this
|
|
|
|
file, we place the "glue code" that tries running one step after another:
|
|
|
|
|
|
|
|
{{< codelines "Coq" "dawn/Ucc.hs" 46 51 >}}
|
|
|
|
|
|
|
|
Finally, loading up GHCi using `ghci Ucc.hs`, I can run the following commands:
|
|
|
|
|
|
|
|
```
|
|
|
|
ghci> fromList = foldl1 E_comp
|
|
|
|
ghci> test = eval [] $ fromList [true, false, UccGen.or]
|
|
|
|
ghci> :f test
|
|
|
|
test = Just [V_quote (E_comp (E_int Swap) (E_int Drop))]
|
|
|
|
```
|
|
|
|
|
|
|
|
That is, applying `or` to `true` and `false` results a stack with only `true` on top.
|
|
|
|
As expected, and proven by our semantics!
|
|
|
|
|
|
|
|
### Proving Equivalence
|
|
|
|
Okay, so `true false or` evaluates to `true`, much like our semantics claims.
|
|
|
|
However, does our evaluator _always_ match the semantics? So far, we have not
|
|
|
|
claimed or verified that it does. Let's try giving it a shot.
|
|
|
|
|
|
|
|
The first thing we can do is show that if we have a proof that `e` takes
|
|
|
|
initial stack `vs` to final stack `vs'`, then each
|
|
|
|
`eval_step` "makes progress" towards `vs'` (it doesn't simply _return_
|
|
|
|
`vs'`, since it only takes a single step and doesn't always complete
|
|
|
|
the evaluation). We also want to show that if the semanics dictates
|
|
|
|
`e` finishes in stack `vs'`, then `eval_step` will never return an error.
|
|
|
|
Thus, we have two possibilities:
|
|
|
|
|
|
|
|
* `eval_step` returns `final`. In this case, for it to match our semantics,
|
|
|
|
the final stack must be the same as `vs'`. Here's the relevant section
|
|
|
|
from the Coq file:
|
|
|
|
|
|
|
|
{{< codelines "Coq" "dawn/DawnEval.v" 30 30 >}}
|
|
|
|
* `eval_step` returns `middle`. In this case, the "rest of the program" needs
|
|
|
|
to evaluate to `vs'` according to our semantics (otherwise, taking a step
|
|
|
|
has changed the program's final outcome, which should not happen).
|
|
|
|
We need to quantify the new variables (specifically, the "rest of the
|
|
|
|
program", which we'll call `ei`, and the "stack after one step", `vsi`),
|
|
|
|
for which we use Coq's `exists` clause. The whole relevant statement is as
|
|
|
|
follows:
|
|
|
|
|
|
|
|
{{< codelines "Coq" "dawn/DawnEval.v" 31 33 >}}
|
|
|
|
|
|
|
|
The whole theorem claim is as follows:
|
|
|
|
|
|
|
|
{{< codelines "Coq" "dawn/DawnEval.v" 29 33 >}}
|
|
|
|
|
|
|
|
I have the Coq proof script for this (in fact, you can click the link
|
|
|
|
at the top of the code block to view the original source file). However,
|
|
|
|
there's something unsatisfying about this statement. In particular,
|
|
|
|
how do we prove that an entire sequence of steps evaluates
|
|
|
|
to something? We'd have to examine the first step, checking if
|
|
|
|
it's a "final" step or a "middle" step; if it's a "middle" step,
|
|
|
|
we'd have to move on to the "rest of the program" and repeat the process.
|
|
|
|
Each time we'd have to "retrieve" `ei` and `vsi` from `eval_step_correct`,
|
|
|
|
and feed it back to `eval_step`.
|
|
|
|
|
|
|
|
I'll do you one better: how do we even _say_ that an expression "evaluates
|
|
|
|
step-by-step to final stack `vs'`"? For one step, we can say:
|
|
|
|
|
|
|
|
```Coq
|
|
|
|
eval_step vs e = final vs'
|
|
|
|
```
|
|
|
|
|
|
|
|
For two steps, we'd have to assert the existence of an intermediate
|
|
|
|
expression (the "rest of the program" after the first step):
|
|
|
|
|
|
|
|
```Coq
|
|
|
|
exists ei vsi,
|
|
|
|
eval_step vs e = middle ei vsi /\ (* First step to intermediate expression. *)
|
|
|
|
eval_step vsi ei = final vs' (* Second step to final state. *)
|
|
|
|
```
|
|
|
|
|
|
|
|
For three steps:
|
|
|
|
|
|
|
|
```Coq
|
|
|
|
exists ei1 ei2 vsi1 vsi2,
|
|
|
|
eval_step vs e = middle ei1 vsi1 /\ (* First step to intermediate expression. *)
|
|
|
|
eval_step vsi1 ei1 = middle ei2 vsi2 /\ (* Second intermediate step *)
|
|
|
|
eval_step vsi2 ei2 = final vs' (* Second step to final state. *)
|
|
|
|
```
|
|
|
|
|
|
|
|
This is awful! Not only is this a lot of writing, but it also makes various
|
|
|
|
sequences of steps have a different "shape". This way, we can't make
|
|
|
|
proofs about evalutions of an _arbitrary_ number of steps. Not all is lost, though: if we squint
|
|
|
|
a little at the last example (three steps), a pattern starts to emerge.
|
|
|
|
First, let's re-arrange the `exists` qualifiers:
|
|
|
|
|
|
|
|
```Coq
|
|
|
|
exists ei1 vsi1, eval_step vs e = middle ei1 vsi1 /\ (* Cons *)
|
|
|
|
exists ei2 vsi2, eval_step vsi1 ei1 = middle ei2 vsi2 /\ (* Cons *)
|
|
|
|
eval_step vsi2 ei2 = final vs' (* Nil *)
|
|
|
|
```
|
|
|
|
|
|
|
|
If you squint at this, it kind of looks like a list! The "empty"
|
|
|
|
part of a list is the final step, while the "cons" part is a middle step. The
|
|
|
|
analogy holds up for another reason: an "empty" sequence has zero intermediate
|
|
|
|
expressions, while each "cons" introduces a single new intermediate
|
|
|
|
program. Perhaps we can define a new data type that matches this? Indeed
|
|
|
|
we can!
|
|
|
|
|
|
|
|
{{< codelines "Coq" "dawn/DawnEval.v" 64 67 >}}
|
|
|
|
|
|
|
|
The new data type is parameterized by the initial and final stacks, as well
|
|
|
|
as the expression that starts in the former and ends in the latter.
|
|
|
|
Then, consider the following _type_:
|
|
|
|
|
|
|
|
```Coq
|
|
|
|
eval_chain nil (e_comp (e_comp true false) or) (true :: nil)
|
|
|
|
```
|
|
|
|
|
|
|
|
This is the type of sequences (or chains) of steps corresponding to the
|
|
|
|
evaluation of the program \\(\\text{true}\\ \\text{false}\\ \\text{or}\\),
|
2021-11-28 01:48:01 -08:00
|
|
|
starting in an empty stack and evaluating to a stack with only
|
|
|
|
\\(\\text{true}\\) on top. Thus to say that an expression evaluates to some
|
|
|
|
final stack `vs'`, in _some unknown number of steps_, it's sufficient to write:
|
2021-11-28 01:09:26 -08:00
|
|
|
|
|
|
|
```Coq
|
|
|
|
eval_chain vs e vs'
|
|
|
|
```
|
2021-11-28 01:48:01 -08:00
|
|
|
|
|
|
|
Evaluation chains have a couple of interesting properties. First and foremost,
|
|
|
|
they can be "concatenated". This is analagous to the `Sem_e_comp` rule
|
|
|
|
in our original semantics: if an expression `e1` starts in stack `vs`
|
|
|
|
and finishes in stack `vs'`, and another expression starts in stack `vs'`
|
|
|
|
and finishes in stack `vs''`, then we can compose these two expressions,
|
|
|
|
and the result will start in `vs` and finish in `vs''`.
|
|
|
|
|
|
|
|
{{< codelines "Coq" "dawn/DawnEval.v" 69 75 >}}
|
|
|
|
|
|
|
|
The proof is very short. We go
|
|
|
|
{{< sidenote "right" "concat-note" "by induction on the left evaluation" >}}
|
|
|
|
It's not a coincidence that defining something like <code>(++)</code>
|
|
|
|
(list concatenation) in Haskell typically starts by pattern matching
|
|
|
|
on the <em>left</em> list. In fact, proofs by <code>induction</code>
|
|
|
|
actually correspond to recursive functions! It's tough putting code blocks
|
|
|
|
in sidenotes, but if you're playing with the
|
|
|
|
<a href="https://dev.danilafe.com/Web-Projects/blog-static/src/branch/master/code/dawn/DawnEval.v"><code>DawnEval.v</code></a> file, try running
|
|
|
|
<code>Print eval_chain_ind</code>, and note the presence of <code>fix</code>,
|
|
|
|
the <a href="https://en.wikipedia.org/wiki/Fixed-point_combinator">fixed point
|
|
|
|
combinator</a> used to implement recursion.
|
|
|
|
{{< /sidenote >}}
|
|
|
|
chain (the one for `e1`). Coq takes care of most of the rest with `auto`.
|
|
|
|
The key to this proof is that whatever `P` is cotained within a "node"
|
|
|
|
in the left chain determines how `eval_step (e_comp e1 e2)` behaves. Whether
|
|
|
|
`e1` evaluates to `final` or `middle`, the composition evaluates to `middle`
|
|
|
|
(a composition of two expressions cannot be done in one step), so we always
|
|
|
|
{{< sidenote "left" "cons-note" "create a new \"cons\" node." >}}
|
|
|
|
This is <em>unlike</em> list concatenation, since we typically don't
|
|
|
|
create new nodes when appending to an empty list.
|
|
|
|
{{< /sidenote >}} via `chain_middle`. Then, the two cases are as follows:
|
|
|
|
|
|
|
|
* If the step was `final`, then
|
|
|
|
the rest of the steps use only `e2`, and good news, we already have a chain
|
|
|
|
for that!
|
|
|
|
* Otherwise, the step was `middle`, an we stll have a chain for some
|
|
|
|
intermediate program `ei` that starts in some stack `vsi` and ends in `vs'`.
|
|
|
|
By induction, we know that _this_ chain can be concatenated with the one
|
|
|
|
for `e2`, resulting in a chain for `e_comp ei e2`. All that remains is to
|
|
|
|
attach to this sub-chain the one step from `(vs, e1)` to `(vsi, ei)`, which
|
|
|
|
is handled by `chain_middle`.
|
|
|
|
|
|
|
|
{{< todo >}}A drawing would be nice here!{{< /todo >}}
|
|
|
|
|
|
|
|
The `merge` operation is reversible; chains can be split into two pieces,
|
|
|
|
one for each composed expression:
|
|
|
|
|
|
|
|
{{< codelines "Coq" "dawn/DawnEval.v" 77 78 >}}
|
|
|
|
|
|
|
|
While interesting, this particular fact isn't used anywhere else in this
|
|
|
|
post, and I will omit the proof here.
|