Fix calling UCC Dawn

This commit is contained in:
Danila Fedorin 2021-11-21 12:38:19 -08:00
parent 18f493675a
commit e7185ff460

View File

@ -7,16 +7,17 @@ tags: ["Coq", "Dawn"]
The [_Foundations of Dawn_](https://www.dawn-lang.org/posts/foundations-ucc/) article came up The [_Foundations of Dawn_](https://www.dawn-lang.org/posts/foundations-ucc/) article came up
on [Lobsters](https://lobste.rs/s/clatuv/foundations_dawn_untyped_concatenative) recently. on [Lobsters](https://lobste.rs/s/clatuv/foundations_dawn_untyped_concatenative) recently.
In this article, the author of Dawn defines a core calculus for the language, and provides its In this article, the author of Dawn defines a core calculus for the language, and provides its
semantics. The definitions seemed so clean and straightforward that I wanted to try my hand at semantics. The core calculus is called the _untyped concatenative calculus_, or UCC.
The definitions in the semantics seemed so clean and straightforward that I wanted to try my hand at
translating them into machine-checked code. I am most familiar with [Coq](https://coq.inria.fr/), translating them into machine-checked code. I am most familiar with [Coq](https://coq.inria.fr/),
and that's what I reached for when making this attempt. and that's what I reached for when making this attempt.
### Defining the Syntax ### Defining the Syntax
#### Expressions and Intrinsics #### Expressions and Intrinsics
This is mostly the easy part. A Dawn expression is one of three things: This is mostly the easy part. A UCC expression is one of three things:
* An "intrinsic", written \\(i\\), which is akin to a built-in function or command. * An "intrinsic", written \\(i\\), which is akin to a built-in function or command.
* A "quote", written \\([e]\\), which takes a Dawn expression \\(e\\) and moves it onto the stack (Dawn is stack-based). * A "quote", written \\([e]\\), which takes a UCC expression \\(e\\) and moves it onto the stack (UCC is stack-based).
* A composition of several expressions, written \\(e_1\\ e_2\\ \\ldots\\ e_n\\), which effectively evaluates them in order. * A composition of several expressions, written \\(e_1\\ e_2\\ \\ldots\\ e_n\\), which effectively evaluates them in order.
This is straightforward to define in Coq, but I'm going to make a little simplifying change. This is straightforward to define in Coq, but I'm going to make a little simplifying change.
@ -35,7 +36,7 @@ only defined for pairs of numbers, like \(a+b\). However, no one really bats an
write \(1+2+3\), because we can just insert parentheses any way we like, and get the same result: write \(1+2+3\), because we can just insert parentheses any way we like, and get the same result:
\((1+2)+3\) is the same as \(1+(2+3)\). \((1+2)+3\) is the same as \(1+(2+3)\).
{{< /sidenote >}} {{< /sidenote >}}
With that in mind, we can translate each of the three types of expressions in Dawn into cases With that in mind, we can translate each of the three types of expressions in UCC into cases
of an inductive data type in Coq. of an inductive data type in Coq.
{{< codelines "Coq" "dawn/Dawn.v" 12 15 >}} {{< codelines "Coq" "dawn/Dawn.v" 12 15 >}}
@ -78,7 +79,7 @@ However, I didn't decide on this approach for two reasons:
* When formalizing the lambda calculus, * When formalizing the lambda calculus,
[Programming Language Foundations](https://softwarefoundations.cis.upenn.edu/plf-current/Stlc.html) [Programming Language Foundations](https://softwarefoundations.cis.upenn.edu/plf-current/Stlc.html)
uses an inductively-defined property to indicate values. In the simply typed lambda calculus, uses an inductively-defined property to indicate values. In the simply typed lambda calculus,
much like in Dawn, values are a subset of expressions. much like in UCC, values are a subset of expressions.
I took instead the approach from Programming Language Foundations: a value is merely an expression I took instead the approach from Programming Language Foundations: a value is merely an expression
for which some predicate, `IsValue`, holds. We will define this such that `IsValue (e_quote e)` is provable, for which some predicate, `IsValue`, holds. We will define this such that `IsValue (e_quote e)` is provable,
@ -102,13 +103,13 @@ propositions. It's special for a few reasons, but those reasons are beyond the s
for our purposes, it's sufficient to think of `IsValue e` as a type. for our purposes, it's sufficient to think of `IsValue e` as a type.
Alright, so what good is this new `IsValue e` type? Well, we will define `IsValue` such that Alright, so what good is this new `IsValue e` type? Well, we will define `IsValue` such that
this type is only _inhabited_ if `e` is a value according to the Dawn specification. A type this type is only _inhabited_ if `e` is a value according to the UCC specification. A type
is inhabited if and only if we can find a value of that type. For instance, the type of natural is inhabited if and only if we can find a value of that type. For instance, the type of natural
numbers, `nat`, is inhabited, because any number, like `0`, has this type. Uninhabited types numbers, `nat`, is inhabited, because any number, like `0`, has this type. Uninhabited types
are harder to come by, but take as an example the type `3 = 4`, the type of proofs that three is equal are harder to come by, but take as an example the type `3 = 4`, the type of proofs that three is equal
to four. Three is _not_ equal to four, so we can never find a proof of equality, and thus, `3 = 4` is to four. Three is _not_ equal to four, so we can never find a proof of equality, and thus, `3 = 4` is
uninhabited. As I said, `IsValue e` will only be inhabited if `e` is a value per the formal uninhabited. As I said, `IsValue e` will only be inhabited if `e` is a value per the formal
specification of Dawn; specifically, this means that `e` is a quoted expression, like `e_quote e'`. specification of UCC; specifically, this means that `e` is a quoted expression, like `e_quote e'`.
To this end, we define `IsValue` as follows: To this end, we define `IsValue` as follows:
@ -122,7 +123,7 @@ this constructor creates a value of type `IsValue (e_quote e)`. Two things are t
* Because `Val_quote` is the _only_ constructor, and because it always returns `IsValue (e_quote e)`, * Because `Val_quote` is the _only_ constructor, and because it always returns `IsValue (e_quote e)`,
there's no way to get `IsValue (e_int i)`, or anything else. there's no way to get `IsValue (e_int i)`, or anything else.
Thus, `IsValue e` is inhabited if and only if `e` is a Dawn value, as we intended. Thus, `IsValue e` is inhabited if and only if `e` is a UCC value, as we intended.
Just one more thing. A value is just an expression, but Coq only knows about this as long Just one more thing. A value is just an expression, but Coq only knows about this as long
as there's an `IsValue` instance around to vouch for it. To be able to reason about values, then, as there's an `IsValue` instance around to vouch for it. To be able to reason about values, then,
@ -145,7 +146,7 @@ this is far from the only type of predicate. Here are some examples:
* The mathematical "less than" relation is also a binary predicate, and it's called `le` in Coq. * The mathematical "less than" relation is also a binary predicate, and it's called `le` in Coq.
It takes two numbers `n` and `m` and returns a type `le n m` that is only inhabited if `n` is less It takes two numbers `n` and `m` and returns a type `le n m` that is only inhabited if `n` is less
than or equal to `m`. than or equal to `m`.
* The evaluation relation in Dawn is a ternary predicate. It takes two stacks, `vs` and `vs'`, * The evaluation relation in UCC is a ternary predicate. It takes two stacks, `vs` and `vs'`,
and an expression, `e`, and creates a type that's inhabited if and only if evaluating and an expression, `e`, and creates a type that's inhabited if and only if evaluating
`e` starting at a stack `vs` results in the stack `vs'`. `e` starting at a stack `vs` results in the stack `vs'`.
@ -156,13 +157,13 @@ to say about the type of `eq`:
eq : ?A -> ?A -> Prop eq : ?A -> ?A -> Prop
``` ```
By a similar logic, ternary predicates, much like Dawn's evaluation relation, are functions By a similar logic, ternary predicates, much like UCC's evaluation relation, are functions
of three inputs. We can thus write the type of our evaluation relation as follows: of three inputs. We can thus write the type of our evaluation relation as follows:
{{< codelines "Coq" "dawn/Dawn.v" 35 35 >}} {{< codelines "Coq" "dawn/Dawn.v" 35 35 >}}
We define the constructors just like we did in our `IsValue` predicate. For each evaluation We define the constructors just like we did in our `IsValue` predicate. For each evaluation
rule in Dawn, such as: rule in UCC, such as:
{{< latex >}} {{< latex >}}
\langle V, v, v'\rangle\ \text{swap}\ \rightarrow\ \langle V, v', v \rangle \langle V, v, v'\rangle\ \text{swap}\ \rightarrow\ \langle V, v', v \rangle
@ -214,10 +215,10 @@ Here, we may as well go through the three constructors to explain what they mean
at stack `vs1` and ending in stack `vs3`. at stack `vs1` and ending in stack `vs3`.
### \\(\\text{true}\\), \\(\\text{false}\\), \\(\\text{or}\\) and Proofs ### \\(\\text{true}\\), \\(\\text{false}\\), \\(\\text{or}\\) and Proofs
Now it's time for some fun! The Dawn language specification starts by defining two values: Now it's time for some fun! The UCC language specification starts by defining two values:
true and false. Why don't we do the same thing? true and false. Why don't we do the same thing?
|Dawn Spec| Coq encoding | |UCC Spec| Coq encoding |
|---|----| |---|----|
|\\(\\text{false}\\)=\\([\\text{drop}]\\)| {{< codelines "Coq" "dawn/Dawn.v" 41 42 >}} |\\(\\text{false}\\)=\\([\\text{drop}]\\)| {{< codelines "Coq" "dawn/Dawn.v" 41 42 >}}
|\\(\\text{true}\\)=\\([\\text{swap} \\ \\text{drop}]\\)| {{< codelines "Coq" "dawn/Dawn.v" 44 45 >}} |\\(\\text{true}\\)=\\([\\text{swap} \\ \\text{drop}]\\)| {{< codelines "Coq" "dawn/Dawn.v" 44 45 >}}
@ -255,7 +256,7 @@ element, as specified. The proof for \\(\\text{true}\\) is very similar in spiri
We can also formalize the \\(\\text{or}\\) operator: We can also formalize the \\(\\text{or}\\) operator:
|Dawn Spec| Coq encoding | |UCC Spec| Coq encoding |
|---|----| |---|----|
|\\(\\text{or}\\)=\\(\\text{clone}\\ \\text{apply}\\)| {{< codelines "Coq" "dawn/Dawn.v" 65 65 >}} |\\(\\text{or}\\)=\\(\\text{clone}\\ \\text{apply}\\)| {{< codelines "Coq" "dawn/Dawn.v" 65 65 >}}
@ -283,9 +284,9 @@ can be expressed using our two new proofs, `or_false_v` and `or_true`.
### Derived Expressions ### Derived Expressions
#### Quotes #### Quotes
The Dawn specification defines \\(\\text{quote}_n\\) to make it more convenient to quote The UCC specification defines \\(\\text{quote}_n\\) to make it more convenient to quote
multiple terms. For example, \\(\\text{quote}_2\\) composes and quotes the first two values multiple terms. For example, \\(\\text{quote}_2\\) composes and quotes the first two values
on the stack. This is defined in terms of other Dawn expressions as follows: on the stack. This is defined in terms of other UCC expressions as follows:
{{< latex >}} {{< latex >}}
\text{quote}_n = \text{quote}_{n-1}\ \text{swap}\ \text{quote}\ \text{swap}\ \text{compose} \text{quote}_n = \text{quote}_{n-1}\ \text{swap}\ \text{quote}\ \text{swap}\ \text{compose}
@ -295,8 +296,8 @@ We can write this in Coq as follows:
{{< codelines "Coq" "dawn/Dawn.v" 90 94 >}} {{< codelines "Coq" "dawn/Dawn.v" 90 94 >}}
This definition diverges slightly from the one given in the Dawn specification; particularly, This definition diverges slightly from the one given in the UCC specification; particularly,
Dawn's spec mentions that \\(\\text{quote}_n\\) is only defined for \\(n \\geq 1\\).However, UCC's spec mentions that \\(\\text{quote}_n\\) is only defined for \\(n \\geq 1\\).However,
this means that in our code, we'd have to somehow handle the error that would arise if the this means that in our code, we'd have to somehow handle the error that would arise if the
term \\(\\text{quote}\_0\\) is used. Instead, I defined `quote_n n` to simply mean term \\(\\text{quote}\_0\\) is used. Instead, I defined `quote_n n` to simply mean
\\(\\text{quote}\_{n+1}\\); thus, in Coq, no matter what `n` we use, we will have a valid \\(\\text{quote}\_{n+1}\\); thus, in Coq, no matter what `n` we use, we will have a valid
@ -361,12 +362,12 @@ ways of writing the composition, if they evaluate to anything, evaluate to the s
{{< codelines "Coq" "dawn/Dawn.v" 170 171 >}} {{< codelines "Coq" "dawn/Dawn.v" 170 171 >}}
### Conclusion ### Conclusion
That's all I've got in me for today. However, we got pretty far! The Dawn specification That's all I've got in me for today. However, we got pretty far! The UCC specification
says: says:
> One of my long term goals for Dawn is to democratize formal software verification in order to make it much more feasible and realistic to write perfect software. > One of my long term goals for UCC is to democratize formal software verification in order to make it much more feasible and realistic to write perfect software.
I think that Dawn is definitely getting there: formally defining the semantics outlined I think that UCC is definitely getting there: formally defining the semantics outlined
on the page was quite straightforward. We can now have complete confidence in the behavior on the page was quite straightforward. We can now have complete confidence in the behavior
of \\(\\text{true}\\), \\(\\text{false}\\), \\(\\text{or}\\), \\(\\text{quote}_n\\) and of \\(\\text{true}\\), \\(\\text{false}\\), \\(\\text{or}\\), \\(\\text{quote}_n\\) and
\\(\\text{rotate}_n\\). The proof of associativity is also enough to possibly argue for simplifying \\(\\text{rotate}_n\\). The proof of associativity is also enough to possibly argue for simplifying