Write the semantics section using Bergamot
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
This commit is contained in:
@@ -5,6 +5,17 @@ description: "In this post, I define the language that well serve as the object
|
||||
date: 2024-08-10T17:37:43-07:00
|
||||
tags: ["Agda", "Programming Languages"]
|
||||
draft: true
|
||||
custom_js: ["parser.js"]
|
||||
bergamot:
|
||||
render_presets:
|
||||
default: "bergamot/rendering/imp.bergamot"
|
||||
input_modes:
|
||||
- name: "Expression"
|
||||
fn: "window.parseExpr"
|
||||
- name: "Basic Statement"
|
||||
fn: "window.parseBasicStmt"
|
||||
- name: "Statement"
|
||||
fn: "window.parseStmt"
|
||||
---
|
||||
|
||||
In the previous several posts, I've formalized the notion of lattices, which
|
||||
@@ -197,3 +208,246 @@ The Agda version is:
|
||||
|
||||
Notice how we used `noop` to express the fact that the `else` branch of the
|
||||
conditional does nothing.
|
||||
|
||||
### The Semantics of Our Language
|
||||
|
||||
We now have all the language constructs that I'll be showing off --- because
|
||||
those are all the concepts that I've formalized. What's left is to define
|
||||
how they behave. We will do this using a logical tool called
|
||||
[_inference rules_](https://en.wikipedia.org/wiki/Rule_of_inference). I've
|
||||
written about them a number of times; they're ubiquitous, particularly in the
|
||||
sorts of things I like explore on this site. The [section on inference rules]({{< relref "01_aoc_coq#inference-rules" >}})
|
||||
from my Advent of Code series is pretty relevant, and [the notation section from
|
||||
a post in my compiler series]({{< relref "03_compiler_typechecking#some-notation" >}}) says
|
||||
much the same thing; I won't be re-describing them here.
|
||||
|
||||
There are three pieces which demand semantics: expressions, simple statements,
|
||||
and non-simple statements. The semantics of each of the three requires the
|
||||
semantics of the items that precede it. We will therefore start with expressions.
|
||||
|
||||
#### Expressions
|
||||
|
||||
The trickiest thing about expression is that the value of an expression depends
|
||||
on the "context": `x+1` can evaluate to `43` if `x` is `42`, or it can evaluate
|
||||
to `0` if `x` is `-1`. To evaluate an expression, we will therefore need to
|
||||
assign values to all of the variables in that expression. A mapping that
|
||||
assigns values to variables is typically called an _environment_. We will write
|
||||
\(\varnothing\) for "empty environment", and \(\{\texttt{x} \mapsto 42, \texttt{y} \mapsto -1 \}\) for
|
||||
an environment that maps the variable \(\texttt{x}\) to 42, and the variable \(\texttt{y}\) to -1.
|
||||
|
||||
Now, a bit of notation. We will use the letter \(\rho\) to represent environments
|
||||
(and if several environments are involved, we will occasionally number them
|
||||
as \(\rho_1\), \(\rho_2\), etc.) We will use the letter \(e\) to stand for
|
||||
expressions, and the letter \(v\) to stand for values. Finally, we'll write
|
||||
\(\rho, e \Downarrow v\) to say that "in an environment \(\rho\), expression \(e\)
|
||||
evaluates to value \(v\)". Our two previous examples of evaluating `x+1` can
|
||||
thus be written as follows:
|
||||
|
||||
{{< latex >}}
|
||||
\{ \texttt{x} \mapsto 42 \}, \texttt{x}+1 \Downarrow 43 \\
|
||||
\{ \texttt{x} \mapsto -1 \}, \texttt{x}+1 \Downarrow 0 \\
|
||||
{{< /latex >}}
|
||||
|
||||
Now, on to the actual rules for how to evaluate expressions. Most simply,
|
||||
integer literals `1` just evaluate to themselves.
|
||||
|
||||
{{< latex >}}
|
||||
\frac{n \in \text{Int}}{\rho, n \Downarrow n}
|
||||
{{< /latex >}}
|
||||
|
||||
Note that the letter \(\rho\) is completely unused in the above rule. That's
|
||||
because no matter what values _variables_ have, a number still evaluates to
|
||||
the same value. As we've already established, the same is not true for a
|
||||
variable like \(\texttt{x}\). To evaluate such a variable, we need to retrieve
|
||||
the value it's mapped to in the current environment, which we will write as
|
||||
\(\rho(\texttt{x})\). This gives the following inference rule:
|
||||
|
||||
{{< latex >}}
|
||||
\frac{\rho(x) = v}{\rho, x \Downarrow v}
|
||||
{{< /latex >}}
|
||||
|
||||
All that's left is to define addition and subtraction. For an expression in the
|
||||
form \(e_1+e_2\), we first need to evaluate the two subexpressions \(e_1\)
|
||||
and \(e_2\), and then add the two resulting numbers. As a result, the addition
|
||||
rule includes two additional premises, one for evaluating each summand.
|
||||
|
||||
{{< latex >}}
|
||||
\frac
|
||||
{\rho, e_1 \Downarrow v_1 \quad \rho, e_2 \Downarrow v_2 \quad v_1 + v_2 = v}
|
||||
{\rho, e_1+e_2 \Downarrow v}
|
||||
{{< /latex >}}
|
||||
|
||||
The subtraction rule is similar. Below, I've configured an instance of
|
||||
[Bergamot]({{< relref "bergamot" >}}) to interpret these exact rules. Try
|
||||
typing various expressions like `1`, `1+1`, etc. into the input box below
|
||||
to see them evaluate. If you click the "Full Proof Tree" button, you can also view
|
||||
the exact rules that were used in computing a particular value. The variables
|
||||
`x`, `y`, and `z` are pre-defined for your convenience.
|
||||
|
||||
{{< bergamot_widget id="expr-widget" query="" prompt="eval(extend(extend(extend(empty, x, 17), y, 42), z, 0), TERM, ?v)" modes="Expression:Expression" >}}
|
||||
section "" {
|
||||
EvalNum @ eval(?rho, lit(?n), ?n) <- int(?n);
|
||||
EvalVar @ eval(?rho, var(?x), ?v) <- inenv(?x, ?v, ?rho);
|
||||
}
|
||||
section "" {
|
||||
EvalPlus @ eval(?rho, plus(?e_1, ?e_2), ?v) <- eval(?rho, ?e_1, ?v_1), eval(?rho, ?e_2, ?v_2), add(?v_1, ?v_2, ?v);
|
||||
EvalMinus @ eval(?rho, minus(?e_1, ?e_2), ?v) <- eval(?rho, ?e_1, ?v_1), eval(?rho, ?e_2, ?v_2), subtract(?v_1, ?v_2, ?v);
|
||||
}
|
||||
hidden section "" {
|
||||
EnvTake @ inenv(?x, ?v, extend(?rho, ?x, ?v)) <-;
|
||||
EnvSkip @ inenv(?x, ?v_1, extend(?rho, ?y, ?v_2)) <- inenv(?x, ?v_1, ?rho), not(symeq(?x, ?y));
|
||||
}
|
||||
{{< /bergamot_widget >}}
|
||||
|
||||
The Agda equivalent of this looks very similar to the rules themselves. I use
|
||||
`⇒ᵉ` instead of \(\Downarrow\), and there's a little bit of tedium with
|
||||
wrapping integers into a new `Value` type. I also used a (partial) relation
|
||||
`(x, v) ∈ ρ` instead of explicitly defining accessing an environment, since
|
||||
it is conceivable for a user to attempt accessing a variable that has not
|
||||
been assigned to. Aside from these notational changes, the structure of each
|
||||
of the constructors of the evaluation data type matches the inference rules
|
||||
I showed above.
|
||||
|
||||
{{< codelines "Agda" "agda-spa/Language/Semantics.agda" 27 35 >}}
|
||||
|
||||
#### Simple Statements
|
||||
The main difference between formalizing (simple and "normal") statements is
|
||||
that they modify the environment. If `x` has one value, writing `x = x + 1` will
|
||||
certainly change that value. On the other hands, statements don't produce values.
|
||||
So, we will be writing claims like \(\rho_1 , \textit{bs} \Rightarrow \rho_2\)
|
||||
to say that the basic statement \(\textit{bs}\), when starting in environment
|
||||
\(\rho_1\), will produce environment \(\rho_2\). Here's an example:
|
||||
|
||||
{{< latex >}}
|
||||
\{ \texttt{x} \mapsto 42, \texttt{y} \mapsto 17 \}, \ \texttt{x = x - \text{1}} \Rightarrow \{ \texttt{x} \mapsto 41, \texttt{y} \mapsto 17 \}
|
||||
{{< /latex >}}
|
||||
|
||||
Here, we subtracted one from a variable with value `42`, leaving it with a new
|
||||
value of `41`.
|
||||
|
||||
There are two basic statements, and one of them quite literally does nothing.
|
||||
The inference rule for `noop` is very simple:
|
||||
|
||||
{{< latex >}}
|
||||
\rho,\ \texttt{noop} \Rightarrow \rho
|
||||
{{< /latex >}}
|
||||
|
||||
For the assignment rule, we need to know how to evaluate the expression on the
|
||||
right side of the equal sign. This is why we needed to define the semantics
|
||||
of expressions first. Given those, the evaluation rule for assignment is as
|
||||
follows, with \(\rho[x \mapsto v]\) meaning "the environment \(\rho\) but
|
||||
mapping the variable \(x\) to value \(v\)".
|
||||
|
||||
{{< latex >}}
|
||||
\frac
|
||||
{\rho, e \Downarrow v}
|
||||
{\rho, x = e \Rightarrow \rho[x \mapsto v]}
|
||||
{{< /latex >}}
|
||||
|
||||
Those are actually all the rules we need, and below, I am once again configuring
|
||||
a Bergamot instance, this time with simple statements. Try out `noop` or some
|
||||
sort of variable assignment, like `x = x + 1`.
|
||||
|
||||
{{< bergamot_widget id="basic-stmt-widget" query="" prompt="stepbasic(extend(extend(extend(empty, x, 17), y, 42), z, 0), TERM, ?env)" modes="Basic Statement:Basic Statement" >}}
|
||||
hidden section "" {
|
||||
EvalNum @ eval(?rho, lit(?n), ?n) <- int(?n);
|
||||
EvalVar @ eval(?rho, var(?x), ?v) <- inenv(?x, ?v, ?rho);
|
||||
}
|
||||
hidden section "" {
|
||||
EvalPlus @ eval(?rho, plus(?e_1, ?e_2), ?v) <- eval(?rho, ?e_1, ?v_1), eval(?rho, ?e_2, ?v_2), add(?v_1, ?v_2, ?v);
|
||||
EvalMinus @ eval(?rho, minus(?e_1, ?e_2), ?v) <- eval(?rho, ?e_1, ?v_1), eval(?rho, ?e_2, ?v_2), subtract(?v_1, ?v_2, ?v);
|
||||
}
|
||||
section "" {
|
||||
StepNoop @ stepbasic(?rho, noop, ?rho) <-;
|
||||
StepAssign @ stepbasic(?rho, assign(?x, ?e), extend(?rho, ?x, ?v)) <- eval(?rho, ?e, ?v);
|
||||
}
|
||||
hidden section "" {
|
||||
EnvTake @ inenv(?x, ?v, extend(?rho, ?x, ?v)) <-;
|
||||
EnvSkip @ inenv(?x, ?v_1, extend(?rho, ?y, ?v_2)) <- inenv(?x, ?v_1, ?rho), not(symeq(?x, ?y));
|
||||
}
|
||||
{{< /bergamot_widget >}}
|
||||
|
||||
The Agda implementation is once again just a data type with constructors-for-rules.
|
||||
This time they also look quite similar to the rules I've shown up until now,
|
||||
though I continue to explicitly quantify over variables like `ρ`.
|
||||
|
||||
{{< codelines "Agda" "agda-spa/Language/Semantics.agda" 37 40 >}}
|
||||
|
||||
#### Statements
|
||||
|
||||
Let's work on non-simple statements next. The easiest rule to define is probably
|
||||
sequencing. When we use `then` (or `;`) to combine two statements, what we
|
||||
actually want is to execute the first statement, which may change variables,
|
||||
and then execute the second statement while keeping the changes from the first.
|
||||
This means there are three environments: \(\rho_1\) for the initial state before
|
||||
either statement is executed, \(\rho_2\) for the state between executing the
|
||||
first and second statement, and \(\rho_3\) for the final state after both
|
||||
are done executing. This leads to the following rule:
|
||||
|
||||
{{< latex >}}
|
||||
\frac
|
||||
{ \rho_1, s_1 \Rightarrow \rho_2 \quad \rho_2, s_2 \Rightarrow \rho_3 }
|
||||
{ \rho_1, s_1; s_2 \Rightarrow \rho_3 }
|
||||
{{< /latex >}}
|
||||
|
||||
We will actually need two rules to evaluate the conditional statement: one
|
||||
for when the condition evaluates to "true", and one for when the condition
|
||||
evaluates to "false". Only, I never specified booleans as being part of
|
||||
the language, which means that we will need to come up what "false" and "true"
|
||||
are. I will take my cue from C++ and use zero as "false", and any other number
|
||||
as "true".
|
||||
|
||||
If the condition of an `if`-`else` statement is "true" (nonzero), then the
|
||||
effect of executing the `if`-`else` should be the same as executing the "then"
|
||||
part of the statement, while completely ignoring the "else" part.
|
||||
|
||||
{{< latex >}}
|
||||
\frac
|
||||
{ \rho_1 , e \Downarrow v \quad v \neq 0 \quad \rho_1, s_1 \Rightarrow \rho_2}
|
||||
{ \rho_1, \textbf{if}\ e\ \{ s_1 \}\ \textbf{else}\ \{ s_2 \}\ \Rightarrow \rho_2 }
|
||||
{{< /latex >}}
|
||||
|
||||
Notice that in the above rule, we used the evaluation judgement \(\rho_1, e \Downarrow v\)
|
||||
to evaluate the _expression_ that serves as the condition. We then had an
|
||||
additional premise that requires the truthiness of the resulting value \(v\).
|
||||
The rule for evaluating a conditional with a "false" branch is very similar.
|
||||
|
||||
|
||||
{{< latex >}}
|
||||
\frac
|
||||
{ \rho_1 , e \Downarrow v \quad v = 0 \quad \rho_1, s_2 \Rightarrow \rho_2}
|
||||
{ \rho_1, \textbf{if}\ e\ \{ s_1 \}\ \textbf{else}\ \{ s_2 \}\ \Rightarrow \rho_2 }
|
||||
{{< /latex >}}
|
||||
|
||||
{{< bergamot_widget id="stmt-widget" query="" prompt="step(extend(extend(extend(empty, x, 17), y, 42), z, 0), TERM, ?env)" modes="Statement:Statement" >}}
|
||||
hidden section "" {
|
||||
EvalNum @ eval(?rho, lit(?n), ?n) <- int(?n);
|
||||
EvalVar @ eval(?rho, var(?x), ?v) <- inenv(?x, ?v, ?rho);
|
||||
}
|
||||
hidden section "" {
|
||||
EvalPlus @ eval(?rho, plus(?e_1, ?e_2), ?v) <- eval(?rho, ?e_1, ?v_1), eval(?rho, ?e_2, ?v_2), add(?v_1, ?v_2, ?v);
|
||||
EvalMinus @ eval(?rho, minus(?e_1, ?e_2), ?v) <- eval(?rho, ?e_1, ?v_1), eval(?rho, ?e_2, ?v_2), subtract(?v_1, ?v_2, ?v);
|
||||
}
|
||||
hidden section "" {
|
||||
StepNoop @ stepbasic(?rho, noop, ?rho) <-;
|
||||
StepAssign @ stepbasic(?rho, assign(?x, ?e), extend(?rho, ?x, ?v)) <- eval(?rho, ?e, ?v);
|
||||
}
|
||||
hidden section "" {
|
||||
StepNoop @ stepbasic(?rho, noop, ?rho) <-;
|
||||
StepAssign @ stepbasic(?rho, assign(?x, ?e), extend(?rho, ?x, ?v)) <- eval(?rho, ?e, ?v);
|
||||
}
|
||||
hidden section "" {
|
||||
StepLiftBasic @ step(?rho_1, ?s, ?rho_2) <- stepbasic(?rho_1, ?s, ?rho_2);
|
||||
}
|
||||
section "" {
|
||||
StepIfTrue @ step(?rho_1, if(?e, ?s_1, ?s_2), ?rho_2) <- eval(?rho_1, ?e, ?v), not(numeq(?v, 0)), step(?rho_1, ?s_1, ?rho_2);
|
||||
StepIfFalse @ step(?rho_1, if(?e, ?s_1, ?s_2), ?rho_2) <- eval(?rho_1, ?e, ?v), numeq(?v, 0), step(?rho_1, ?s_2, ?rho_2);
|
||||
StepWhileTrue @ step(?rho_1, while(?e, ?s), ?rho_3) <- eval(?rho_1, ?e, ?v), not(numeq(?v, 0)), step(?rho_1, ?s, ?rho_2), step(?rho_2, while(?e, ?s), ?rho_3);
|
||||
StepWhileFalse @ step(?rho_1, while(?e, ?s), ?rho_1) <- eval(?rho_1, ?e, ?v), numeq(?v, 0);
|
||||
StepSeq @ step(?rho_1, seq(?s_1, ?s_2), ?rho_3) <- step(?rho_1, ?s_1, ?rho_2), step(?rho_2, ?s_2, ?rho_3);
|
||||
}
|
||||
hidden section "" {
|
||||
EnvTake @ inenv(?x, ?v, extend(?rho, ?x, ?v)) <-;
|
||||
EnvSkip @ inenv(?x, ?v_1, extend(?rho, ?y, ?v_2)) <- inenv(?x, ?v_1, ?rho), not(symeq(?x, ?y));
|
||||
}
|
||||
{{< /bergamot_widget >}}
|
||||
|
||||
Reference in New Issue
Block a user