Write the semantics section using Bergamot

Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2024-10-13 13:31:47 -07:00
parent 77dade1d1d
commit 37dd9ad6d4
2 changed files with 382 additions and 0 deletions
--- a/content/blog/05_spa_agda_semantics/index.md
+++ b/content/blog/05_spa_agda_semantics/index.md
@@ -5,6 +5,17 @@ description: "In this post, I define the language that well serve as the object
 date: 2024-08-10T17:37:43-07:00
 tags: ["Agda", "Programming Languages"]
 draft: true
+custom_js: ["parser.js"]
+bergamot:
+    render_presets:
+        default: "bergamot/rendering/imp.bergamot"
+    input_modes:
+        - name: "Expression"
+          fn: "window.parseExpr"
+        - name: "Basic Statement"
+          fn: "window.parseBasicStmt"
+        - name: "Statement"
+          fn: "window.parseStmt"
 ---

 In the previous several posts, I've formalized the notion of lattices, which
@@ -197,3 +208,246 @@ The Agda version is:

 Notice how we used `noop` to express the fact that the `else` branch of the
 conditional does nothing.
+
+### The Semantics of Our Language
+
+We now have all the language constructs that I'll be showing off --- because
+those are all the concepts that I've formalized. What's left is to define
+how they behave. We will do this using a logical tool called
+[_inference rules_](https://en.wikipedia.org/wiki/Rule_of_inference). I've
+written about them a number of times; they're ubiquitous, particularly in the
+sorts of things I like explore on this site. The [section on inference rules]({{< relref "01_aoc_coq#inference-rules" >}})
+from my Advent of Code series is pretty relevant, and [the notation section from
+a post in my compiler series]({{< relref "03_compiler_typechecking#some-notation" >}}) says
+much the same thing; I won't be re-describing them here.
+
+There are three pieces which demand semantics: expressions, simple statements,
+and non-simple statements. The semantics of each of the three requires the
+semantics of the items that precede it. We will therefore start with expressions.
+
+#### Expressions
+
+The trickiest thing about expression is that the value of an expression depends
+on the "context": `x+1` can evaluate to `43` if `x` is `42`, or it can evaluate
+to `0` if `x` is `-1`. To evaluate an expression, we will therefore need to
+assign values to all of the variables in that expression. A mapping that
+assigns values to variables is typically called an _environment_. We will write
+\(\varnothing\) for "empty environment", and \(\{\texttt{x} \mapsto 42, \texttt{y} \mapsto -1 \}\) for
+an environment that maps the variable \(\texttt{x}\) to 42, and the variable \(\texttt{y}\) to -1.
+
+Now, a bit of notation. We will use the letter \(\rho\) to represent environments
+(and if several environments are involved, we will occasionally number them
+as \(\rho_1\), \(\rho_2\), etc.) We will use the letter \(e\) to stand for
+expressions, and the letter \(v\) to stand for values. Finally, we'll write
+\(\rho, e \Downarrow v\) to say that "in an environment \(\rho\), expression \(e\)
+evaluates to value \(v\)". Our two previous examples of evaluating `x+1` can
+thus be written as follows:
+
+{{< latex >}}
+\{ \texttt{x} \mapsto 42 \}, \texttt{x}+1 \Downarrow 43 \\
+\{ \texttt{x} \mapsto -1 \}, \texttt{x}+1 \Downarrow 0 \\
+{{< /latex >}}
+
+Now, on to the actual rules for how to evaluate expressions. Most simply,
+integer literals `1` just evaluate to themselves.
+
+{{< latex >}}
+\frac{n \in \text{Int}}{\rho, n \Downarrow n}
+{{< /latex >}}
+
+Note that the letter \(\rho\) is completely unused in the above rule. That's
+because no matter what values _variables_ have, a number still evaluates to
+the same value. As we've already established, the same is not true for a
+variable like \(\texttt{x}\). To evaluate such a variable, we need to retrieve
+the value it's mapped to in the current environment, which we will write as
+\(\rho(\texttt{x})\). This gives the following inference rule:
+
+{{< latex >}}
+\frac{\rho(x) = v}{\rho, x \Downarrow v}
+{{< /latex >}}
+
+All that's left is to define addition and subtraction. For an expression in the
+form \(e_1+e_2\), we first need to evaluate the two subexpressions \(e_1\)
+and \(e_2\), and then add the two resulting numbers. As a result, the addition
+rule includes two additional premises, one for evaluating each summand.
+
+{{< latex >}}
+\frac
+    {\rho, e_1 \Downarrow v_1 \quad \rho, e_2 \Downarrow v_2 \quad v_1 + v_2 = v}
+    {\rho, e_1+e_2 \Downarrow v}
+{{< /latex >}}
+
+The subtraction rule is similar. Below, I've configured an instance of
+[Bergamot]({{< relref "bergamot" >}}) to interpret these exact rules. Try
+typing various expressions like `1`, `1+1`, etc. into the input box below
+to see them evaluate. If you click the "Full Proof Tree" button, you can also view
+the exact rules that were used in computing a particular value. The variables
+`x`, `y`, and `z` are pre-defined for your convenience.
+
+{{< bergamot_widget id="expr-widget" query="" prompt="eval(extend(extend(extend(empty, x, 17), y, 42), z, 0), TERM, ?v)" modes="Expression:Expression" >}}
+section "" {
+  EvalNum @ eval(?rho, lit(?n), ?n) <- int(?n);
+  EvalVar @ eval(?rho, var(?x), ?v) <- inenv(?x, ?v, ?rho);
+}
+section "" {
+  EvalPlus @ eval(?rho, plus(?e_1, ?e_2), ?v) <- eval(?rho, ?e_1, ?v_1), eval(?rho, ?e_2, ?v_2), add(?v_1, ?v_2, ?v);
+  EvalMinus @ eval(?rho, minus(?e_1, ?e_2), ?v) <- eval(?rho, ?e_1, ?v_1), eval(?rho, ?e_2, ?v_2), subtract(?v_1, ?v_2, ?v);
+}
+hidden section "" {
+  EnvTake @ inenv(?x, ?v, extend(?rho, ?x, ?v)) <-;
+  EnvSkip @ inenv(?x, ?v_1, extend(?rho, ?y, ?v_2)) <- inenv(?x, ?v_1, ?rho), not(symeq(?x, ?y));
+}
+{{< /bergamot_widget >}}
+
+The Agda equivalent of this looks very similar to the rules themselves. I use
+`⇒ᵉ` instead of \(\Downarrow\), and there's a little bit of tedium with
+wrapping integers into a new `Value` type. I also used a (partial) relation
+`(x, v) ∈ ρ` instead of explicitly defining accessing an environment, since
+it is conceivable for a user to attempt accessing a variable that has not
+been assigned to. Aside from these notational changes, the structure of each
+of the constructors of the evaluation data type matches the inference rules
+I showed above.
+
+{{< codelines "Agda" "agda-spa/Language/Semantics.agda" 27 35 >}}
+
+#### Simple Statements
+The main difference between formalizing (simple and "normal") statements is
+that they modify the environment. If `x` has one value, writing `x = x + 1` will
+certainly change that value. On the other hands, statements don't produce values.
+So, we will be writing claims like \(\rho_1 , \textit{bs} \Rightarrow \rho_2\)
+to say that the basic statement \(\textit{bs}\), when starting in environment
+\(\rho_1\), will produce environment \(\rho_2\). Here's an example:
+
+{{< latex >}}
+\{  \texttt{x} \mapsto 42, \texttt{y} \mapsto 17 \}, \  \texttt{x = x - \text{1}} \Rightarrow \{  \texttt{x} \mapsto 41, \texttt{y} \mapsto 17 \}
+{{< /latex >}}
+
+Here, we subtracted one from a variable with value `42`, leaving it with a new
+value of `41`.
+
+There are two basic statements, and one of them quite literally does nothing.
+The inference rule for `noop` is very simple:
+
+{{< latex >}}
+\rho,\ \texttt{noop} \Rightarrow \rho
+{{< /latex >}}
+
+For the assignment rule, we need to know how to evaluate the expression on the
+right side of the equal sign. This is why we needed to define the semantics
+of expressions first. Given those, the evaluation rule for assignment is as
+follows, with \(\rho[x \mapsto v]\) meaning "the environment \(\rho\) but
+mapping the variable \(x\) to value \(v\)".
+
+{{< latex >}}
+\frac
+  {\rho, e \Downarrow v}
+  {\rho, x = e \Rightarrow \rho[x \mapsto v]}
+{{< /latex >}}
+
+Those are actually all the rules we need, and below, I am once again configuring
+a Bergamot instance, this time with simple statements. Try out `noop` or some
+sort of variable assignment, like `x = x + 1`.
+
+{{< bergamot_widget id="basic-stmt-widget" query="" prompt="stepbasic(extend(extend(extend(empty, x, 17), y, 42), z, 0), TERM, ?env)" modes="Basic Statement:Basic Statement" >}}
+hidden section "" {
+  EvalNum @ eval(?rho, lit(?n), ?n) <- int(?n);
+  EvalVar @ eval(?rho, var(?x), ?v) <- inenv(?x, ?v, ?rho);
+}
+hidden section "" {
+  EvalPlus @ eval(?rho, plus(?e_1, ?e_2), ?v) <- eval(?rho, ?e_1, ?v_1), eval(?rho, ?e_2, ?v_2), add(?v_1, ?v_2, ?v);
+  EvalMinus @ eval(?rho, minus(?e_1, ?e_2), ?v) <- eval(?rho, ?e_1, ?v_1), eval(?rho, ?e_2, ?v_2), subtract(?v_1, ?v_2, ?v);
+}
+section "" {
+  StepNoop @ stepbasic(?rho, noop, ?rho) <-;
+  StepAssign @ stepbasic(?rho, assign(?x, ?e), extend(?rho, ?x, ?v)) <- eval(?rho, ?e, ?v);
+}
+hidden section "" {
+  EnvTake @ inenv(?x, ?v, extend(?rho, ?x, ?v)) <-;
+  EnvSkip @ inenv(?x, ?v_1, extend(?rho, ?y, ?v_2)) <- inenv(?x, ?v_1, ?rho), not(symeq(?x, ?y));
+}
+{{< /bergamot_widget >}}
+
+The Agda implementation is once again just a data type with constructors-for-rules.
+This time they also look quite similar to the rules I've shown up until now,
+though I continue to explicitly quantify over variables like `ρ`.
+
+{{< codelines "Agda" "agda-spa/Language/Semantics.agda" 37 40 >}}
+
+#### Statements
+
+Let's work on non-simple statements next. The easiest rule to define is probably
+sequencing. When we use `then` (or `;`) to combine two statements, what we
+actually want is to execute the first statement, which may change variables,
+and then execute the second statement while keeping the changes from the first.
+This means there are three environments: \(\rho_1\) for the initial state before
+either statement is executed, \(\rho_2\) for the state between executing the
+first and second statement, and \(\rho_3\) for the final state after both
+are done executing. This leads to the following rule:
+
+{{< latex >}}
+\frac
+    { \rho_1, s_1 \Rightarrow \rho_2 \quad \rho_2, s_2 \Rightarrow \rho_3 }
+    { \rho_1, s_1; s_2 \Rightarrow \rho_3 }
+{{< /latex >}}
+
+We will actually need two rules to evaluate the conditional statement: one
+for when the condition evaluates to "true", and one for when the condition
+evaluates to "false". Only, I never specified booleans as being part of
+the language, which means that we will need to come up what "false" and "true"
+are. I will take my cue from C++ and use zero as "false", and any other number
+as "true".
+
+If the condition of an `if`-`else` statement is "true" (nonzero), then the
+effect of executing the `if`-`else` should be the same as executing the "then"
+part of the statement, while completely ignoring the "else" part.
+
+{{< latex >}}
+\frac
+    { \rho_1 , e \Downarrow v \quad v \neq 0 \quad \rho_1, s_1 \Rightarrow \rho_2}
+    { \rho_1, \textbf{if}\ e\ \{ s_1 \}\ \textbf{else}\ \{ s_2 \}\ \Rightarrow \rho_2 }
+{{< /latex >}}
+
+Notice that in the above rule, we used the evaluation judgement \(\rho_1, e \Downarrow v\)
+to evaluate the _expression_ that serves as the condition. We then had an
+additional premise that requires the truthiness of the resulting value \(v\).
+The rule for evaluating a conditional with a "false" branch is very similar.
+
+
+{{< latex >}}
+\frac
+    { \rho_1 , e \Downarrow v \quad v = 0 \quad \rho_1, s_2 \Rightarrow \rho_2}
+    { \rho_1, \textbf{if}\ e\ \{ s_1 \}\ \textbf{else}\ \{ s_2 \}\ \Rightarrow \rho_2 }
+{{< /latex >}}
+
+{{< bergamot_widget id="stmt-widget" query="" prompt="step(extend(extend(extend(empty, x, 17), y, 42), z, 0), TERM, ?env)" modes="Statement:Statement" >}}
+hidden section "" {
+  EvalNum @ eval(?rho, lit(?n), ?n) <- int(?n);
+  EvalVar @ eval(?rho, var(?x), ?v) <- inenv(?x, ?v, ?rho);
+}
+hidden section "" {
+  EvalPlus @ eval(?rho, plus(?e_1, ?e_2), ?v) <- eval(?rho, ?e_1, ?v_1), eval(?rho, ?e_2, ?v_2), add(?v_1, ?v_2, ?v);
+  EvalMinus @ eval(?rho, minus(?e_1, ?e_2), ?v) <- eval(?rho, ?e_1, ?v_1), eval(?rho, ?e_2, ?v_2), subtract(?v_1, ?v_2, ?v);
+}
+hidden section "" {
+  StepNoop @ stepbasic(?rho, noop, ?rho) <-;
+  StepAssign @ stepbasic(?rho, assign(?x, ?e), extend(?rho, ?x, ?v)) <- eval(?rho, ?e, ?v);
+}
+hidden section "" {
+  StepNoop @ stepbasic(?rho, noop, ?rho) <-;
+  StepAssign @ stepbasic(?rho, assign(?x, ?e), extend(?rho, ?x, ?v)) <- eval(?rho, ?e, ?v);
+}
+hidden section "" {
+  StepLiftBasic @ step(?rho_1, ?s, ?rho_2) <- stepbasic(?rho_1, ?s, ?rho_2);
+}
+section "" {
+  StepIfTrue @ step(?rho_1, if(?e, ?s_1, ?s_2), ?rho_2) <- eval(?rho_1, ?e, ?v), not(numeq(?v, 0)), step(?rho_1, ?s_1, ?rho_2);
+  StepIfFalse @ step(?rho_1, if(?e, ?s_1, ?s_2), ?rho_2) <- eval(?rho_1, ?e, ?v), numeq(?v, 0), step(?rho_1, ?s_2, ?rho_2);
+  StepWhileTrue @ step(?rho_1, while(?e, ?s), ?rho_3) <- eval(?rho_1, ?e, ?v), not(numeq(?v, 0)), step(?rho_1, ?s, ?rho_2), step(?rho_2, while(?e, ?s), ?rho_3);
+  StepWhileFalse @ step(?rho_1, while(?e, ?s), ?rho_1) <- eval(?rho_1, ?e, ?v), numeq(?v, 0);
+  StepSeq @ step(?rho_1, seq(?s_1, ?s_2), ?rho_3) <- step(?rho_1, ?s_1, ?rho_2), step(?rho_2, ?s_2, ?rho_3);
+}
+hidden section "" {
+  EnvTake @ inenv(?x, ?v, extend(?rho, ?x, ?v)) <-;
+  EnvSkip @ inenv(?x, ?v_1, extend(?rho, ?y, ?v_2)) <- inenv(?x, ?v_1, ?rho), not(symeq(?x, ?y));
+}
+{{< /bergamot_widget >}}