Compare commits

..

7 Commits

Author SHA1 Message Date
9fc2d16fb8 Adjust dates of part 4 and part 5 to match real publication dates
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2024-11-03 17:51:13 -08:00
f00c69f02c Edit and publish SPA part 5 2024-11-03 17:50:11 -08:00
4fc1191d13 Proofread and publish part 4
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2024-11-03 17:37:44 -08:00
951aafc90a Stop using 'symeq'
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2024-11-03 17:12:39 -08:00
ee13409b33 Add a visualization of two ASTs
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2024-11-03 17:06:19 -08:00
615aeb72da Finish most of part 5 of SPA series
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2024-11-03 16:58:06 -08:00
3be67ca4c8 Use unification-based 'eq' for numbers and symbols
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2024-11-03 16:05:04 -08:00
3 changed files with 125 additions and 27 deletions

View File

@@ -7,10 +7,12 @@ SelectHead @ select(cons([?t, ?v], ?rest), ?default, ?v) <- ?t;
SelectTail @ select(cons([?t, ?v], ?rest), ?default, ?found) <- not(?t), select(?rest, ?default, ?found);
SelectEmpty @ select(nil, ?default, ?default) <-;
Eq @ eq(?x, ?x) <-;
ParenthAssocLeft @ parenthassoc(?a_i, left, right) <-;
ParenthAssocRight @ parenthassoc(?a_i, right, left) <-;
ParenthAssocNone @ parenthassoc(?a_i, none, ?pos) <-;
ParenthAssocNeq @ parenthassoc(?a_i, ?a_o, ?pos) <- not(symeq(?a_i, ?a_o));
ParenthAssocNeq @ parenthassoc(?a_i, ?a_o, ?pos) <- not(eq(?a_i, ?a_o));
Parenth @ parenth(?inner, ?outer, ?pos, ?strin, ?strout) <-
prec(?inner, ?p_i, ?a_i), prec(?outer, ?p_o, ?a_o),

View File

@@ -2,14 +2,14 @@
title: "Implementing and Verifying \"Static Program Analysis\" in Agda, Part 4: The Fixed-Point Algorithm"
series: "Static Program Analysis in Agda"
description: "In this post, I give a top-level overview of my work on formally verified static analyses"
date: 2024-08-10T17:37:42-07:00
date: 2024-11-03T17:50:26-08:00
tags: ["Agda", "Programming Languages"]
draft: true
---
In the previous post we looked at lattices of finite height, which are a crucial
ingredient in our static analyses. In this post, I will describe the specific
algorithm that makes use of these lattices that will be at the core of it all.
algorithm that makes use of these lattices; this algorithm will be at the core
of this series.
Lattice-based static analyses tend to operate by iteratively combining facts
from the program into new ones. For instance, when analyzing `y = 1 + 2`, we
@@ -79,7 +79,7 @@ arrange the ones we've seen so far into a chain:
Each time we fail to find a fixed point, we add one element to our chain, growing
it. But if our lattice \(L\) has a finite height, that means eventually this
process will have to stop; the chain can't grow forever; eventually, we will
process will have to stop; the chain can't grow forever. Eventually, we will
have to find a value such that \(v = f(v)\). Thus, our algorithm is guaranteed
to terminate, and give a fixed point.
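To make this concrete (a display of my own, using the \(f^k(\bot)\) notation from the proof further down), the chain being grown is the sequence of iterates of \(f\) starting at \(\bot\):

{{< latex >}}
\bot \le f(\bot) \le f^2(\bot) \le \dots \le f^k(\bot) = f^{k+1}(\bot)
{{< /latex >}}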
@@ -119,7 +119,7 @@ it provides just a bit more gas than the maximum-length chain, which means that
if the gas is exhausted, we've certainly arrived at a contradiction. It also
provides an initial chain onto which `doStep` will keep tacking on new inequalities
as it finds them. Since we haven't found any yet, this is the single-element
chain of \(\bot\). The last thing it does is set up the reursion invariant
chain of \(\bot\). The last thing it does is set up the recursion invariant
(that the sum of the gas and the chain length is constant), and provides
a proof that \(\bot \le f(\bot)\). This function always returns a fixed point.
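As a rough sketch of the gas idea (my own code, not the post's actual function; the name `iterate` and the boolean-equality argument `_≈_` are assumptions), iteration with explicit fuel might look like this:

```
open import Data.Nat using (ℕ; zero; suc)
open import Data.Bool using (Bool; if_then_else_)

-- Iterate f from a starting point (think ⊥), spending one unit of gas per
-- step; stop early once x ≈ f x, i.e. once a fixed point is reached.
iterate : {A : Set} → ℕ → (A → A) → (A → A → Bool) → A → A
iterate zero    f _≈_ x = x
iterate (suc g) f _≈_ x = if x ≈ f x then x else iterate g f (f x)
```

The post's real function additionally threads the chain and the invariant that gas plus chain length stays constant, so that running out of gas yields a contradiction rather than a fallback value.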
@@ -147,11 +147,11 @@ the lattice, we have \(\bot \le b\).
{{< latex >}}
\begin{array}{ccccccccr}
& & \bot & \le & & & b \quad \implies & \text{(monotonicity of}\ f \text{)}\\
& & f(\bot) & \le & f(b) & = & b \quad \implies & \text{(} b\ \text{is a fixed point, monotonicity of}\ f \text{)}\\
& & f^2(\bot) & \le & f(b) & = & b \quad \implies & \text{(} b\ \text{is a fixed point, monotonicity of}\ f \text{)}\\
& & \bot & \le & & & b & \quad \implies & \text{(monotonicity of}\ f \text{)}\\
& & f(\bot) & \le & f(b) & = & b & \quad \implies & \text{(} b\ \text{is a fixed point, monotonicity of}\ f \text{)}\\
& & f^2(\bot) & \le & f(b) & = & b & \quad \implies & \text{(} b\ \text{is a fixed point, monotonicity of}\ f \text{)}\\
\\
& & \vdots & & \vdots & & \vdots & \\
& & \vdots & & \vdots & & \vdots & & \\
\\
a & = & f^k(\bot) & \le & f(b) & = & b &
\end{array}
@@ -159,7 +159,7 @@ a & = & f^k(\bot) & \le & f(b) & = & b &
Because of the monotonicity of \(f\), each time we apply it, it preserves the
less-than relationship that started with \(\bot \le b\). Doing that \(k\) times,
we verify that \(a\) is our fixed point.
we verify that \(a\) is our least fixed point.
To convince Agda of this proof, we once again get in an argument with the termination
checker, which ends the same way it did last time: with us using the notion of 'gas'

View File

@@ -2,9 +2,8 @@
title: "Implementing and Verifying \"Static Program Analysis\" in Agda, Part 5: Our Programming Language"
series: "Static Program Analysis in Agda"
description: "In this post, I define the language that well serve as the object of our vartious analyses"
date: 2024-08-10T17:37:43-07:00
date: 2024-11-03T17:50:27-08:00
tags: ["Agda", "Programming Languages"]
draft: true
custom_js: ["parser.js"]
bergamot:
render_presets:
@@ -117,11 +116,31 @@ is called an [Abstract Syntax Tree](https://en.wikipedia.org/wiki/Abstract_syntax_tree)
Notably, though `2-(x+y)` has parentheses, our grammar above does not
include them as a case. The reason for this is that the structure of an
abstract syntax tree is sufficient to encode the order in which the operations
should be evaluated.
should be evaluated. Since I lack a nice way of drawing ASTs, I will use
ASCII drawings to show two examples.
{{< todo >}}
Probably two drawings of differently-associated ASTs here.
{{< /todo >}}
```
Expression: 2 - (x+y)

  (-)
 /   \
2    (+)
     /  \
    x    y

Expression: (2-x) + y

    (+)
   /   \
  (-)   y
 /   \
2     x
```
Above, in the first AST, `(+)` is a child of the `(-)` node, which means
that it's a subexpression. As a result, that subexpression is evaluated first,
before evaluating `(-)`, and so the AST represents `2-(x+y)`. In the other
example, `(-)` is a child of `(+)`, and is therefore evaluated first. The resulting
association encoded by that AST is `(2-x)+y`.
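To preview how such an AST looks as a datatype, here is a minimal sketch of my own (the constructor names `lit`, `var`, `plus`, and `minus` are assumptions, not the series' actual definitions):

```
open import Data.Integer using (ℤ; +_)
open import Agda.Builtin.String using (String)

-- One constructor per grammar case; nesting encodes evaluation order,
-- so no constructor for parentheses is needed.
data Expr : Set where
  lit   : ℤ → Expr
  var   : String → Expr
  plus  : Expr → Expr → Expr
  minus : Expr → Expr → Expr

-- 2 - (x+y): the (+) node is a child of (-), so it is evaluated first.
example : Expr
example = minus (lit (+ 2)) (plus (var "x") (var "y"))
```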
To an Agda programmer, the one-of-four-things definition above should read
quite similarly to the definition of an algebraic data type. Indeed, this
@@ -249,7 +268,7 @@ thus be written as follows:
{{< /latex >}}
Now, on to the actual rules for how to evaluate expressions. Most simply,
integer literals `1` just evaluate to themselves.
integer literals like `1` just evaluate to themselves.
{{< latex >}}
\frac{n \in \text{Int}}{\rho, n \Downarrow n}
@@ -285,6 +304,9 @@ the exact rules that were used in computing a particular value. The variables
`x`, `y`, and `z` are pre-defined for your convenience.
{{< bergamot_widget id="expr-widget" query="" prompt="eval(extend(extend(extend(empty, x, 17), y, 42), z, 0), TERM, ?v)" modes="Expression:Expression" >}}
hidden section "" {
Eq @ eq(?x, ?x) <-;
}
section "" {
EvalNum @ eval(?rho, lit(?n), ?n) <- int(?n);
EvalVar @ eval(?rho, var(?x), ?v) <- inenv(?x, ?v, ?rho);
@@ -295,7 +317,7 @@ section "" {
}
hidden section "" {
EnvTake @ inenv(?x, ?v, extend(?rho, ?x, ?v)) <-;
EnvSkip @ inenv(?x, ?v_1, extend(?rho, ?y, ?v_2)) <- inenv(?x, ?v_1, ?rho), not(symeq(?x, ?y));
EnvSkip @ inenv(?x, ?v_1, extend(?rho, ?y, ?v_2)) <- inenv(?x, ?v_1, ?rho), not(eq(?x, ?y));
}
{{< /bergamot_widget >}}
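The hidden `EnvTake` and `EnvSkip` rules above implement variable lookup with shadowing: the most recent binding for a name wins. A minimal Agda analogue (my own sketch; `Env` and `lookupEnv` are names I made up) might be:

```
open import Agda.Builtin.String using (String; primStringEquality)
open import Data.Bool using (if_then_else_)
open import Data.Integer using (ℤ)
open import Data.Maybe using (Maybe; just; nothing)

-- Mirrors the `empty`/`extend` environment terms used in the widget.
data Env : Set where
  empty  : Env
  extend : Env → String → ℤ → Env

-- EnvTake: the newest binding for x wins; EnvSkip: otherwise keep searching.
lookupEnv : String → Env → Maybe ℤ
lookupEnv x empty          = nothing
lookupEnv x (extend ρ y v) =
  if primStringEquality x y then just v else lookupEnv x ρ
```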
@@ -313,7 +335,7 @@ I showed above.
#### Simple Statements
The main difference when formalizing (simple and "normal") statements is that,
unlike expressions, they modify the environment. If `x` has one value, writing `x = x + 1` will
certainly change that value. On the other hands, statements don't produce values.
certainly change that value. On the other hand, statements don't produce values.
So, we will be writing claims like \(\rho_1 , \textit{bs} \Rightarrow \rho_2\)
to say that the basic statement \(\textit{bs}\), when starting in environment
\(\rho_1\), will produce environment \(\rho_2\). Here's an example:
@@ -349,6 +371,9 @@ a Bergamot instance, this time with simple statements. Try out `noop` or some
sort of variable assignment, like `x = x + 1`.
{{< bergamot_widget id="basic-stmt-widget" query="" prompt="stepbasic(extend(extend(extend(empty, x, 17), y, 42), z, 0), TERM, ?env)" modes="Basic Statement:Basic Statement" >}}
hidden section "" {
Eq @ eq(?x, ?x) <-;
}
hidden section "" {
EvalNum @ eval(?rho, lit(?n), ?n) <- int(?n);
EvalVar @ eval(?rho, var(?x), ?v) <- inenv(?x, ?v, ?rho);
@@ -363,7 +388,7 @@ section "" {
}
hidden section "" {
EnvTake @ inenv(?x, ?v, extend(?rho, ?x, ?v)) <-;
EnvSkip @ inenv(?x, ?v_1, extend(?rho, ?y, ?v_2)) <- inenv(?x, ?v_1, ?rho), not(symeq(?x, ?y));
EnvSkip @ inenv(?x, ?v_1, extend(?rho, ?y, ?v_2)) <- inenv(?x, ?v_1, ?rho), not(eq(?x, ?y));
}
{{< /bergamot_widget >}}
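As a sketch of the kind of rule the widget is running (my rendering in the notation above, not necessarily the post's exact rule), assignment evaluates its right-hand side and extends the environment with the result:

{{< latex >}}
\frac
{ \rho , e \Downarrow v }
{ \rho , x = e \Rightarrow \rho[x \mapsto v] }
{{< /latex >}}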
@@ -412,14 +437,52 @@ to evaluate the _expression_ that serves as the condition. We then had an
additional premise that requires the truthiness of the resulting value \(v\).
The rule for evaluating a conditional whose condition is "false" is very similar.
{{< latex >}}
\frac
{ \rho_1 , e \Downarrow v \quad v = 0 \quad \rho_1, s_2 \Rightarrow \rho_2}
{ \rho_1, \textbf{if}\ e\ \{ s_1 \}\ \textbf{else}\ \{ s_2 \}\ \Rightarrow \rho_2 }
{{< /latex >}}
Now that we have rules for conditional statements, it will be surprisingly easy
to define the rules for `while` loops. A `while` loop will also have two rules,
one for when its condition is truthy and one for when it's falsey. However,
unlike a conditional in the "false" case, a `while` loop with a false condition
will do nothing, leaving the environment unchanged:
{{< latex >}}
\frac
{ \rho_1 , e \Downarrow v \quad v = 0 }
{ \rho_1 , \textbf{while}\ e\ \{ s \}\ \Rightarrow \rho_1 }
{{< /latex >}}
The trickiest rule is for when the condition of a `while` loop is true.
We evaluate the body once, starting in environment \(\rho_1\) and finishing
in \(\rho_2\), but then we're not done. In fact, we have to go back to the top,
and check the condition again, starting over. As a result, we include another
premise, which tells us that evaluating the loop starting at \(\rho_2\)
eventually ends in state \(\rho_3\). This encodes the "rest of the iterations"
in addition to the one we just performed. The environment \(\rho_3\) is our
final state, so that's what we use in the rule's conclusion.
{{< latex >}}
\frac
{ \rho_1 , e \Downarrow v \quad v \neq 0 \quad \rho_1 , s \Rightarrow \rho_2 \quad \rho_2 , \textbf{while}\ e\ \{ s \}\ \Rightarrow \rho_3 }
{ \rho_1 , \textbf{while}\ e\ \{ s \}\ \Rightarrow \rho_3 }
{{< /latex >}}
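To preview how rules like these can be encoded in Agda (a sketch of my own with postulated types, not the contents of the series' `Language/Semantics.agda`), the two `while` rules become constructors of an inductive big-step relation:

```
open import Data.Nat using (ℕ)
open import Relation.Binary.PropositionalEquality using (_≡_)
open import Relation.Nullary using (¬_)

postulate
  Env Expr Stmt : Set
  while : Expr → Stmt → Stmt
  _,_⇓_ : Env → Expr → ℕ → Set  -- expression evaluation (values as ℕ for brevity)

-- ρ₁ , s ⇒ ρ₂ : running statement s in environment ρ₁ produces ρ₂.
data _,_⇒_ : Env → Stmt → Env → Set where
  while-false : ∀ {ρ e s}
    → ρ , e ⇓ 0
    → ρ , while e s ⇒ ρ
  while-true : ∀ {ρ₁ ρ₂ ρ₃ e s v}
    → ρ₁ , e ⇓ v
    → ¬ (v ≡ 0)
    → ρ₁ , s ⇒ ρ₂
    → ρ₂ , while e s ⇒ ρ₃
    → ρ₁ , while e s ⇒ ρ₃
```

Note how `while-true`'s recursive premise `ρ₂ , while e s ⇒ ρ₃` directly mirrors the "rest of the iterations" premise in the rule above.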
And that's it! We have now seen every rule that defines the little object language
I've been using for my Agda work. Below is a Bergamot widget that implements
these rules. Try the following program, which computes the `x`th power of two,
and stores it in `y` (with `x` set to `5` below, `y` ends up as \(2^5 = 32\)):
```
x = 5; y = 1; while (x) { y = y + y; x = x - 1 }
```
{{< bergamot_widget id="stmt-widget" query="" prompt="step(extend(extend(extend(empty, x, 17), y, 42), z, 0), TERM, ?env)" modes="Statement:Statement" >}}
hidden section "" {
Eq @ eq(?x, ?x) <-;
}
hidden section "" {
EvalNum @ eval(?rho, lit(?n), ?n) <- int(?n);
EvalVar @ eval(?rho, var(?x), ?v) <- inenv(?x, ?v, ?rho);
@@ -440,14 +503,47 @@ hidden section "" {
StepLiftBasic @ step(?rho_1, ?s, ?rho_2) <- stepbasic(?rho_1, ?s, ?rho_2);
}
section "" {
StepIfTrue @ step(?rho_1, if(?e, ?s_1, ?s_2), ?rho_2) <- eval(?rho_1, ?e, ?v), not(numeq(?v, 0)), step(?rho_1, ?s_1, ?rho_2);
StepIfFalse @ step(?rho_1, if(?e, ?s_1, ?s_2), ?rho_2) <- eval(?rho_1, ?e, ?v), numeq(?v, 0), step(?rho_1, ?s_2, ?rho_2);
StepWhileTrue @ step(?rho_1, while(?e, ?s), ?rho_3) <- eval(?rho_1, ?e, ?v), not(numeq(?v, 0)), step(?rho_1, ?s, ?rho_2), step(?rho_2, while(?e, ?s), ?rho_3);
StepWhileFalse @ step(?rho_1, while(?e, ?s), ?rho_1) <- eval(?rho_1, ?e, ?v), numeq(?v, 0);
StepIfTrue @ step(?rho_1, if(?e, ?s_1, ?s_2), ?rho_2) <- eval(?rho_1, ?e, ?v), not(eq(?v, 0)), step(?rho_1, ?s_1, ?rho_2);
StepIfFalse @ step(?rho_1, if(?e, ?s_1, ?s_2), ?rho_2) <- eval(?rho_1, ?e, ?v), eq(?v, 0), step(?rho_1, ?s_2, ?rho_2);
StepWhileTrue @ step(?rho_1, while(?e, ?s), ?rho_3) <- eval(?rho_1, ?e, ?v), not(eq(?v, 0)), step(?rho_1, ?s, ?rho_2), step(?rho_2, while(?e, ?s), ?rho_3);
StepWhileFalse @ step(?rho_1, while(?e, ?s), ?rho_1) <- eval(?rho_1, ?e, ?v), eq(?v, 0);
StepSeq @ step(?rho_1, seq(?s_1, ?s_2), ?rho_3) <- step(?rho_1, ?s_1, ?rho_2), step(?rho_2, ?s_2, ?rho_3);
}
hidden section "" {
EnvTake @ inenv(?x, ?v, extend(?rho, ?x, ?v)) <-;
EnvSkip @ inenv(?x, ?v_1, extend(?rho, ?y, ?v_2)) <- inenv(?x, ?v_1, ?rho), not(symeq(?x, ?y));
EnvSkip @ inenv(?x, ?v_1, extend(?rho, ?y, ?v_2)) <- inenv(?x, ?v_1, ?rho), not(eq(?x, ?y));
}
{{< /bergamot_widget >}}
As with all the other rules we've seen, the mathematical notation above can
be directly translated into Agda:
{{< codelines "Agda" "agda-spa/Language/Semantics.agda" 47 64 >}}
### Semantics as Ground Truth
Prior to this post, we had been talking about using lattices and monotone
functions for program analysis. The key problem with using this framework to
define analyses is that there are many monotone functions that produce complete
nonsense; their output is, at best, unrelated to the program they're supposed
to analyze. We don't want to write such functions, since having incorrect
information about the programs in question is unhelpful.
What does it mean for a function to produce correct information, though?
In the context of sign analysis, it would mean that if we say a variable `x` is `+`,
then evaluating the program will leave us in a state in which `x` is positive.
The semantics we defined in this post give us the "evaluating the program" piece.
They establish what the programs _actually_ do, and we can use this ground
truth when checking that our analyses are correct. In subsequent posts, I will
prove the exact property I informally stated above: __for the program analyses
we define, things they "claim" about our program will match what actually happens
when executing the program using our semantics__.
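As a rough sketch in this post's notation (mine, not the series' eventual formal statement), the sign-analysis instance of that property might read: if the analysis claims `x` is `+`, and running statement \(s\) in \(\rho_1\) produces \(\rho_2\), then \(\rho_2\) maps `x` to a positive value.

{{< latex >}}
\text{analysis claims}\ x \mapsto +
\quad \wedge \quad
\rho_1 , s \Rightarrow \rho_2
\quad \implies \quad
\rho_2(x) > 0
{{< /latex >}}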
A piece of the puzzle still remains: how are we going to use the monotone
functions we've been talking so much about? We need to figure out what to feed
to our analyses before we can prove their correctness.
I have an answer to that question: we will be using _control flow graphs_ (CFGs).
These are another program representation, one that's more commonly found in
compilers. I will show what they look like in the next post. I hope to see you
there!