Compare commits
No commits in common. "9fc2d16fb87c584439869521796b97a2ab05b14c" and "37dd9ad6d4fc413fcf663a3e25fa9d0be19a433d" have entirely different histories.
9fc2d16fb8...37dd9ad6d4
@@ -7,12 +7,10 @@ SelectHead @ select(cons([?t, ?v], ?rest), ?default, ?v) <- ?t;
 SelectTail @ select(cons([?t, ?v], ?rest), ?default, ?found) <- not(?t), select(?rest, ?default, ?found);
 SelectEmpty @ select(nil, ?default, ?default) <-;
 
-Eq @ eq(?x, ?x) <-;
-
 ParenthAssocLeft @ parenthassoc(?a_i, left, right) <-;
 ParenthAssocRight @ parenthassoc(?a_i, right, left) <-;
 ParenthAssocNone @ parenthassoc(?a_i, none, ?pos) <-;
-ParenthAssocNeq @ parenthassoc(?a_i, ?a_o, ?pos) <- not(eq(?a_i, ?a_o));
+ParenthAssocNeq @ parenthassoc(?a_i, ?a_o, ?pos) <- not(symeq(?a_i, ?a_o));
 
 Parenth @ parenth(?inner, ?outer, ?pos, ?strin, ?strout) <-
 prec(?inner, ?p_i, ?a_i), prec(?outer, ?p_o, ?a_o),
@@ -2,14 +2,14 @@
 title: "Implementing and Verifying \"Static Program Analysis\" in Agda, Part 4: The Fixed-Point Algorithm"
 series: "Static Program Analysis in Agda"
 description: "In this post, I give a top-level overview of my work on formally verified static analyses"
-date: 2024-11-03T17:50:26-08:00
+date: 2024-08-10T17:37:42-07:00
 tags: ["Agda", "Programming Languages"]
 draft: true
 ---
 
 In the preivous post we looked at lattices of finite height, which are a crucial
 ingredient to our static analyses. In this post, I will describe the specific
-algorithm that makes use of these lattices; this algorithm will be at the core
-of this series.
+algorithm that makes use of these lattices that will be at the core of it all.
 
 Lattice-based static analyses tend to operate by iteratively combining facts
 from the program into new ones. For instance, when analyzing `y = 1 + 2`, we
@@ -79,7 +79,7 @@ arrange the ones we've seen so far into a chain:
 
 Each time we fail to find a fixed point, we add one element to our chain, growing
 it. But if our lattice \(L\) has a finite height, that means eventually this
-process will have to stop; the chain can't grow forever. Eventually, we will
+process will have to stop; the chain can't grow forever; eventually, we will
 have to find a value such that \(v = f(v)\). Thus, our algorithm is guaranteed
 to terminate, and give a fixed point.
@@ -119,7 +119,7 @@ it provides just a bit more gas than the maximum-length chain, which means that
 if the gas is exhausted, we've certainly arrived at a contradiction. It also
 provides an initial chain onto which `doStep` will keep tacking on new inequalities
 as it finds them. Since we haven't found any yet, this is the single-element
-chain of \(\bot\). The last thing is does is set up the recursion invariant
+chain of \(\bot\). The last thing is does is set up the reursion invariant
 (that the sum of the gas and the chain length is constant), and provides
 a proof that \(\bot \le f(\bot)\). This function always returns a fixed point.
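The gas-bounded iteration described in this hunk can be sketched concretely. Below is an illustrative Python sketch, not the posts' actual Agda development: the names `fixpoint`, `leq`, and `transfer` are invented here, and the five-point sign lattice stands in for a generic finite-height lattice.

```python
# Kleene iteration with explicit "gas", over the five-point sign lattice:
# bot <= neg, zer, pos <= top.
ELEMENTS = ["bot", "neg", "zer", "pos", "top"]
LEQ = {("bot", s) for s in ELEMENTS} \
    | {(s, "top") for s in ELEMENTS} \
    | {(s, s) for s in ELEMENTS}

def leq(a, b):
    return (a, b) in LEQ

def fixpoint(f, bot="bot", gas=3):
    # gas exceeds the lattice height (2), so running out is impossible
    # for a monotone f started at bot -- mirroring the post's argument.
    x = bot
    for _ in range(gas):
        fx = f(x)
        assert leq(x, fx)   # the chain bot <= f(bot) <= ... keeps growing
        if fx == x:
            return x        # found v with f(v) = v
        x = fx
    raise RuntimeError("gas exhausted; cannot happen if gas > height")

# An illustrative monotone step function: refines bot straight to pos.
transfer = {"bot": "pos", "neg": "top", "zer": "top",
            "pos": "pos", "top": "top"}
print(fixpoint(transfer.__getitem__))  # pos
```

Each failed comparison `fx == x` grows the chain by one element, so the finite height bounds the number of iterations, exactly as the post argues.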
@@ -147,11 +147,11 @@ the lattice, we have \(\bot \le b\).
 
 {{< latex >}}
 \begin{array}{ccccccccr}
-& & \bot & \le & & & b & \quad \implies & \text{(monotonicity of}\ f \text{)}\\
-& & f(\bot) & \le & f(b) & = & b & \quad \implies & \text{(} b\ \text{is a fixed point, monotonicity of}\ f \text{)}\\
-& & f^2(\bot) & \le & f(b) & = & b & \quad \implies & \text{(} b\ \text{is a fixed point, monotonicity of}\ f \text{)}\\
+& & \bot & \le & & & b \quad \implies & \text{(monotonicity of}\ f \text{)}\\
+& & f(\bot) & \le & f(b) & = & b \quad \implies & \text{(} b\ \text{is a fixed point, monotonicity of}\ f \text{)}\\
+& & f^2(\bot) & \le & f(b) & = & b \quad \implies & \text{(} b\ \text{is a fixed point, monotonicity of}\ f \text{)}\\
 \\
-& & \vdots & & \vdots & & \vdots & & \\
+& & \vdots & & \vdots & & \vdots & \\
 \\
 a & = & f^k(\bot) & \le & f(b) & = & b &
 \end{array}
@@ -159,7 +159,7 @@ a & = & f^k(\bot) & \le & f(b) & = & b &
 
 Because of the monotonicity of \(f\), each time we apply it, it preserves the
 less-than relationship that started with \(\bot \le b\). Doing that \(k\) times,
-we verify that \(a\) is our least fixed point.
+we verify that \(a\) is our fixed point.
 
 To convince Agda of this proof, we once again get in an argument with the termination
 checker, which ends the same way it did last time: with us using the notion of 'gas'
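The derivation in this hunk (each application of \(f\) preserves \(\le\), so \(f^k(\bot)\) sits below every fixed point \(b\)) can be checked exhaustively on a small powerset lattice. This is an illustrative sketch with an invented monotone function, not code from the posts.

```python
from itertools import combinations

UNIVERSE = frozenset({1, 2, 3})

def subsets(s):
    items = list(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

# A monotone function on the powerset lattice of {1, 2, 3}:
# s <= t (subset) implies f(s) <= f(t).
def f(s):
    out = set(s) | {1}   # always add 1
    if 1 in s:
        out.add(2)       # the presence of 1 forces 2
    return frozenset(out)

# Least fixed point by iteration from bot = {} (the empty set).
x = frozenset()
while f(x) != x:
    x = f(x)

# Every fixed point b satisfies lfp <= b, mirroring the chain of
# implications in the derivation above.
fixed_points = [b for b in subsets(UNIVERSE) if f(b) == b]
assert all(x <= b for b in fixed_points)
print(sorted(x))  # [1, 2]
```

Here `<=` on `frozenset` is the subset order, so the final assertion is exactly the leastness claim \(f^k(\bot) \le b\) for every fixed point \(b\).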
@@ -2,8 +2,9 @@
 title: "Implementing and Verifying \"Static Program Analysis\" in Agda, Part 5: Our Programming Language"
 series: "Static Program Analysis in Agda"
 description: "In this post, I define the language that well serve as the object of our vartious analyses"
-date: 2024-11-03T17:50:27-08:00
+date: 2024-08-10T17:37:43-07:00
 tags: ["Agda", "Programming Languages"]
 draft: true
+custom_js: ["parser.js"]
 bergamot:
   render_presets:
@@ -116,31 +117,11 @@ is called an [Abstract Syntax Tree](https://en.wikipedia.org/wiki/Abstract_synta
 Notably, though `2-(x+y)` has parentheses, our grammar above does not include
 include them as a case. The reason for this is that the structure of an
 abstract syntax tree is sufficient to encode the order in which the operations
-should be evaluated. Since I lack a nice way of drawing ASTs, I will use
-an ASCII drawing to show an example.
+should be evaluated.
 
-```
-Expression: 2 - (x+y)
-
-  (-)
-  / \
- 2  (+)
-    / \
-   x   y
-
-Expression: (2-x) + y
-
-    (+)
-    / \
-  (-)  y
-  / \
- 2  x
-```
-
-Above, in the first AST, `(+)` is a child of the `(-)` node, which means
-that it's a sub-expression. As a result, that subexpression is evaluated first,
-before evaluating `(-)`, and so, the AST expresents `2-(x+y)`. In the other
-example, `(-)` is a child of `(+)`, and is therefore evaluated first. The resulting
-association encoded by that AST is `(2-x)+y`.
+{{< todo >}}
+Probably two drawings of differently-associated ASTs here.
+{{< /todo >}}
 
 To an Agda programmer, the one-of-four-things definition above should read
 quite similarly to the definition of an algebraic data type. Indeed, this
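The two ASTs this hunk discusses differ only in tree shape, and that shape alone fixes evaluation order. A small sketch with nested tuples (an invented representation for illustration, not the posts' Agda datatype) shows the two associations producing different results:

```python
# ASTs as nested tuples: ("op", left, right), or a leaf that is either an
# integer literal or a variable name. No parentheses are stored anywhere;
# the nesting encodes the evaluation order.

def evaluate(node, env):
    if isinstance(node, tuple):
        op, lhs, rhs = node
        l, r = evaluate(lhs, env), evaluate(rhs, env)
        return l + r if op == "+" else l - r
    return env[node] if isinstance(node, str) else node

env = {"x": 1, "y": 2}

# 2 - (x + y): the (+) node is a child of (-), so it is evaluated first.
left_tree = ("-", 2, ("+", "x", "y"))
# (2 - x) + y: the (-) node is a child of (+).
right_tree = ("+", ("-", 2, "x"), "y")

print(evaluate(left_tree, env))   # -1
print(evaluate(right_tree, env))  # 3
```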
@@ -268,7 +249,7 @@ thus be written as follows:
 {{< /latex >}}
 
 Now, on to the actual rules for how to evaluate expressions. Most simply,
-integer literals like `1` just evaluate to themselves.
+integer literals `1` just evaluate to themselves.
 
 {{< latex >}}
 \frac{n \in \text{Int}}{\rho, n \Downarrow n}
@@ -304,9 +285,6 @@ the exact rules that were used in computing a particular value. The variables
 `x`, `y`, and `z` are pre-defined for your convenience.
 
 {{< bergamot_widget id="expr-widget" query="" prompt="eval(extend(extend(extend(empty, x, 17), y, 42), z, 0), TERM, ?v)" modes="Expression:Expression" >}}
-hidden section "" {
-Eq @ eq(?x, ?x) <-;
-}
 section "" {
 EvalNum @ eval(?rho, lit(?n), ?n) <- int(?n);
 EvalVar @ eval(?rho, var(?x), ?v) <- inenv(?x, ?v, ?rho);
@@ -317,7 +295,7 @@ section "" {
 }
 hidden section "" {
 EnvTake @ inenv(?x, ?v, extend(?rho, ?x, ?v)) <-;
-EnvSkip @ inenv(?x, ?v_1, extend(?rho, ?y, ?v_2)) <- inenv(?x, ?v_1, ?rho), not(eq(?x, ?y));
+EnvSkip @ inenv(?x, ?v_1, extend(?rho, ?y, ?v_2)) <- inenv(?x, ?v_1, ?rho), not(symeq(?x, ?y));
 }
 {{< /bergamot_widget >}}
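The `EnvTake`/`EnvSkip` rules in this widget describe lookup in an environment built by `extend`, with the newest binding shadowing older ones. A Python transliteration (the function and constant names here are invented; only the rule structure comes from the widget):

```python
# Environments as nested pairs, mirroring extend(rho, x, v) in the rules.
# EnvTake matches x against the outermost binding; EnvSkip recurses past
# bindings for other names.

EMPTY = None

def extend(rho, name, value):
    return (rho, name, value)

def inenv(name, rho):
    if rho is EMPTY:
        raise KeyError(name)
    rest, bound_name, value = rho
    if bound_name == name:        # EnvTake
        return value
    return inenv(name, rest)      # EnvSkip: name differs, keep looking

rho = extend(extend(extend(EMPTY, "x", 17), "y", 42), "x", 99)
print(inenv("x", rho))  # 99 (the newer binding shadows x = 17)
print(inenv("y", rho))  # 42
```

The `not(eq(...))`/`not(symeq(...))` premise changed by this diff corresponds to the `bound_name == name` test failing before the recursive call.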
@@ -335,7 +313,7 @@ I showed above.
 #### Simple Statements
 The main difference between formalizing (simple and "normal") statements is
 that they modify the environment. If `x` has one value, writing `x = x + 1` will
-certainly change that value. On the other hand, statements don't produce values.
+certainly change that value. On the other hands, statements don't produce values.
 So, we will be writing claims like \(\rho_1 , \textit{bs} \Rightarrow \rho_2\)
 to say that the basic statement \(\textit{bs}\), when starting in environment
 \(\rho_1\), will produce environment \(\rho_2\). Here's an example:
@@ -371,9 +349,6 @@ a Bergamot instance, this time with simple statements. Try out `noop` or some
 sort of variable assignment, like `x = x + 1`.
 
 {{< bergamot_widget id="basic-stmt-widget" query="" prompt="stepbasic(extend(extend(extend(empty, x, 17), y, 42), z, 0), TERM, ?env)" modes="Basic Statement:Basic Statement" >}}
-hidden section "" {
-Eq @ eq(?x, ?x) <-;
-}
 hidden section "" {
 EvalNum @ eval(?rho, lit(?n), ?n) <- int(?n);
 EvalVar @ eval(?rho, var(?x), ?v) <- inenv(?x, ?v, ?rho);
@@ -388,7 +363,7 @@ section "" {
 }
 hidden section "" {
 EnvTake @ inenv(?x, ?v, extend(?rho, ?x, ?v)) <-;
-EnvSkip @ inenv(?x, ?v_1, extend(?rho, ?y, ?v_2)) <- inenv(?x, ?v_1, ?rho), not(eq(?x, ?y));
+EnvSkip @ inenv(?x, ?v_1, extend(?rho, ?y, ?v_2)) <- inenv(?x, ?v_1, ?rho), not(symeq(?x, ?y));
 }
 {{< /bergamot_widget >}}
@@ -437,52 +412,14 @@ to evaluate the _expression_ that serves as the condition. We then had an
 additional premise that requires the truthiness of the resulting value \(v\).
 The rule for evaluating a conditional with a "false" branch is very similar.
 
-{{< latex >}}
-\frac
-{ \rho_1 , e \Downarrow v \quad v = 0 \quad \rho_1, s_2 \Rightarrow \rho_2}
-{ \rho_1, \textbf{if}\ e\ \{ s_1 \}\ \textbf{else}\ \{ s_2 \}\ \Rightarrow \rho_2 }
-{{< /latex >}}
-
-Now that we have rules for conditional statements, it will be surprisingly easy
-to define the rules for `while` loops. A `while` loop will also have two rules,
-one for when its condition is truthy and one for when it's falsey. However,
-unlike the "false" case, a while loop will do nothing, leaving the environment
-unchanged:
-
-{{< latex >}}
-\frac
-{ \rho_1 , e \Downarrow v \quad v = 0 }
-{ \rho_1 , \textbf{while}\ e\ \{ s \}\ \Rightarrow \rho_1 }
-{{< /latex >}}
-
-The trickiest rule is for when the condition of a `while` loop is true.
-We evaluate the body once, starting in environment \(\rho_1\) and finishing
-in \(\rho_2\), but then we're not done. In fact, we have to go back to the top,
-and check the condition again, starting over. As a result, we include another
-premise, that tells us that evaluating the loop starting at \(\rho_2\), we
-eventually end in state \(\rho_3\). This encodes the "rest of the iterations"
-in addition to the one we just performed. The environment \(\rho_3\) is our
-final state, so that's what we use in the rule's conclusion.
-
-{{< latex >}}
-\frac
-{ \rho_1 , e \Downarrow v \quad v \neq 0 \quad \rho_1 , s \Rightarrow \rho_2 \quad \rho_2 , \textbf{while}\ e\ \{ s \}\ \Rightarrow \rho_3 }
-{ \rho_1 , \textbf{while}\ e\ \{ s \}\ \Rightarrow \rho_3 }
-{{< /latex >}}
-
 And that's it! We have now seen every rule that defines the little object language
 I've been using for my Agda work. Below is a Bergamot widget that implements
 these rules. Try the following program, which computes the `x`th power of two,
 and stores it in `y`:
 
 ```
 x = 5; y = 1; while (x) { y = y + y; x = x - 1 }
 ```
 
 {{< bergamot_widget id="stmt-widget" query="" prompt="step(extend(extend(extend(empty, x, 17), y, 42), z, 0), TERM, ?env)" modes="Statement:Statement" >}}
-hidden section "" {
-Eq @ eq(?x, ?x) <-;
-}
 hidden section "" {
 EvalNum @ eval(?rho, lit(?n), ?n) <- int(?n);
 EvalVar @ eval(?rho, var(?x), ?v) <- inenv(?x, ?v, ?rho);
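The `step` rules this hunk removes and re-adds (`StepIfTrue`, `StepWhileTrue`, and friends) can be transliterated into a small interpreter; `StepWhileTrue`'s extra premise becomes a recursive call on the same loop statement. This is an illustrative Python sketch with an invented tuple representation, run on the power-of-two program quoted in the hunk:

```python
# Statements and expressions as tuples; environments as dicts. Each call
# to step returns the updated environment, mirroring rho_1 , s => rho_2.

def eval_expr(rho, e):
    kind = e[0]
    if kind == "lit":
        return e[1]
    if kind == "var":
        return rho[e[1]]
    l, r = eval_expr(rho, e[1]), eval_expr(rho, e[2])
    return l + r if kind == "+" else l - r

def step(rho, s):
    kind = s[0]
    if kind == "assign":                    # basic statement
        return {**rho, s[1]: eval_expr(rho, s[2])}
    if kind == "seq":                       # StepSeq
        return step(step(rho, s[1]), s[2])
    if kind == "if":                        # StepIfTrue / StepIfFalse
        return step(rho, s[2] if eval_expr(rho, s[1]) != 0 else s[3])
    if kind == "while":
        if eval_expr(rho, s[1]) == 0:       # StepWhileFalse: rho unchanged
            return rho
        return step(step(rho, s[2]), s)     # StepWhileTrue: body, then loop
    raise ValueError(kind)

# x = 5; y = 1; while (x) { y = y + y; x = x - 1 }
prog = ("seq", ("assign", "x", ("lit", 5)),
       ("seq", ("assign", "y", ("lit", 1)),
        ("while", ("var", "x"),
         ("seq", ("assign", "y", ("+", ("var", "y"), ("var", "y"))),
                 ("assign", "x", ("-", ("var", "x"), ("lit", 1)))))))

print(step({}, prog))  # {'x': 0, 'y': 32}
```

Note how the `while` case calls `step` on the very statement it is executing, just as \(\rho_2 , \textbf{while}\ e\ \{ s \} \Rightarrow \rho_3\) appears as a premise of its own rule.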
@@ -503,47 +440,14 @@ hidden section "" {
 StepLiftBasic @ step(?rho_1, ?s, ?rho_2) <- stepbasic(?rho_1, ?s, ?rho_2);
 }
 section "" {
-StepIfTrue @ step(?rho_1, if(?e, ?s_1, ?s_2), ?rho_2) <- eval(?rho_1, ?e, ?v), not(eq(?v, 0)), step(?rho_1, ?s_1, ?rho_2);
-StepIfFalse @ step(?rho_1, if(?e, ?s_1, ?s_2), ?rho_2) <- eval(?rho_1, ?e, ?v), eq(?v, 0), step(?rho_1, ?s_2, ?rho_2);
-StepWhileTrue @ step(?rho_1, while(?e, ?s), ?rho_3) <- eval(?rho_1, ?e, ?v), not(eq(?v, 0)), step(?rho_1, ?s, ?rho_2), step(?rho_2, while(?e, ?s), ?rho_3);
-StepWhileFalse @ step(?rho_1, while(?e, ?s), ?rho_1) <- eval(?rho_1, ?e, ?v), eq(?v, 0);
+StepIfTrue @ step(?rho_1, if(?e, ?s_1, ?s_2), ?rho_2) <- eval(?rho_1, ?e, ?v), not(numeq(?v, 0)), step(?rho_1, ?s_1, ?rho_2);
+StepIfFalse @ step(?rho_1, if(?e, ?s_1, ?s_2), ?rho_2) <- eval(?rho_1, ?e, ?v), numeq(?v, 0), step(?rho_1, ?s_2, ?rho_2);
+StepWhileTrue @ step(?rho_1, while(?e, ?s), ?rho_3) <- eval(?rho_1, ?e, ?v), not(numeq(?v, 0)), step(?rho_1, ?s, ?rho_2), step(?rho_2, while(?e, ?s), ?rho_3);
+StepWhileFalse @ step(?rho_1, while(?e, ?s), ?rho_1) <- eval(?rho_1, ?e, ?v), numeq(?v, 0);
 StepSeq @ step(?rho_1, seq(?s_1, ?s_2), ?rho_3) <- step(?rho_1, ?s_1, ?rho_2), step(?rho_2, ?s_2, ?rho_3);
 }
 hidden section "" {
 EnvTake @ inenv(?x, ?v, extend(?rho, ?x, ?v)) <-;
-EnvSkip @ inenv(?x, ?v_1, extend(?rho, ?y, ?v_2)) <- inenv(?x, ?v_1, ?rho), not(eq(?x, ?y));
+EnvSkip @ inenv(?x, ?v_1, extend(?rho, ?y, ?v_2)) <- inenv(?x, ?v_1, ?rho), not(symeq(?x, ?y));
 }
 {{< /bergamot_widget >}}
-
-As with all the other rules we've seen, the mathematical notation above can
-be directly translated into Agda:
-
-{{< codelines "Agda" "agda-spa/Language/Semantics.agda" 47 64 >}}
-
-### Semantics as Ground Truth
-
-Prior to this post, we had been talking about using lattices and monotone
-functions for program analysis. The key problem with using this framework to
-define analyses is that there are many monotone functions that produce complete
-nonsese; their output is, at best, unrelated to the program they're supposed
-to analyze. We don't want to write such functions, since having incorrect
-information about the programs in question is unhelpful.
-
-What does it mean for a function to produce correct information, though?
-In the context of sign analysis, it would mean that if we say a variable `x` is `+`,
-then evaluating the program will leave us in a state in which `x` is posive.
-The semantics we defined in this post give us the "evaluating the program piece".
-They establish what the programs _actually_ do, and we can use this ground
-truth when checking that our analyses are correct. In subsequent posts, I will
-prove the exact property I informally stated above: __for the program analyses
-we define, things they "claim" about our program will match what actually happens
-when executing the program using our semantics__.
-
-A piece of the puzzle still remains: how are we going to use the monotone
-functions we've been talking so much about? We need to figure out what to feed
-to our analyses before we can prove their correctness.
-
-I have an answer to that question: we will be using _control flow graphs_ (CFGs).
-These are another program representation, one that's more commonly found in
-compilers. I will show what they look like in the next post. I hope to see you
-there!
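The "Semantics as Ground Truth" section in this hunk states the correctness property informally: what an analysis claims about a program must match what actually happens when the program runs. A toy version of that check for sign analysis is sketched below; everything here (`sign`, `ABSTRACT_PLUS`, the sample range) is invented for illustration and is not the posts' Agda proof.

```python
# A miniature soundness check for sign analysis: the abstract "plus" on
# signs must cover whatever concrete addition produces.

def sign(n):
    return "+" if n > 0 else "-" if n < 0 else "0"

# What the analysis claims the sign of a sum can be, per operand signs.
ABSTRACT_PLUS = {
    ("+", "+"): {"+"}, ("-", "-"): {"-"}, ("0", "0"): {"0"},
    ("+", "0"): {"+"}, ("0", "+"): {"+"},
    ("-", "0"): {"-"}, ("0", "-"): {"-"},
    # Mixed signs: the analysis cannot know, so it claims every sign.
    ("+", "-"): {"+", "-", "0"}, ("-", "+"): {"+", "-", "0"},
}

# Ground truth: run the concrete semantics (here, ordinary arithmetic)
# and check the concrete result's sign is among the claimed signs.
for a in range(-5, 6):
    for b in range(-5, 6):
        assert sign(a + b) in ABSTRACT_PLUS[(sign(a), sign(b))]
print("abstract plus is sound on the sample range")
```

This is exactly the shape of the property the section promises to prove: the concrete semantics plays referee, and the analysis's claims must never contradict it.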