Add a draft post on forward analysis
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
This commit is contained in:
parent 147658ee89
commit c1b27a13ae
@@ -106,4 +106,5 @@ Here are the posts that I’ve written so far for this series:
 * {{< draftlink "Our Programming Language" "05_spa_agda_semantics" >}}
 * {{< draftlink "Control Flow Graphs" "06_spa_agda_cfg" >}}
 * {{< draftlink "Connecting Semantics and Control Flow Graphs" "07_spa_agda_semantics_and_cfg" >}}
-* {{< draftlink "A Verified Forward Analysis" "08_spa_forward" >}}
+* {{< draftlink "Forward Analysis" "08_spa_forward" >}}
+* {{< draftlink "Verifying the Forward Analysis" "09_spa_verified_forward" >}}
@@ -79,6 +79,7 @@ a less specific output! The more you know going in, the more you should know
 coming out. Similarly, when given less specific / vaguer information, the
 analysis shouldn't produce a more specific answer -- how could it do that?
 This leads us to come up with the following rule:
+{#define-monotonicity}

 {{< latex >}}
 \textbf{if}\ \text{input}_1 \le \text{input}_2,
424 content/blog/08_spa_agda_forward/index.md Normal file
@@ -0,0 +1,424 @@
---
title: "Implementing and Verifying \"Static Program Analysis\" in Agda, Part 8: Forward Analysis"
series: "Static Program Analysis in Agda"
description: "In this post, I use the monotone lattice framework and verified CFGs to define a sign analysis"
date: 2024-12-01T15:09:07-08:00
tags: ["Agda", "Programming Languages"]
draft: true
---

In the previous post, I showed that the Control Flow Graphs we built for our
programs match how they are really executed. This means that we can rely
on these graphs to compute program information. In this post, we finally
get to compute that information. Let's jump right into it!

### Choosing a Lattice
For much of this series, we have been [talking about lattices]({{< relref "01_spa_agda_lattices" >}}),
particularly [lattices of finite height]({{< relref "03_spa_agda_fixed_height" >}}).
These structures represent things we know about the program, and provide operators
like \((\sqcup)\) and \((\sqcap)\) that help us combine such knowledge.
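
To make this concrete, here is a small standalone sketch of a sign lattice and its
join. Everything in this block is made up for illustration; the real code instead
builds its sign lattice from the `AboveBelow` construction mentioned at the end of
this post, but the join should behave analogously.

```Agda
-- A toy five-element sign lattice: ⊥ˢ is "impossible", ⊤ˢ is "any sign".
data Sign : Set where
  ⊥ˢ zer pos neg ⊤ˢ : Sign

-- Least upper bound: ⊥ˢ is the identity, equal signs stay put, and
-- conflicting information collapses to ⊤ˢ ("could be anything").
_⊔ˢ_ : Sign → Sign → Sign
⊥ˢ ⊔ˢ s = s
s ⊔ˢ ⊥ˢ = s
zer ⊔ˢ zer = zer
pos ⊔ˢ pos = pos
neg ⊔ˢ neg = neg
_ ⊔ˢ _ = ⊤ˢ
```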

The forward analysis code I present here will work with any finite-height
lattice, with the additional constraint that equivalence of lattice elements
is decidable, which comes from [the implementation of the fixed-point algorithm]({{< relref "04_spa_agda_fixedpoint" >}}),
in which we routinely check if a function's output is the same as its input.

{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 4 8 >}}

The finite-height lattice `L` is intended to describe the state of a single
variable.
One example of a lattice that can be used as `L` is our
sign lattice. We've been using the sign lattice in our examples [from the very beginning]({{< relref "01_spa_agda_lattices#lattices" >}}),
and we will stick with it for the purposes of this explanation. However, this
lattice alone does not describe our program, since it only talks about a single
sign; programs have lots of variables, all of which can have different signs!
So, we might go one step further and define a map lattice from variables to
their signs:

{{< latex >}}
\text{Variable} \to \text{Sign}
{{< /latex >}}

We [have seen]({{< relref "02_spa_agda_combining_lattices#the-map-lattice" >}})
that we can turn any lattice \(L\) into a map lattice \(A \to L\), for any
type of keys \(A\). In this case, we will define \(A \triangleq \text{Variable}\),
and \(L \triangleq \text{Sign}\). The
[sign lattice has a finite height]({{< relref "02_spa_agda_combining_lattices#the-map-lattice" >}}),
and I've proven that, as long as we pick a finite set of keys, [map lattices
\(A \to L\) have a finite height if \(L\) has a finite height]({{< relref "03_spa_agda_fixed_height#fixed-height-of-the-map-lattice" >}}).
Since a program's text is finite, \(\text{Variable}\) is a finite set, and
we have ourselves a finite-height lattice \(\text{Variable} \to \text{Sign}\).

We're on the right track, but even the lattice we have so far is not sufficient.
That's because variables have different signs at different points in the program!
You might initialize a variable with `x = 1`, making it positive, and then
go on to compute some arbitrary function using loops and conditionals. For
each variable, we need to keep track of its sign at various points in the code.
When we [defined Control Flow Graphs]({{< relref "06_spa_agda_cfg" >}}), we
split our programs into sequences of statements that are guaranteed to execute
together --- basic blocks. For our analysis, we'll keep per-variable information for
each basic block in the program. Since basic blocks are nodes in the Control Flow
Graph of our program, our whole lattice will be as follows:

{{< latex >}}
\text{Info} \triangleq \text{NodeId} \to (\text{Variable} \to \text{Sign})
{{< /latex >}}

We follow the same logic we just did for the variable-sign lattice; since
\(\text{Variable} \to \text{Sign}\) is a lattice of finite height, and since
\(\text{NodeId}\) is a finite set, the whole \(\text{Info}\) map will be
a lattice with a finite height.

Notice that both of the sets \(\text{Variable}\) and \(\text{NodeId}\) depend
on the program in question. The lattice we use is slightly different for
each input program! We can use Agda's parameterized modules to automatically
parameterize all our functions over programs:

{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 36 37 >}}
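
If parameterized modules are new to you, here is a tiny self-contained illustration
of the mechanism, with invented names unrelated to the analysis:

```Agda
open import Data.Nat using (ℕ; _+_)

-- Every definition inside `Shift` implicitly takes `offset` as an argument.
module Shift (offset : ℕ) where
  bump : ℕ → ℕ
  bump n = n + offset

-- Opening the module with a concrete argument fixes the parameter.
open Shift 3 renaming (bump to bump₃)

six : ℕ
six = bump₃ 3
```

In our case, the module parameter is the program being analyzed, so definitions
like the \(\text{Info}\) lattice are automatically specialized to each input
program.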

Now, let's turn the informal descriptions above into code, by instantiating
our map lattice modules. First, I invoked the module for the smaller variable-sign
lattice. This ended up being quite long, so that I could rename variables I
brought into scope. I will collapse the relevant code block; suffice to say
that I used the suffix `v` (e.g., renaming `_⊔_` to `_⊔ᵛ_`) for properties
and operators to do with variable-sign maps (in Agda: `VariableValuesFiniteMap`).

{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 41 82 "" "**(Click here to expand the module uses for variable-sign maps)**" >}}

I then used this lattice as an argument to the map module again, to
construct the top-level \(\text{Info}\) lattice (in Agda: `StateVariablesFiniteMap`).
This also required a fair bit of code, most of it to do with renaming.

{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 85 112 "" "**(Click here to expand the module uses for the top-level lattice)**" >}}

### Constructing a Monotone Function

We now have a lattice in hand; the next step is to define a function over
this lattice. For us to be able to use the fixed-point algorithm on this
function, it will need to be [monotonic]({{< relref "01_spa_agda_lattices#define-monotonicity" >}}).

Our goal with static analysis is to compute information about our program; that's
what we want the function to do. When the lattice we're using is the sign lattice,
we're trying to determine the signs of each of the variables in various parts
of the program. How do we go about this?

Each piece of code in the program might change a variable's sign. For instance,
if `x` has sign \(0\), and we run the statement `x = x - 1`, the sign of
`x` will be \(-\). If we have an expression `y + z`, we can use the signs of
`y` and `z` to compute the sign of the whole thing. This is a form
of [abstract interpretation](https://en.wikipedia.org/wiki/Abstract_interpretation),
in which we almost-run the program, but forget some details (e.g., the
exact values of `x`, `y`, and `z`, leaving only their signs). The exact details
of how this partial evaluation is done are analysis-specific; in general, we
simply require an analysis to provide an evaluator. We will define
[an evaluator for the sign lattice below](#instantiating-with-the-sign-lattice).
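
For the sign lattice, such an evaluator should produce facts like the following,
writing the abstract operators with hats as the textbook's tables do: a zero value
minus a positive one is negative, and a positive plus a negative could be anything.

{{< latex >}}
0\ \hat{-}\ + = -, \qquad +\ \hat{+}\ - = \top
{{< /latex >}}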

{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 166 167 >}}

From this, we know how each statement and basic block will change variables
in the program. But we have described this process as "if a variable has
sign X, it becomes sign Y" -- how do we know what sign a variable has _before_
the code runs? Fortunately, the Control Flow Graph tells us exactly
what code could be executed before any given basic block. Recall that edges
in the graph describe all possible jumps that could occur; thus, for any
node, the incoming edges describe all possible blocks that can precede it.
This is why we spent all that time [defining the `predecessors` function]({{< relref "06_spa_agda_cfg#additional-functions" >}}).

We proceed as follows: for any given node, find its predecessors. By accessing
our \(\text{Info}\) map for each predecessor, we can determine our current
best guess of variable signs at that point, in the form of a \(\text{Variable} \to \text{Sign}\)
map (more generally, a \(\text{Variable} \to L\) map in an arbitrary analysis).
We know that any of these predecessors could've been the previous point of
execution; if a variable `x` has sign \(+\) in one predecessor and \(-\)
in another, it can be either one or the other when we start executing the
current block. Early on, we saw that [the \((\sqcup)\) operator models disjunction
("A or B")]({{< relref "01_spa_agda_lattices#lub-glub-or-and" >}}). So, we apply
\((\sqcup)\) to the variable-sign maps of all predecessors. The
[reference _Static Program Analysis_ text](https://cs.au.dk/~amoeller/spa/)
calls this operation \(\text{JOIN}\):

{{< latex >}}
\textit{JOIN}(v) = \bigsqcup_{w \in \textit{pred}(v)} \llbracket w \rrbracket
{{< /latex >}}

The Agda implementation uses a `foldr`:

{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 139 140 >}}
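
Stripped of the surrounding machinery, the pattern is just a fold of the join over
a list of states. Here is a standalone sketch, with the join operator and the bottom
element taken as module parameters; the real code folds over the predecessors'
entries of the \(\text{Info}\) map.

```Agda
open import Data.List using (List; foldr)

-- Fold a join operator over a list of abstract states, starting from ⊥.
module JoinSketch {A : Set} (_⊔_ : A → A → A) (⊥ : A) where
  joinAll : List A → A
  joinAll = foldr _⊔_ ⊥
```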

Computing the "combined incoming states" for any node is a monotonic function.
This follows from the monotonicity of \((\sqcup)\) --- in both arguments ---
and the definition of `foldr`.

{{< codelines "agda" "agda-spa/Lattice.agda" 143 151 "" "**(Click here to expand the general proof)**" >}}

From this, we can formally state that \(\text{JOIN}\) is monotonic. Note that
the input and output lattices are different: the input lattice is the lattice
of variable states at each block, and the output lattice is a single variable-sign
map, representing the combined preceding state at a given node.

{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 145 149 >}}

Above, the `m₁≼m₂⇒m₁[ks]≼m₂[ks]` lemma states that for two maps with the same
keys, where one map is less than another, all the values for any subset of keys
`ks` are pairwise less than each other (i.e., `m₁[k]≼m₂[k]`, and `m₁[l]≼m₂[l]`, etc.).
This follows from the definition of "less than" for maps.
{#less-than-lemma}

So those are the two pieces: first, join all the preceding states, then use
the abstract interpretation function. I opted to do both of these in bulk:

1. Take an initial \(\text{Info}\) map, and update every basic block's entry
   to be the join of its predecessors.
2. In the new joined map, each key now contains the variable state at
   the beginning of the block; so, apply the abstract interpretation function
   via `eval` to each key, computing the state at the end of the block.

I chose to do these in bulk because this way, after each application of
the function, we have updated each block with exactly one round of information.
The alternative --- which is specified in the reference text --- is to update
one key at a time. The difference there is that updates to later keys might be
"tainted" by updates to keys that came before them. This is probably fine
(and perhaps more efficient, in that it "moves faster"), but it's harder to
reason about.

#### Generalized Update

To implement bulk assignment, I needed to solve the source text's
Exercise 4.26:

> __Exercise 4.26__: Recall that \(f[a \leftarrow x]\) denotes the function that is identical to
> \(f\) except that it maps \(a\) to \(x\). Assume \(f : L_1 \to (A \to L_2)\)
> and \(g : L_1 \to L_2\) are monotone functions where \(L_1\) and \(L_2\) are
> lattices and \(A\) is a set, and let \(a \in A\). (Note that the codomain of
> \(f\) is a map lattice.)
>
> Show that the function \(h : L_1 \to (A \to L_2)\)
> defined by \(h(x) = f(x)[a \leftarrow g(x)]\) is monotone.

In fact, I generalized this statement to update several keys at once, as follows:

{{< latex >}}
h(x) = f(x)[a_1 \leftarrow g(a_1, x),\ ...,\ a_n \leftarrow g(a_n, x)]
{{< /latex >}}

I called this operation "generalized update".

At first, the exercise may not obviously correspond to the bulk operation
I've described. Particularly confusing is the fact that it has two lattices,
\(L_1\) and \(L_2\). In fact, the exercise results in a very general theorem;
we can exploit a more concrete version of the theorem by setting
\(L_1 \triangleq A \to L_2\), resulting in an overall signature for \(f\) and \(h\):

{{< latex >}}
f : (A \to L_2) \to (A \to L_2)
{{< /latex >}}

In other words, if we give the entire operation in Exercise 4.26 a type,
it would look like this:

{{< latex >}}
\text{ex}_{4.26} : \underbrace{K}_{\text{value of}\ a} \to \underbrace{(\text{Map} \to V)}_{\text{updater}} \to \underbrace{\text{Map} \to \text{Map}}_{f} \to \underbrace{\text{Map} \to \text{Map}}_{h}
{{< /latex >}}

That's still more general than we need. This version allows us to modify
any map-to-map function by updating a certain key in that function's output. If we
_just_ want to update keys (as we do for the purposes of static analysis),
we can recover a simpler version by setting \(f \triangleq id\), which
results in an updater \(h(x) = x[a \leftarrow g(x)]\), and a signature for
the exercise:

{{< latex >}}
\text{ex}_{4.26} : \underbrace{K}_{\text{value of}\ a} \to \underbrace{(\text{Map} \to V)}_{\text{updater}\ g} \to \underbrace{\text{Map}}_{\text{old map}} \to \underbrace{\text{Map}}_{\text{updated map}}
{{< /latex >}}

This looks just like Haskell's [`Data.Map.adjust` function](https://hackage.haskell.org/package/containers-0.4.0.0/docs/src/Data-Map.html#adjust), except that it
can take the entire map into consideration when updating a key.

My generalized version takes in a list of keys to update, and makes the updater
accept a key so that its behavior can be specialized for each entry it changes.
The sketch of the implementation is in the `_updating_via_` function from
the `Map` module, and its helper `transform`. Here, I collapse its definition,
since it's not particularly important.

{{< codelines "agda" "agda-spa/Lattice/Map.agda" 926 931 "" "**(Click here to see the definition of `transform`)**" >}}
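
As a rough standalone model of the same idea, using a plain association list and a
Boolean equality test instead of the verified `Map` machinery, the generalized
update looks something like this (all names here are invented for illustration):

```Agda
open import Data.Bool using (Bool; if_then_else_)
open import Data.List using (List; map; any)
open import Data.Product using (_×_; _,_)

module UpdateSketch {K V : Set} (eqK : K → K → Bool) where
  AssocMap : Set
  AssocMap = List (K × V)

  -- h(m) = m [ k ← g k m ] for every k in ks: update the listed keys,
  -- letting the updater g inspect the entire old map.
  updateKeys : List K → (K → AssocMap → V) → AssocMap → AssocMap
  updateKeys ks g m =
    map (λ { (k , v) → if any (eqK k) ks then (k , g k m) else (k , v) }) m
```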

The proof of monotonicity --- which is the solution to the exercise --- is
actually quite complicated. I will omit its description, and show it here
in another collapsed block.

{{< codelines "agda" "agda-spa/Lattice/Map.agda" 1042 1105 "" "**(Click here to see the proof of monotonicity of \(h\))**" >}}

Given a proof of the exercise, all that's left is to instantiate the theorem
with the arguments I described. Specifically:

* \(L_1 \triangleq \text{Info} \triangleq \text{NodeId} \to (\text{Variable} \to \text{Sign})\)
* \(L_2 \triangleq \text{Variable} \to \text{Sign} \)
* \(A \triangleq \text{NodeId}\)
* \(f \triangleq \text{id} \triangleq x \mapsto x\)
* \(g(k, m) = \text{JOIN}(k, m)\)

In the equation for \(g\), I explicitly insert the map \(m\) instead of leaving
it implicit as the textbook does. In Agda, this instantiation for joining
all predecessors looks like this (using `states` as the list of keys to update,
indicating that we should update _every_ key):

{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 152 157 >}}

And the one for evaluating all blocks looks like this:

{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 215 220 >}}

Actually, we haven't yet seen `updateVariablesFromStmt`. This is
a function that we can define using the user-provided abstract interpretation
`eval`. Specifically, it handles the job of updating the sign of a variable
once it has been assigned to (or doing nothing if the statement is a no-op).

{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 191 193 >}}

The `updateVariablesFromExpression` function is new, and it is yet another map update,
which changes the sign of a variable `k` to be the one we get from running
`eval` on it. Map updates are instances of the generalized update; this
time, the updater \(g\) is `eval`. The exercise requires the updater to be
monotonic, which constrains the user-provided evaluation function to be
monotonic too.

{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 173 181 >}}

We finally write the `analyze` function as the composition of the two bulk updates:

{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 226 232 >}}
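
Composition is also where monotonicity comes for free: composing monotone functions
yields a monotone function. Here is a standalone sketch of that general fact,
stated over abstract orderings rather than the lattice records used in the real code:

```Agda
module CompMono
  {A B C : Set}
  (_≤ᴬ_ : A → A → Set) (_≤ᴮ_ : B → B → Set) (_≤ᶜ_ : C → C → Set)
  (f : A → B) (g : B → C)
  (f-mono : ∀ {x y} → x ≤ᴬ y → f x ≤ᴮ f y)
  (g-mono : ∀ {x y} → x ≤ᴮ y → g x ≤ᶜ g y)
  where

  -- Monotone functions compose to a monotone function.
  g∘f-mono : ∀ {x y} → x ≤ᴬ y → g (f x) ≤ᶜ g (f y)
  g∘f-mono x≤y = g-mono (f-mono x≤y)
```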

### Instantiating with the Sign Lattice
Thus far, I've been talking about the sign lattice throughout, but implementing
the Agda code in terms of a general lattice `L` and evaluation function `eval`.
In order to actually run the Agda code, we do need to provide an `eval` function,
which implements the logic we used above, in which a zero-sign variable \(x\)
minus one was determined to be negative. For binary operators specifically,
I've used the tables provided in the textbook; here they are:

{{< figure src="plusminus.png" caption="Cayley tables for abstract interpretation of plus and minus" >}}

These are pretty much common sense:
* A positive plus a positive is still positive, so \(+\ \hat{+}\ + = +\)
* A positive plus any sign could be any sign still, so \(+\ \hat{+}\ \top = \top\)
* Any sign plus "impossible" is impossible, so \(\top\ \hat{+}\ \bot = \bot\).
* etc.

The Agda encoding for the plus function is as follows, and the one for minus
is similar.

{{< codelines "agda" "agda-spa/Analysis/Sign.agda" 76 94 >}}

As the comment in the block says, it would be incredibly tedious to verify
the monotonicity of these tables, since you would have to consider roughly
125 cases _per argument_: for each (fixed) sign \(s\) and two other signs
\(s_1 \le s_2\), we'd need to show that \(s\ \hat{+}\ s_1 \le s\ \hat{+}\ s_2\).
I therefore commit the _faux pas_ of using `postulate`. Fortunately, the proof
of monotonicity is not used for the execution of the program, so we will
get away with this, barring any meddling kids.

From this, all that's left is to show that for any expression `e`, the
evaluation function:

{{< latex >}}
\text{eval} : \text{Expr} \to (\text{Variable} \to \text{Sign}) \to \text{Sign}
{{< /latex >}}

is monotonic. It's defined straightforwardly and very much like an evaluator /
interpreter, suggesting that "abstract interpretation" is the correct term here.

{{< codelines "agda" "agda-spa/Analysis/Sign.agda" 176 184 >}}

Though it won't happen in practice, it was easier to just handle the case where there's
an undefined variable; I give it "any sign". Otherwise, the function simply
consults the sign tables for `+` or `-`, as well as the known signs of the
variables. For natural number literals, it assigns `0` the "zero" sign, and
any other natural number the "\(+\)" sign.
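
The overall shape is roughly the following standalone sketch. The expression type,
sign type, and environment here are simplified stand-ins for the real definitions;
in particular, the environment is a total function, so the "undefined variable"
case doesn't arise.

```Agda
open import Data.Nat using (ℕ; zero; suc)
open import Data.String using (String)

data Sign : Set where
  ⊥ˢ zer pos neg ⊤ˢ : Sign

data Expr : Set where
  num : ℕ → Expr
  var : String → Expr
  add : Expr → Expr → Expr

-- Abstract addition in the style of the tables above (minus is analogous).
plusˢ : Sign → Sign → Sign
plusˢ ⊥ˢ _ = ⊥ˢ
plusˢ _ ⊥ˢ = ⊥ˢ
plusˢ ⊤ˢ _ = ⊤ˢ
plusˢ _ ⊤ˢ = ⊤ˢ
plusˢ zer s = s
plusˢ s zer = s
plusˢ pos pos = pos
plusˢ neg neg = neg
plusˢ _ _ = ⊤ˢ

-- Evaluate an expression to a sign, given the (total) signs of variables.
evalˢ : Expr → (String → Sign) → Sign
evalˢ (num zero)    _   = zer
evalˢ (num (suc _)) _   = pos
evalˢ (var x)       env = env x
evalˢ (add e₁ e₂)   env = plusˢ (evalˢ e₁ env) (evalˢ e₂ env)
```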

To prove monotonicity, we need to consider two variable maps (one less than
the other), and show that the abstract interpretation respects that ordering.
This boils down to the fact that the `plus` and `minus` tables are monotonic
in both arguments (thus, if their sub-expressions are evaluated monotonically
given an environment, then so is the whole addition or subtraction), and
to the fact that for two maps `m₁ ≼ m₂`, the values at corresponding keys
are similarly ordered: `m₁[k] ≼ m₂[k]`. We [saw that above](#less-than-lemma).

{{< codelines "agda" "agda-spa/Analysis/Sign.agda" 186 223 "" "**(Click to expand the proof that the evaluation function for signs is monotonic)**" >}}

That's all we need. With this, I just instantiate the `Forward` module we have
been working with, and make use of the `result`. I also used a `show`
function (which I defined) to stringify that output.

{{< codelines "agda" "agda-spa/Analysis/Sign.agda" 225 229 >}}

But wait, `result`? We haven't seen a result just yet. That's the last piece,
and it involves finally making use of the fixed-point algorithm.

### Invoking the Fixed Point Algorithm
Our \(\text{Info}\) lattice is of finite height, and the function we have defined
is monotonic (by virtue of being constructed only from map updates, which
are monotonic by Exercise 4.26, and from function composition, which preserves
monotonicity). We can therefore apply the fixed-point algorithm, and compute
the least fixed point:

{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 235 238 >}}
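
For intuition only, the iteration can be sketched with an explicit fuel parameter,
re-applying the function until its output stops changing. The real fixed-point
algorithm needs no fuel: monotonicity plus the finite height of the lattice bound
the number of strictly increasing steps.

```Agda
open import Data.Bool using (Bool; if_then_else_)
open import Data.Nat using (ℕ; zero; suc)

module FixSketch {A : Set} (_==_ : A → A → Bool) (f : A → A) where
  -- Apply f until a fixed point is reached or the fuel runs out.
  iterate : ℕ → A → A
  iterate zero    x = x
  iterate (suc n) x = if (f x == x) then x else iterate n (f x)
```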

With this, `analyze` is the result of our forward analysis!

In a `Main.agda` file, I invoked this analysis on a sample program:

```Agda
testCode : Stmt
testCode =
  ⟨ "zero" ← (# 0) ⟩ then
  ⟨ "pos" ← ((` "zero") Expr.+ (# 1)) ⟩ then
  ⟨ "neg" ← ((` "zero") Expr.- (# 1)) ⟩ then
  ⟨ "unknown" ← ((` "pos") Expr.+ (` "neg")) ⟩

testProgram : Program
testProgram = record
  { rootStmt = testCode
  }

open WithProg testProgram using (output; analyze-correct)

main = run {0ℓ} (putStrLn output)
```

The result is verbose, since it shows variable signs for each statement
in the program. However, the key is the last basic block, which shows
the variables at the end of the program. It reads:

```
{"neg" ↦ -, "pos" ↦ +, "unknown" ↦ ⊤, "zero" ↦ 0, }
```

### Verifying the Analysis
We now have a general framework for running forward analyses: you provide
an abstract interpretation function for expressions, as well as a proof
that this function is monotonic, and you get an Agda function that takes
a program and tells you the variable states at every point. If your abstract
interpretation function is for determining the signs of expressions, the
final result is an analysis that determines all possible signs for all variables,
anywhere in the code. It's pretty easy to instantiate this framework with
another type of forward analysis --- in fact, by switching the
`plus` function to one that uses `AboveBelow ℤ`, rather than `AboveBelow Sign`:

```Agda
plus : ConstLattice → ConstLattice → ConstLattice
plus ⊥ᶜ _ = ⊥ᶜ
plus _ ⊥ᶜ = ⊥ᶜ
plus ⊤ᶜ _ = ⊤ᶜ
plus _ ⊤ᶜ = ⊤ᶜ
plus [ z₁ ]ᶜ [ z₂ ]ᶜ = [ z₁ Int.+ z₂ ]ᶜ
```

we can define a constant-propagation analysis.

However, we haven't proved our analysis correct, and we haven't yet made use of
the CFG-semantics equivalence that we
[proved in the previous post]({{< relref "07_spa_agda_semantics_and_cfg" >}}).
I was hoping to get to it in this post, but there was just too much to
cover. So, I will get to that in the next post, where we will make use
of the remaining machinery to demonstrate that the output of our analyzer
matches reality.
BIN content/blog/08_spa_agda_forward/plusminus.png Normal file
Binary file not shown. Size: 42 KiB