505 lines
25 KiB
Markdown
505 lines
25 KiB
Markdown
---
|
|
title: "Implementing and Verifying \"Static Program Analysis\" in Agda, Part 1: Lattices"
|
|
series: "Static Program Analysis in Agda"
|
|
description: "In this post, I introduce an algebraic structure called a lattice, which underpins certain program analyses"
|
|
date: 2024-07-06T17:37:43-07:00
|
|
tags: ["Agda", "Programming Languages"]
|
|
---
|
|
|
|
This is the first post in a series on
|
|
[static program analysis in Agda]({{< relref "static-program-analysis-in-agda" >}}).
|
|
See the [introduction]({{< relref "00_spa_agda_intro" >}}) for a little bit
|
|
more context.
|
|
|
|
The goal of this post is to motivate the algebraic structure called a
|
|
[lattice](https://en.wikipedia.org/wiki/Lattice_(order)). Lattices have
|
|
{{< sidenote "right" "crdt-note" "broad applications" >}}
|
|
See, for instance, Lars Hupel's excellent
|
|
<a href="https://lars.hupel.info/topics/crdt/01-intro/">introduction to CRDTs</a>
|
|
which uses lattices for Conflict-Free Replicated Data Types. CRDTs can be
|
|
used to implement peer-to-peer distributed systems.
|
|
{{< /sidenote >}} beyond static program analysis, so the work in this post is
|
|
interesting in its own right. However, for the purposes of this series, I'm
|
|
most interested in lattices as an encoding of program information when performing
|
|
analysis. To start motivating lattices in that context, I'll need to start
|
|
with _monotone frameworks_.
|
|
|
|
### Monotone Frameworks
|
|
|
|
The key notion for monotone frameworks is the "specificity" of information.
|
|
Take, for instance, an analyzer that tries to figure out if a variable is
|
|
positive, negative, or equal to zero (this is called a _sign analysis_, and
|
|
we'll be using this example a lot). Of course, the variable could be "none
|
|
of the above" -- perhaps if it was initialized from user input, which would
|
|
allow both positive and negative numbers. Such an analyzer might return
|
|
`+`, `-`, `0`, or `unknown` for any given variable. These outputs are not
|
|
created equal: if a variable has sign `+`, we know more about it than if
|
|
the sign is `unknown`: we've ruled out negative numbers as possible values!
|
|
|
|
Specificity is important to us because we want our analyses to be as precise
|
|
as possible. It would be valid for a program analysis to just return
|
|
`unknown` for everything, but it wouldn't be very useful. Thus, we want to
|
|
rank possible outputs, and try pick the most specific one. The
|
|
{{< sidenote "right" "convention-note" "convention" -12 >}}
|
|
I say convention, because it doesn't actually matter if we represent more
|
|
specific values as "larger" or "smaller". Given a lattice with a particular
|
|
order written as <code><</code>, we can flip the sign in all relations
|
|
(turning <code>a < b</code> into <code>a > b</code>), and get back another
|
|
lattice. This lattice will have the same properties (more precisely,
|
|
the properties will be
|
|
<a href="https://en.wikipedia.org/wiki/Duality_(mathematics)">dual</a>). So
|
|
we shouldn't fret about picking a direction for "what's less than what".
|
|
{{< /sidenote >}}
|
|
seems to be to make
|
|
{{< sidenote "right" "order-note" "more specific things \"smaller\"" 1 >}}
|
|
Admittedly, it's a little bit odd to say that something which is "more" than
|
|
something else is actually smaller. The intuition that I favor is that
|
|
something that's more specific describes fewer objects: there are less
|
|
white horses than horses, so "white horse" is more specific than "horse".
|
|
The direction of <code><</code> can be thought of as comparing the number
|
|
of objects.<br>
|
|
<br>
|
|
Note that this is only an intuition; there are equally many positive and
|
|
negative numbers, but we will <em>not</em> group them together
|
|
in our order.
|
|
{{< /sidenote >}},
|
|
and less specific things "larger". Coming back to our previous example, we'd
|
|
write `+ < unknown`, since `+` is more specific. Of course, the exact
|
|
things we're trying to rank depend on the sort of analysis we're trying to
|
|
perform. Since I introduced sign analysis, we're ranking signs like `+` and `-`.
|
|
For other analyses, the elements will be different. The _comparison_, however,
|
|
will be a permanent fixture.
|
|
{#specificity}
|
|
|
|
Suppose now that we have some program analysis, and we're feeding it some input
|
|
information. Perhaps we're giving it the signs of variables `x` and `y`, and
|
|
hoping for it to give us the sign of a third variable `z`. It would be very
|
|
unfortunate if, when given more specific information, the analysis would return
|
|
a less specific output! The more you know going in, the more you should know
|
|
coming out. Similarly, when given less specific / vaguer information, the
|
|
analysis shouldn't produce a more specific answer -- how could it do that?
|
|
This leads us to come up with the following rule:
|
|
{#define-monotonicity}
|
|
|
|
{{< latex >}}
|
|
\textbf{if}\ \text{input}_1 \le \text{input}_2,
|
|
\textbf{then}\ \text{analyze}(\text{input}_1) \le \text{analyze}(\text{input}_2)
|
|
{{< /latex >}}
|
|
|
|
In mathematics, such a property is called _monotonicity_. We say that
|
|
"analyze" is a [monotonic function](https://en.wikipedia.org/wiki/Monotonic_function).
|
|
This property gives its name to monotone frameworks. For our purposes, this
|
|
property means that being more specific "pays off": better information in
|
|
means better information out. In Agda, we can encode monotonicity as follows:
|
|
|
|
{{< codelines "Agda" "agda-spa/Lattice.agda" 17 21 >}}
|
|
|
|
Note that above, I defined `Monotonic` on an arbitrary function, whose
|
|
outputs might be of a different type than its inputs. This will come in handy
|
|
later.
|
|
|
|
The order `<` of our elements and the monotonicity of our analysis are useful
|
|
to us for another reason: they help gauge and limit, in a roundabout way, how much
|
|
work might be left for our analysis to do. This matters because we don't want
|
|
to allow analyses that can take forever to finish -- that's a little too long
|
|
for a pragmatic tool used by people.
|
|
|
|
The key observation -- which I will describe in detail in a later post --
|
|
is that a monotonic analysis, in a way, "climbs upwards" through an
|
|
order. As we continue using this analysis to refine information over and over,
|
|
its results get
|
|
{{< sidenote "right" "less-specific-note" "less and less specific." >}}
|
|
It is not a bad thing for our results to get less specific over time, because
|
|
our initial information is probably incomplete. If you've only seen German
|
|
shepherds in your life, that might be your picture of what a dog is like.
|
|
If you then come across a chihuahua, your initial definition of "dog" would
|
|
certainly not accommodate it. To allow for both German shepherds and chihuahuas,
|
|
you'd have to loosen the definition of "dog". This new definition would be less
|
|
specific, but it would be more accurate.
|
|
{{< /sidenote >}}
|
|
If we add an additional ingredient, and say that the order has a _fixed height_,
|
|
we can deduce that the analysis will eventually stop producing additional
|
|
information: either it will keep "climbing", and reach the top (thus having
|
|
to stop), or it will stop on its own before reaching the top. This is
|
|
the essence of the fixed-point algorithm, which in Agda-like pseudocode can
|
|
be stated as follows:
|
|
|
|
```Agda
|
|
module _ (IsFiniteHeight A ≺)
|
|
(f : A → A)
|
|
(Monotonicᶠ : Monotonic _≼_ _≼_ f) where
|
|
-- There exists a point...
|
|
aᶠ : A
|
|
|
|
-- Such that applying the monotonic function doesn't change the result.
|
|
aᶠ≈faᶠ : aᶠ ≈ f aᶠ
|
|
```
|
|
|
|
Moreover, the value we'll get out of the fixed point algorithm will be
|
|
the _least fixed point_. For us, this means that the result will be
|
|
"the most specific result possible".
|
|
|
|
{{< codelines "Agda" "agda-spa/Fixedpoint.agda" 86 86 >}}
|
|
|
|
The above explanation omits a lot of details, but it's a start. To get more
|
|
precise, we must drill down into several aspects of what I've said so far.
|
|
The first of them is, __how can we compare program information using an order?__
|
|
|
|
### Lattices
|
|
|
|
Let's start with a question: when it comes to our specificity-based order,
|
|
is `-` less than, greater than, or equal to `+`? Surely it's not less specific;
|
|
knowing that a number is negative doesn't give you less information than
|
|
knowing if that number is positive. Similarly, it's not any more specific, for
|
|
the same reason. You could consider it equally specific, but that doesn't
|
|
seem quite right either; the information is different, so comparing specificity
|
|
feels apples-to-oranges. On the other hand, both `+` and `-` are clearly
|
|
more specific than `unknown`.
|
|
|
|
The solution to this conundrum is to simply refuse to compare certain elements:
|
|
`+` is neither less than, greater than, nor equal to `-`, but `+ < unknown` and
|
|
`- < unknown`. Such an ordering is called a [partial order](https://en.wikipedia.org/wiki/Partially_ordered_set).
|
|
|
|
Next, another question. Suppose that the user writes code like this:
|
|
|
|
```
|
|
if someCondition {
|
|
x = exprA;
|
|
} else {
|
|
x = exprB;
|
|
}
|
|
y = x;
|
|
```
|
|
|
|
If `exprA` has sign `s1`, and `exprB` has sign `s2`, what's the sign of `y`?
|
|
It's not necessarily `s1` nor `s2`, since they might not match: `s1` could be `+`,
|
|
and `s2` could be `-`, and using either `+` or `-` for `y` would be incorrect.
|
|
We're looking for something that can encompass _both_ `s1` and `s2`.
|
|
Necessarily, it would be either equally specific or less specific than
|
|
either `s1` or `s2`: there isn't any new information coming in about `x`,
|
|
and since we don't know which branch is taken, we stand to lose a little
|
|
bit of info. However, our goal is always to maximize specificity, since
|
|
more specific signs give us more information about our program.
|
|
|
|
This gives us the following constraints. Since the combined sign `s` has to
|
|
be equally or less specific than either `s1` and `s2`, we have `s1 <= s` and
|
|
`s2 <= s`. However, we want to pick `s` such that it's more specific than
|
|
any other "combined sign" candidate. Thus, if there's another sign `t`,
|
|
with `s1 <= t` and `s2 <= t`, then it must be less specific than `s`: `s <= t`.
|
|
|
|
At first, the above constraints might seem quite complicated. We can interpret
|
|
them in more familiar territory by looking at numbers instead of signs.
|
|
If we have two numbers `n1` and `n2`, what number is the smallest number
|
|
that's bigger than either `n1` or `n2`? Why, the maximum of the two, of course!
|
|
|
|
There is a reason why I used the constraints above instead of just saying
|
|
"maximum". For numbers, `max(a,b)` is either `a` or `b`. However, we saw earlier
|
|
that neither `+` nor `-` works as the sign for `y` in our program. Moreover,
|
|
we agreed above that our order is _partial_: how can we pick "the bigger of two
|
|
elements" if neither is bigger than the other? `max` itself doesn't quite work,
|
|
but what we're looking for is something similar. Instead, we simply require a
|
|
similar function for our signs. We call this function "[least upper bound](https://en.wikipedia.org/wiki/Least-upper-bound_property)",
|
|
since it is the "least (most specific)
|
|
element that's greater (less specific) than either `s1` or `s2`". Conventionally,
|
|
this function is written as \(a \sqcup b\) (or in our case, \(s_1 \sqcup s_2\)).
|
|
The \((\sqcup)\) symbol is also called the _join_ of \(a\) and \(b\).
|
|
We can define it for our signs so far using the following [Cayley table](https://en.wikipedia.org/wiki/Cayley_table).
|
|
{#least-upper-bound}
|
|
|
|
{{< latex >}}
|
|
\begin{array}{c|cccc}
|
|
\sqcup & - & 0 & + & ? \\
|
|
\hline
|
|
- & - & ? & ? & ? \\
|
|
0 & ? & 0 & ? & ? \\
|
|
+ & ? & ? & + & ? \\
|
|
? & ? & ? & ? & ? \\
|
|
\end{array}
|
|
{{< /latex >}}
|
|
|
|
By using the above table, we can see that \((+\ \sqcup\ -)\ =\ ?\) (aka `unknown`).
|
|
This is correct; given the four signs we're working with, that's the most we can say.
|
|
Let's explore the analogy to the `max` function a little bit more, by observing
|
|
that this function has certain properties:
|
|
|
|
* `max(a, a) = a`. The maximum of one number is just that number.
|
|
Mathematically, this property is called _idempotence_. Note that
|
|
by inspecting the diagonal of the above table, we can confirm that our
|
|
\((\sqcup)\) function is idempotent.
|
|
* `max(a, b) = max(b, a)`. If you're taking the maximum of two numbers,
|
|
it doesn't matter which one you consider first. This property is called
|
|
_commutativity_. Note that if you mirror the table along the diagonal,
|
|
it doesn't change; this shows that our \((\sqcup)\) function is
|
|
commutative.
|
|
* `max(a, max(b, c)) = max(max(a, b), c)`. When you have three numbers,
|
|
and you're determining the maximum value, it doesn't matter which pair of
|
|
numbers you compare first. This property is called _associativity_. You
|
|
can use the table above to verify the \((\sqcup)\) is associative, too.
|
|
|
|
A set that has a binary operation (like `max` or \((\sqcup)\)) that
|
|
satisfies the above properties is called a [semilattice](https://en.wikipedia.org/wiki/Semilattice). In Agda, we can write this definition roughly as follows:
|
|
|
|
```Agda
|
|
record IsSemilattice {a} (A : Set a) (_⊔_ : A → A → A) : Set a where
|
|
field
|
|
⊔-assoc : (x y z : A) → ((x ⊔ y) ⊔ z) ≡ (x ⊔ (y ⊔ z))
|
|
⊔-comm : (x y : A) → (x ⊔ y) ≡ (y ⊔ x)
|
|
⊔-idemp : (x : A) → (x ⊔ x) ≡ x
|
|
```
|
|
|
|
Note that this is an example of the ["Is Something" pattern]({{< relref "agda_is_pattern" >}}).
|
|
It turns out to be convenient, however, to not require definitional equality
|
|
(`≡`). For instance, we might model sets as lists. Definitional equality
|
|
would force us to consider lists with the same elements but a different
|
|
order to be unequal. Instead, we parameterize our definition of `IsSemilattice`
|
|
by a binary relation `_≈_`, which we ask to be an [equivalence relation](https://en.wikipedia.org/wiki/Equivalence_relation).
|
|
{#definitional-equality}
|
|
|
|
{{< codelines "Agda" "agda-spa/Lattice.agda" 23 39 >}}
|
|
|
|
Notice that the above code also provides -- but doesn't require -- `_≼_` and
|
|
`_≺_`. That's because a least-upper-bound operation encodes an order:
|
|
intuitively, if `max(a, b) = b`, then `b` must be larger than `a`.
|
|
Lars Hupel's CRDT series includes [an explanation](https://lars.hupel.info/topics/crdt/03-lattices/#there-) of how the ordering operator
|
|
and the "least upper bound" function can be constructed from one another.
|
|
|
|
As it turns out, the `min` function has very similar properties to `max`:
|
|
it's idempotent, commutative, and associative. For a partial order like
|
|
ours, the analog to `min` is "greatest lower bound", or "the largest value
|
|
that's smaller than both inputs". Such a function is denoted as \(a\sqcap b\),
|
|
and often called the "meet" of \(a\) and \(b\).
|
|
As for what it means, where \(s_1 \sqcup s_2\) means "combine two signs where
|
|
you don't know which one will be used" (like in an `if`/`else`),
|
|
\(s_1 \sqcap s_2\) means "combine two signs where you know
|
|
{{< sidenote "right" "or-join-note" "both of them to be true" -7 >}}
|
|
If you're familiar with <a href="https://en.wikipedia.org/wiki/Boolean_algebra">
|
|
Boolean algebra</a>, this might look a little bit familiar to you. In fact,
|
|
the symbol for "and" on booleans is \(\land\). Similarly, the symbol
|
|
for "or" is \(\lor\). So, \(s_1 \sqcup s_2\) means "the sign is \(s_1\) or \(s_2\)",
|
|
or "(the sign is \(s_1\)) \(\lor\) (the sign is \(s_2\))". Similarly,
|
|
\(s_1 \sqcap s_2\) means "(the sign is \(s_1\)) \(\land\) (the sign is \(s_2\))".
|
|
Don't these symbols look similar?<br>
|
|
<br>
|
|
In fact, booleans with \((\lor)\) and \((\land)\) satisfy the semilattice
|
|
laws we've been discussing, and together form a lattice (to which I'm building
|
|
to in the main body of the text). The same is true for the set union and
|
|
intersection operations, \((\cup)\) and \((\cap)\).
|
|
{{< /sidenote >}}". For example, \((+\ \sqcap\ ?)\ =\ +\), because a variable
|
|
that's both "any sign" and "positive" must be positive.
|
|
{#lub-glub-or-and}
|
|
|
|
There's just one hiccup: what's the greatest lower bound of `+` and `-`?
|
|
it needs to be a value that's less than both of them, but so far, we don't have
|
|
such a value. Intuitively, this value should be called something like `impossible`,
|
|
because a number that's both positive and negative doesn't exist. So, let's
|
|
extend our analyzer to have a new `impossible` value. In fact, it turns
|
|
out that this "impossible" value is the least element of our set (we added
|
|
it to be the lower bound of `+` and co., which in turn are less than `unknown`).
|
|
Similarly, `unknown` is the largest element of our set, since it's greater
|
|
than `+` and co, and transitively greater than `impossible`. In mathematics,
|
|
it's not uncommon to define the least element as \(\bot\) (read "bottom"), and the
|
|
greatest element as \(\top\) (read "top"). With that in mind, the
|
|
following are the updated Cayley tables for our operations.
|
|
|
|
{{< latex >}}
|
|
\begin{array}{c|ccccc}
|
|
\sqcup & - & 0 & + & \top & \bot \\
|
|
\hline
|
|
- & - & \top & \top & \top & - \\
|
|
0 & \top & 0 & \top & \top & 0 \\
|
|
+ & \top & \top & + & \top & + \\
|
|
\top & \top & \top & \top & \top & \top \\
|
|
\bot & - & 0 & + & \top & \bot \\
|
|
\end{array}
|
|
|
|
\qquad
|
|
|
|
\begin{array}{c|ccccc}
|
|
\sqcap & - & 0 & + & \top & \bot \\
|
|
\hline
|
|
- & - & \bot & \bot & - & \bot \\
|
|
0 & \bot & 0 & \bot & 0 & \bot \\
|
|
+ & \bot & \bot & + & + & \bot \\
|
|
\top & - & 0 & + & \top & \bot \\
|
|
\bot & \bot & \bot & \bot & \bot & \bot \\
|
|
\end{array}
|
|
{{< /latex >}}
|
|
{#sign-lattice}
|
|
|
|
|
|
So, it turns out that our set of possible signs is a semilattice in two
|
|
ways. And if "semi" means "half", does two "semi"s make a whole? Indeed it does!
|
|
|
|
A lattice is made up of two semilattices. The operations of these two lattices,
|
|
however, must satisfy some additional properties. Let's examine the properties
|
|
in the context of `min` and `max` as we have before. They are usually called
|
|
the _absorption laws_:
|
|
{#absorption-laws}
|
|
|
|
* `max(a, min(a, b)) = a`. `a` is either less than or bigger than `b`;
|
|
so if you try to find the maximum __and__ the minimum of `a` and
|
|
`b`, one of the operations will return `a`.
|
|
* `min(a, max(a, b)) = a`. The reason for this one is the same as
|
|
the reason above.
|
|
|
|
In Agda, we can therefore write a lattice as follows:
|
|
|
|
{{< codelines "Agda" "agda-spa/Lattice.agda" 183 193 >}}
|
|
|
|
### Concrete Examples
|
|
#### Natural Numbers
|
|
|
|
Since we've been talking about `min` and `max` as motivators for properties
|
|
of \((\sqcap)\) and \((\sqcup)\), it might not be all that surprising
|
|
that natural numbers form a lattice with `min` and `max` as the two binary
|
|
operations. In fact, the Agda standard library writes `min` as `_⊓_` and
|
|
`max` as `_⊔_`! We can make use of the already-proven properties of these
|
|
operators to easily define `IsLattice` for natural numbers. Notice that
|
|
since we're not doing anything clever, like considering lists up to reordering,
|
|
there's no reason not to use definitional equality `≡` for our equivalence relation.
|
|
|
|
{{< codelines "Agda" "agda-spa/Lattice/Nat.agda" 1 45 >}}
|
|
|
|
The definition for the lattice instance itself is pretty similar; I'll omit
|
|
it here to avoid taking up a lot of vertical space, but you can find it
|
|
on lines 47 through 83 of [my `Lattice.Nat` module]({{< codeurl "agda-spa/Lattice/Nat.agda" >}}).
|
|
|
|
#### The "Above-Below" Lattice
|
|
It's not too hard to implement our sign lattice in Agda. However, we can
|
|
do it in a somewhat general way. As it turns out, extending an existing set,
|
|
such as \(\{+, -, 0\}\), with a "bottom" and "top" element (to be used
|
|
when taking the least upper bound and greatest lower bound) is quite common
|
|
and useful. For instance, if we were to do constant propagation (simplifying
|
|
`7+4` to `11`), we would probably do something similar, using the set
|
|
of integers \(\mathbb{Z}\) instead of the plus-zero-minus set.
|
|
|
|
The general definition is as follows. Take some original set \(S\) (like our 3-element
|
|
set of signs), and extend it with new "top" and "bottom" elements (\(\top\) and
|
|
\(\bot\)). Then, define \((\sqcup)\) as follows:
|
|
|
|
{{< latex >}}
|
|
x_1 \sqcup x_2 =
|
|
\begin{cases}
|
|
\top & x_1 = \top\ \text{or}\ x_2 = \top \\
|
|
\top & x_1, x_2 \in S, x_1 \neq x_2 \\
|
|
x_1 = x_2 & x_1, x_2 \in S, x_1 = x_2 \\
|
|
x_1 & x_2 = \bot \\
|
|
x_2 & x_1 = \bot
|
|
\end{cases}
|
|
{{< /latex >}}
|
|
|
|
In other words, \(\top\) overrules anything that it's combined with. In
|
|
math terms, it's the __absorbing element__ of the lattice. On the other hand,
|
|
\(\bot\) gets overruled by anything it's combined with. In math terms, that's
|
|
an __identity element__. Finally, when combining two elements that _aren't_
|
|
\(\top\) or \(\bot\) (which would otherwise be covered by the prior sentences),
|
|
combining an element with itself leaves it unchanged (upholding idempotence),
|
|
while combining two unequal element results in \(\top\). That last part
|
|
matches the way we defined "least upper bound" earlier.
|
|
|
|
The intuition is as follows: the \((\sqcup)\) operator is like an "or". Then,
|
|
"anything or positive" means "anything"; same with "anything or negative", etc.
|
|
On the other hand, "impossible or positive" means positive, since one of those
|
|
cases will never happen. Finally, in the absense of additional elements, the
|
|
most we can say about "positive or negative" is "any sign"; of course,
|
|
"positive or positive" is the same as "positive".
|
|
|
|
|
|
The "greatest lower bound" operator is defined by effectively swapping top
|
|
and bottom.
|
|
|
|
{{< latex >}}
|
|
x_1 \sqcup x_2 =
|
|
\begin{cases}
|
|
\bot & x_1 = \bot\ \text{or}\ x_2 = \bot \\
|
|
\bot & x_1, x_2 \in S, x_1 \neq x_2 \\
|
|
x_1 = x_2 & x_1, x_2 \in S, x_1 = x_2 \\
|
|
x_1 & x_2 = \top \\
|
|
x_2 & x_1 = \top
|
|
\end{cases}
|
|
{{< /latex >}}
|
|
|
|
For this operator, \(\bot\) is the absorbing element, and \(\top\) is the
|
|
identity element. The intuition here is not too different: if
|
|
\((\sqcap)\) is like an "and", then "impossible and positive" can't happen;
|
|
same with "impossible and negative", and so on. On the other hand,
|
|
"anything and positive" clearly means positive. Finally, "negative and positive"
|
|
can't happen (again, there is no number that's both positive and negative),
|
|
and "positive and positive" is just "positive".
|
|
|
|
What properties of the underlying set did we use to get this to work? The
|
|
only thing we needed is to be able to check and see if two elements are
|
|
equal or not; this is called _decidable equality_. Since that's the only
|
|
thing we used, this means that we can define an "above/below" lattice like this
|
|
for any type for which we can check if two elements are equal. In Agda, I encoded
|
|
this using a [parameterized module](https://agda.readthedocs.io/en/latest/language/module-system.html#parameterised-modules):
|
|
|
|
{{< codelines "Agda" "agda-spa/Lattice/AboveBelow.agda" 5 8 >}}
|
|
|
|
From there, I defined the actual data type as follows:
|
|
|
|
{{< codelines "Agda" "agda-spa/Lattice/AboveBelow.agda" 23 26 >}}
|
|
|
|
From there, I defined the \((\sqcup)\) and \((\sqcap)\) operations almost
|
|
exactly to the mathematical equation above (the cases were re-ordered to
|
|
improve Agda's reduction behavior). Here's the former:
|
|
|
|
{{< codelines "Agda" "agda-spa/Lattice/AboveBelow.agda" 86 93 >}}
|
|
|
|
And here's the latter:
|
|
|
|
{{< codelines "Agda" "agda-spa/Lattice/AboveBelow.agda" 181 188 >}}
|
|
|
|
The proofs of the lattice properties are straightforward and proceed
|
|
by simple case analysis. Unfortunately, Agda doesn't quite seem to
|
|
evaluate the binary operator in every context that I would expect it to,
|
|
which has led me to define some helper lemmas such as the following:
|
|
|
|
{{< codelines "Agda" "agda-spa/Lattice/AboveBelow.agda" 95 96 >}}
|
|
|
|
As a sample, here's a proof of commutativity of \((\sqcup)\):
|
|
|
|
{{< codelines "Agda" "agda-spa/Lattice/AboveBelow.agda" 158 165 >}}
|
|
|
|
The details of the rest of the proofs can be found in the
|
|
[`AboveBelow.agda` file]({{< codeurl "agda-spa/Lattice/AboveBelow.agda" >}}).
|
|
|
|
To recover the sign lattice we've been talking about all along, it's sufficient
|
|
to define a sign data type:
|
|
|
|
{{< codelines "Agda" "agda-spa/Analysis/Sign.agda" 19 22 >}}
|
|
|
|
Then, prove decidable equality on it (effecitly defining a comparison function),
|
|
and instantiate the `AboveBelow` module:
|
|
|
|
{{< codelines "Agda" "agda-spa/Analysis/Sign.agda" 34 47 >}}
|
|
|
|
### From Simple Lattices to Complex Ones
|
|
|
|
Natural numbers and signs alone are cool enough, but they will not be sufficient
|
|
to write program analyzers. That's because when we're writing an analyzer,
|
|
we don't just care about one variable: we care about all of them! An
|
|
initial guess might be to say that when analyzing a program, we really need
|
|
_several_ signs: one for each variable. This might be reminiscent of a
|
|
[map](https://en.wikipedia.org/wiki/Associative_array). So, when we compare
|
|
specificity, we'll really be comparing the specificity of maps. Even that,
|
|
though, is not enough. The reason is that variables might have different
|
|
signs at different points in the program! A single map would not be able to
|
|
capture that sort of nuance, so what we really need is a map associating
|
|
states with another map, which in turn associates variables with their signs.
|
|
|
|
Mathematically, we might write this as:
|
|
|
|
{{< latex >}}
|
|
\text{Info} \triangleq \text{ProgramStates} \to (\text{Variables} \to \text{Sign})
|
|
{{< /latex >}}
|
|
|
|
That's a big step up in complexity. We now have a doubly-nested map structure
|
|
instead of just a sign. and we need to compare such maps in order to gaugage
|
|
their specificity and advance our analyses. But where do we even start with
|
|
maps, and how do we define the \((\sqcup)\) and \((\sqcap)\) operations?
|
|
|
|
The solution turns out to be to define ways in which simpler lattices
|
|
(like our sign) can be combined and transformed to define more complex lattices.
|
|
We'll move on to that in the next post of this series.
|