Flesh out the Lattices post some more

Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
This commit is contained in:
Danila Fedorin 2024-05-14 21:43:12 -07:00
parent c036041339
commit 60d3b3025a

View File

@ -335,7 +335,8 @@ In Agda, we can therefore write a lattice as follows:
{{< codelines "Agda" "agda-spa/Lattice.agda" 153 163 >}} {{< codelines "Agda" "agda-spa/Lattice.agda" 153 163 >}}
### Concrete Example: Natural Numbers ### Concrete Examples
#### Natural Numbers
Since we've been talking about `min` and `max` as motivators for properties Since we've been talking about `min` and `max` as motivators for properties
of \((\sqcap)\) and \((\sqcup)\), it might not be all that surprising of \((\sqcap)\) and \((\sqcup)\), it might not be all that surprising
@ -351,3 +352,141 @@ there's no reason not to use definitional equality `≡` for our equivalence rel
The definition for the lattice instance itself is pretty similar; I'll omit The definition for the lattice instance itself is pretty similar; I'll omit
it here to avoid taking up a lot of vertical space, but you can find it it here to avoid taking up a lot of vertical space, but you can find it
on lines 47 through 83 of [my `Lattice.Nat` module]({{< codeurl "agda-spa/Lattice/Nat.agda" >}}). on lines 47 through 83 of [my `Lattice.Nat` module]({{< codeurl "agda-spa/Lattice/Nat.agda" >}}).
#### The "Above-Below" Lattice
It's not too hard to implement our sign lattice in Agda. However, we can
do it in a somewhat general way. As it turns out, extending an existing set,
such as \(\{+, -, 0\}\), with a "bottom" and "top" element (to be used
when taking the least upper bound and greatest lower bound) is quite common
and useful. For instance, if we were to do constant propagation (simplifying
`7+4` to `11`), we would probably do something similar, using the set
of integers \(\mathbb{Z}\) instead of the plus-zero-minus set.
The general definition is as follows. Take some original set \(S\) (like our 3-element
set of signs), and extend it with new "top" and "bottom" elements (\(\top\) and
\(\bot\)). Then, define \((\sqcup)\) as follows:
{{< latex >}}
x_1 \sqcup x_2 =
\begin{cases}
\top & x_1 = \top\ \text{or}\ x_2 = \top \\
\top & x_1, x_2 \in S, x_1 \neq x_2 \\
x_1 = x_2 & x_1, x_2 \in S, x_1 = x_2 \\
x_1 & x_2 = \bot \\
x_2 & x_1 = \bot
\end{cases}
{{< /latex >}}
In other words, \(\top\) overrules anything that it's combined with. In
math terms, it's the __absorbing element__ of the lattice. On the other hand,
\(\bot\) gets overruled by anything it's combined with. In math terms, that's
an __identity element__. Finally, when combining two elements that _aren't_
\(\top\) or \(\bot\) (which would otherwise be covered by the prior sentences),
combining an element with itself leaves it unchanged (upholding idempotence),
while combining two unequal element results in \(\top\). That last part
matches the way we defined "least upper bound" earlier.
The intuition is as follows: the \((\sqcup)\) operator is like an "or". Then,
"anything or positive" means "anything"; same with "anything or negative", etc.
On the other hand, "impossible or positive" means positive, since one of those
cases will never happen. Finally, in the absense of additional elements, the
most we can say about "positive or negative" is "any sign"; of course,
"positive or positive" is the same as "positive".
The "greatest lower bound" operator is defined by effectively swapping top
and bottom.
{{< latex >}}
x_1 \sqcup x_2 =
\begin{cases}
\bot & x_1 = \bot\ \text{or}\ x_2 = \bot \\
\bot & x_1, x_2 \in S, x_1 \neq x_2 \\
x_1 = x_2 & x_1, x_2 \in S, x_1 = x_2 \\
x_1 & x_2 = \top \\
x_2 & x_1 = \top
\end{cases}
{{< /latex >}}
For this operator, \(\bot\) is the absorbing element, and \(\top\) is the
identity element. The intuition here is not too different: if
\((\sqcap)\) is like an "and", then "impossible and positive" can't happen;
same with "impossible and negative", and so on. On the other hand,
"anything and positive" clearly means positive. Finally, "negative and positive"
can't happen (again, there is no number that's both positive and negative),
and "positive and positive" is just "positive".
What properties of the underlying set did we use to get this to work? The
only thing we needed is to be able to check and see if two elements are
equal or not; this is called _decidable equality_. Since that's the only
thing we used, this means that we can define an "above/below" lattice like this
for any type for which we can check if two elements are equal. In Agda, I encoded
this using a parameterized module:
{{< codelines "Agda" "agda-spa/Lattice/AboveBelow.agda" 5 8 >}}
From there, I defined the actual data type as follows:
{{< codelines "Agda" "agda-spa/Lattice/AboveBelow.agda" 23 26 >}}
From there, I defined the \((\sqcup)\) and \((\sqcap)\) operations almost
exactly to the mathematical equation above (the cases were re-ordered to
improve Agda's reduction behavior). Here's the former:
{{< codelines "Agda" "agda-spa/Lattice/AboveBelow.agda" 86 93 >}}
And here's the latter:
{{< codelines "Agda" "agda-spa/Lattice/AboveBelow.agda" 181 188 >}}
The proofs of the lattice properties are straightforward and proceed
by simple case analysis. Unfortunately, Agda doesn't quite seem to
evaluate the binary operator in every context that I would expect it to,
which has led me to define some helper lemmas such as the following:
{{< codelines "Agda" "agda-spa/Lattice/AboveBelow.agda" 95 96 >}}
As a sample, here's a proof of commutativity of \((\sqcup)\):
{{< codelines "Agda" "agda-spa/Lattice/AboveBelow.agda" 158 165 >}}
The details of the rest of the proofs can be found in the
[`AboveBelow.agda` file]({{< codeurl "agda-spa/Lattice/AboveBelow.agda" >}}).
To recover the sign lattice we've been talking about all along, it's sufficient
to define a sign data type:
{{< codelines "Agda" "agda-spa/Analysis/Sign.agda" 19 22 >}}
Then, prove decidable equality on it (effecitly defining a comparison function),
and instantiate the `AboveBelow` module:
{{< codelines "Agda" "agda-spa/Analysis/Sign.agda" 34 47 >}}
### Making Lattices out of Lattices
Natural numbers and signs alone are cool enough, but they will not be sufficient
to write program analyzers. That's because when we're writing an analyzer,
we don't just care about one variable: we care about all of them! An
initial guess might be to say that when analyzing a program, we really need
_several_ signs: one for each variable. This might be reminiscent of a
[map](https://en.wikipedia.org/wiki/Associative_array). So, when we compare
specificity, we'll really be comparing the specificity of maps. Even that,
though, is not enough. The reason is that variables might have different
signs at different points in the program! A single map would not be able to
capture that sort of nuance, so what we really need is a map associating
states with another map, which in turn associates variables with their signs.
Mathematically, we might write this as:
{{< latex >}}
\text{Info} \triangleq \text{ProgramStates} \to (\text{Variables} \to \text{Sign})
{{< /latex >}}
That's a big step up in complexity. We now have a doubly-nested map structure
instead of just a sign. and we need to compare such maps in order to gaugage
their specificity and advance our analyses. But where do we even start with
maps, and how do we define the \((\sqcup)\) and \((\sqcap)\) operations?
The solution turns out to be to define ways in which simpler lattices
(like our sign) can be combined and transformed to define more complex lattices.