Compare commits
No commits in common. "6179c869196032d20f85e73e789f6bc12ae36091" and "c6e2ecb996fe928ab38f868b0083b40931072e5d" have entirely different histories.
6179c86919
...
c6e2ecb996
109
content/blog/00_spa_agda.md
Normal file
109
content/blog/00_spa_agda.md
Normal file
|
@ -0,0 +1,109 @@
|
||||||
|
---
|
||||||
|
title: "Implementing and Verifying \"Static Program Analysis\" in Agda, Part 1: Lattices"
|
||||||
|
series: "Static Program Analysis in Agda"
|
||||||
|
date: 2024-03-11T14:23:03-07:00
|
||||||
|
draft: true
|
||||||
|
---
|
||||||
|
|
||||||
|
Some years ago, when the Programming Languages research group at Oregon State
|
||||||
|
University was discussing what to read, the [_Static Program Analysis_](https://cs.au.dk/~amoeller/spa/)
|
||||||
|
lecture notes came up. The group didn't end up reading the lecture notes,
|
||||||
|
but I did. As I was going through them, I noticed that they were quite rigorous:
|
||||||
|
the first several chapters cover a little bit of [lattice theory](https://en.wikipedia.org/wiki/Lattice_(order)),
|
||||||
|
and the subsequent analyses -- and the descriptions thereof -- are quite precise.
|
||||||
|
When I went to implement the algorithms in the textbook, I realized that just
|
||||||
|
writing them down would not be enough. After all, the textbook also proves
|
||||||
|
several properties of the lattice-based analyses, which would be lost in
|
||||||
|
translation if I were to just write C++ or Haskell.
|
||||||
|
|
||||||
|
At the same time, I noticed that lots of recent papers in programming language
|
||||||
|
theory were formalizing their results in
|
||||||
|
[Agda](https://agda.readthedocs.io/en/latest/getting-started/what-is-agda.html).
|
||||||
|
Having [played]({{< relref "meaningfully-typechecking-a-language-in-idris" >}})
|
||||||
|
[with]({{< relref "advent-of-code-in-coq" >}}) [dependent]({{< relref "coq_dawn_eval" >}})
|
||||||
|
[types]({{< relref "coq_palindrome" >}}) before, I was excited to try it out.
|
||||||
|
Thus began my journey to formalize the (first few chapters of) _Static Program Analysis_
|
||||||
|
in Agda.
|
||||||
|
|
||||||
|
In this post (and subsequent follow-ups), I hope to describe what I have so far.
|
||||||
|
Although not everything works, formalizing fixed-point algorithms (I'll get back
|
||||||
|
to this) and getting even the most straightforward analysis working was no
|
||||||
|
joke, and I think is worthy of discussion. So, here goes.
|
||||||
|
|
||||||
|
### Monotone Frameworks
|
||||||
|
|
||||||
|
I'll start out as abstractly and vaguely as possible. In general, the sorts of
|
||||||
|
analyses I'll be formalizing are based on _monotone frameworks_.
|
||||||
|
The idea with monotone frameworks is to rank information about program state
|
||||||
|
using some kind of _order_. Intuitively, given two pieces of "information",
|
||||||
|
one is less than another if it's more specific. Thus, "`x` has a positive sign"
|
||||||
|
is less than "`x` has any sign", since the former is more specific than the latter.
|
||||||
|
The sort of information that you are comparing depends on the analysis. In all
|
||||||
|
cases, the analysis itself is implemented as a function that takes the "information
|
||||||
|
so far", and updates it based on the program, producing "updated information so far".
|
||||||
|
Not all such functions are acceptable; it's possible to write an "updater function"
|
||||||
|
that keeps slightly adjusting its answer. Such a function could keep running
|
||||||
|
forever, which is a little too long for a program analyzer. We need something
|
||||||
|
to ensure the analysis ends.
|
||||||
|
|
||||||
|
There are two secret ingredients to ensure that an analysis terminates.
|
||||||
|
The first is a property called _monotonicity_; a function is monotonic if
|
||||||
|
it preserves the order between its inputs. That is, if you have two pieces of
|
||||||
|
information `x1` and `x2`, with `x1 <= x2`, then `f(x1) <= f(x2)`. The second
|
||||||
|
property is that our "information" has a _finite height_. Roughly, this means
|
||||||
|
that if you tried to arrange pieces of information in a line, from least to
|
||||||
|
greatest, your line could only get so long. Combined, this leads to the
|
||||||
|
following property (I'm being reductive here while I give an overview):
|
||||||
|
|
||||||
|
_With a monotoninc function and a finite-height order, if you start at the
|
||||||
|
bottom, each invocation of the function moves you up some line. Since the
|
||||||
|
line can only be so long, you're guaranteed to reach the end eventually._
|
||||||
|
|
||||||
|
The above three-paragraph explanation omits a lot of details, but it's a start.
|
||||||
|
To get more precise, we must drill down into several aspects of what I've
|
||||||
|
said so far. The first of them is, __how can we compare program states using
|
||||||
|
an order?__
|
||||||
|
|
||||||
|
### Lattices
|
||||||
|
|
||||||
|
The "information" we'll be talking about will form an algebraic structure
|
||||||
|
called a [lattice](https://en.wikipedia.org/wiki/Lattice_(order)). Algebraically,
|
||||||
|
a lattice is simply a set with two binary operations on it. Unlike the familiar
|
||||||
|
`+`, `-`, and `*` and `/`, the binary operations on a lattice are called
|
||||||
|
"join" and "meet", and are written as `⊔` and `⊓`. Intuitively, they correspond
|
||||||
|
to "take the maximum of two values" and "take the minimum of two values". That
|
||||||
|
may not be all that surprising, since it's the order of values that we care about.
|
||||||
|
Continuing the analogy, let's talk some properties of "minimum" and "maximum",
|
||||||
|
|
||||||
|
* \\(\\max(a, a) = \\min(a, a) = a\\). The minimum and maximum of one number is
|
||||||
|
just that number. Mathematically, this property is called _idempotence_.
|
||||||
|
* \\(\\max(a, b) = \\max(b, a)\\). If you're taking the maximum of two numbers,
|
||||||
|
it doesn't matter much one you consider first. This property is called
|
||||||
|
_commutativity_.
|
||||||
|
* \\(\\max(a, \\max(b, c)) = \\max(\\max(a, b), c)\\). When you have three numbers,
|
||||||
|
and you're determining the maximum value, it doesn't matter which pair of
|
||||||
|
numbers you compare first. This property is called _associativity_.
|
||||||
|
|
||||||
|
All of the properties of \\(\\max\\) also hold for \\(\\min\\). There are also
|
||||||
|
a couple of facts about how \\(\\max\\) and \\(\\min\\) interact _with each other_.
|
||||||
|
They are usually called the _absorption laws_:
|
||||||
|
|
||||||
|
* \\(\\max(a, \\min(a, b)) = a\\). This one is a little less obvious; \\(a\\)
|
||||||
|
is either less than or bigger than \\(b\\); so if you try to find the maximum
|
||||||
|
__and__ the minimum of \\(a\\) and \\(b\\), one of the operations will return
|
||||||
|
\\(a\\).
|
||||||
|
* \\(\\min(a, \\max(a, b)) = a\\). The reason for this one is the same as
|
||||||
|
the reason above.
|
||||||
|
|
||||||
|
Lattices model a specific kind of order; their operations are meant to
|
||||||
|
generalize \\(\\min\\) and \\(\\max\\). Thus, to make the operations behave
|
||||||
|
as expected (i.e., the way that \\(\\min\\) and \\(\\max\\) do), they are
|
||||||
|
required to have all of the properties we've listed so far. We can summarize
|
||||||
|
the properties in table.
|
||||||
|
|
||||||
|
| Property Name | Definition |
|
||||||
|
|---------------|:----------------------------------------------------:|
|
||||||
|
| Idempotence | {{< latex >}}\forall x. x \sqcup x = x{{< /latex >}}<br>{{< latex >}}\forall x. x \sqcap x = x{{< /latex >}} |
|
||||||
|
| Commutativity | {{< latex >}}\forall x, y. x \sqcup y = y \sqcup x{{< /latex >}}<br>{{< latex >}}\forall x, y. x \sqcap y = y \sqcap x{{< /latex >}} |
|
||||||
|
| Associativity | {{< latex >}}\forall x, y, z. x \sqcup (y \sqcup z) = (x \sqcup y) \sqcup z{{< /latex >}}<br>{{< latex >}}\forall x, y, z. x \sqcap (y \sqcap z) = (x \sqcap y) \sqcap z{{< /latex >}} |
|
||||||
|
| Absorption Laws | {{< latex >}}\forall x, y. x \sqcup (x \sqcap y) = x{{< /latex >}}<br>{{< latex >}}\forall x, y. x \sqcap (x \sqcup y) = x{{< /latex >}}
|
|
@ -1,89 +0,0 @@
|
||||||
---
|
|
||||||
title: "Implementing and Verifying \"Static Program Analysis\" in Agda, Part 0: Intro"
|
|
||||||
series: "Static Program Analysis in Agda"
|
|
||||||
description: "In this post, I give a top-level overview of my work on formally verified static analyses"
|
|
||||||
date: 2024-04-12T14:23:03-07:00
|
|
||||||
draft: true
|
|
||||||
---
|
|
||||||
|
|
||||||
Some years ago, when the Programming Languages research group at Oregon State
|
|
||||||
University was discussing what to read, the [_Static Program Analysis_](https://cs.au.dk/~amoeller/spa/)
|
|
||||||
lecture notes came up. The group didn't end up reading the lecture notes,
|
|
||||||
but I did. As I was going through them, I noticed that they were quite rigorous:
|
|
||||||
the first several chapters cover a little bit of [lattice theory](https://en.wikipedia.org/wiki/Lattice_(order)),
|
|
||||||
and the subsequent analyses -- and the descriptions thereof -- are quite precise.
|
|
||||||
When I went to implement the algorithms in the textbook, I realized that just
|
|
||||||
writing them down would not be enough. After all, the textbook also proves
|
|
||||||
several properties of the lattice-based analyses, which would be lost in
|
|
||||||
translation if I were to just write C++ or Haskell.
|
|
||||||
|
|
||||||
At the same time, I noticed that lots of recent papers in programming language
|
|
||||||
theory were formalizing their results in
|
|
||||||
[Agda](https://agda.readthedocs.io/en/latest/getting-started/what-is-agda.html).
|
|
||||||
Having [played]({{< relref "meaningfully-typechecking-a-language-in-idris" >}})
|
|
||||||
[with]({{< relref "advent-of-code-in-coq" >}}) [dependent]({{< relref "coq_dawn_eval" >}})
|
|
||||||
[types]({{< relref "coq_palindrome" >}}) before, I was excited to try it out.
|
|
||||||
Thus began my journey to formalize (the first few chapters of) _Static Program Analysis_
|
|
||||||
in Agda.
|
|
||||||
|
|
||||||
In all, I built a framework for static analyses, based on a tool
|
|
||||||
called _motone functions_. This framework can be used to implement and
|
|
||||||
reason about many different analyses (currently only a certain class called
|
|
||||||
_forward analyses_, but that's not hard limitation). Recently, I've proven
|
|
||||||
the _correctness_ of the algorithms my framework produces. Having reached
|
|
||||||
this milestone, I'd like to pause and talk about what I've done.
|
|
||||||
|
|
||||||
In subsequent posts in this series, will describe what I have so far.
|
|
||||||
It's not perfect, and some work is yet to be done; however, getting to
|
|
||||||
this point was no joke, and I think it's worth discussing. In all,
|
|
||||||
I envision three major topics to cover, each of which is likely going to make
|
|
||||||
for a post or two:
|
|
||||||
|
|
||||||
* __Lattices__: the analyses I'm reasoning about use an algebraic structure
|
|
||||||
called a _lattice_. This structure has certain properties that make it
|
|
||||||
amenable to describing degrees of "knowledge" about a program. In
|
|
||||||
lattice-based static program analysis, the various elements of the
|
|
||||||
lattice represent different facts or properties that we know about the
|
|
||||||
program in question; operations on the lattice help us combine these facts
|
|
||||||
and reason about them.
|
|
||||||
|
|
||||||
Interestingly, lattices can be made by combining other lattices in certain
|
|
||||||
ways. We can therefore use simpler lattices as building blocks to create
|
|
||||||
more complex ones, all while preserving the algebraic structure that
|
|
||||||
we need for program analysis.
|
|
||||||
|
|
||||||
* __The Fixed-Point Algorithm__: to analyze a program, we use information
|
|
||||||
that we already know to compute additional information. For instance,
|
|
||||||
we might use the fact that `1` is positive to compute the fact that
|
|
||||||
`1+1` is positive as well. Using that information, we can determine the
|
|
||||||
sign of `(1+1)+1`, and so on. In practice, this is often done by calling
|
|
||||||
some kind of "analyze" function over and over, each time getting closer to an
|
|
||||||
accurate characterization of the program's behavior. When the output of "analyze"
|
|
||||||
stops changing, we know we've found as much as we can find, and stop.
|
|
||||||
|
|
||||||
What does it mean for the output to stop changing? Roughly, that's when
|
|
||||||
the following equation holds: `knownInfo = analyze(knownInfo)`. In mathematics,
|
|
||||||
this is known as a [fixed point](https://mathworld.wolfram.com/FixedPoint.html).
|
|
||||||
Crucially, not all functions have fixed points; however, certain types of
|
|
||||||
functions on lattices do. The fixed-point algorithm is a way to compute these
|
|
||||||
points, and we will use this to drive our analyses.
|
|
||||||
|
|
||||||
* __Correctness__: putting together the work on lattices and the fixed-point
|
|
||||||
algorithm, we can implement a static program analyzer in Agda. However,
|
|
||||||
it's not hard to write an "analyze" function that has a fixed point but
|
|
||||||
produces an incorrect result. Thus, the next step is to prove that the results
|
|
||||||
of our analyzer accurately describe the program in question.
|
|
||||||
|
|
||||||
The interesting aspect of this step is that our program analyzer works
|
|
||||||
on [control-flow graphs](https://en.wikipedia.org/wiki/Control-flow_graph) (CFGs),
|
|
||||||
which are a relatively compiler-centric representation of programs. On the
|
|
||||||
other hand, what the language _actually does_ is defined by its
|
|
||||||
[semantics](https://en.wikipedia.org/wiki/Semantics_(computer_science)),
|
|
||||||
which is not at all compiler-centric. We need to connect these two, showing
|
|
||||||
that the CFGs we produce "make sense" for our language, and that given
|
|
||||||
CFGs that make sense, our analysis produces results that match the language's
|
|
||||||
execution.
|
|
||||||
|
|
||||||
{{< todo >}}
|
|
||||||
Once the posts are ready, link them here to add some kind of navigation.
|
|
||||||
{{< /todo >}}
|
|
|
@ -1,340 +0,0 @@
|
||||||
---
|
|
||||||
title: "Implementing and Verifying \"Static Program Analysis\" in Agda, Part 1: Lattices"
|
|
||||||
series: "Static Program Analysis in Agda"
|
|
||||||
date: 2024-04-12T14:23:03-07:00
|
|
||||||
draft: true
|
|
||||||
---
|
|
||||||
|
|
||||||
This is the first post in a series on
|
|
||||||
[static program analysis in Agda]({{< relref "static-program-analysis-in-agda" >}}).
|
|
||||||
See the [introduction]({{< relref "00_spa_agda_intro" >}}) for a little bit
|
|
||||||
more context.
|
|
||||||
|
|
||||||
The goal of this post is to motivate the algebraic structure called a
|
|
||||||
[lattice](https://en.wikipedia.org/wiki/Lattice_(order)). Lattices have
|
|
||||||
{{< sidenote "right" "crdt-note" "broad applications" >}}
|
|
||||||
See, for instance, Lars Hupel's excellent
|
|
||||||
<a href="https://lars.hupel.info/topics/crdt/01-intro/">introduction to CRDTs</a>
|
|
||||||
which uses lattices for Conflict-Free Replicated Data Types. CRDTs can be
|
|
||||||
used to implement peer-to-peer distributed systems.
|
|
||||||
{{< /sidenote >}} beyond static program analysis, so the work in this post is
|
|
||||||
interesting in its own right. However, for the purposes of this series, I'm
|
|
||||||
most interested in lattices as an encoding of program information when performing
|
|
||||||
analysis. To start motivating lattices in that context, I'll need to start
|
|
||||||
with _monotone frameworks_.
|
|
||||||
|
|
||||||
### Monotone Frameworks
|
|
||||||
|
|
||||||
The key notion for monotone frameworks is the "specificity" of information.
|
|
||||||
Take, for instance, an analyzer that tries to figure out if a variable is
|
|
||||||
positive, negative, or equal to zero (this is called a _sign analysis_, and
|
|
||||||
we'll be using this example a lot). Of course, the variable could be "none
|
|
||||||
of the above" -- perhaps if it was initialized from user input, which would
|
|
||||||
allow both positive and negative numbers. Such an analyzer might return
|
|
||||||
`+`, `-`, `0`, or `unknown` for any given variable. These outputs are not
|
|
||||||
created equal: if a variable has sign `+`, we know more about it than if
|
|
||||||
the sign is `unknown`: we've ruled out negative numbers as possible values!
|
|
||||||
|
|
||||||
Specificity is important to us because we want our analyses to be as precise
|
|
||||||
as possible. It would be valid for a program analysis to just return
|
|
||||||
`unknown` for everything, but it wouldn't be very useful. Thus, we want to
|
|
||||||
rank possible outputs, and try pick the most specific one. The
|
|
||||||
{{< sidenote "right" "convention-note" "convention" -12 >}}
|
|
||||||
I say convention, because it doesn't actually matter if we represent more
|
|
||||||
specific values as "larger" or "smaller". Given a lattice with a particular
|
|
||||||
order written as <code><</code>, we can flip the sign in all relations
|
|
||||||
(turning <code>a < b</code> into <code>a > b</code>), and get back another
|
|
||||||
lattice. This lattice will have the same properties (more precisely,
|
|
||||||
the properties will be
|
|
||||||
<a href="https://en.wikipedia.org/wiki/Duality_(mathematics)">dual</a>). So
|
|
||||||
we shouldn't fret about picking a direction for "what's less than what".
|
|
||||||
{{< /sidenote >}}
|
|
||||||
seems to be to make
|
|
||||||
{{< sidenote "right" "order-note" "more specific things \"smaller\"" 1 >}}
|
|
||||||
Admittedly, it's a little bit odd to say that something which is "more" than
|
|
||||||
something else is actually smaller. The intuition that I favor is that
|
|
||||||
something that's more specific describes fewer objects: there are less
|
|
||||||
white horses than horses, so "white horse" is more specific than "horse".
|
|
||||||
The direction of <code><</code> can be thought of as comparing the number
|
|
||||||
of objects.<br>
|
|
||||||
<br>
|
|
||||||
Note that this is only an intuition; there are equally many positive and
|
|
||||||
negative numbers, but we will <em>not</em> group them together
|
|
||||||
in our order.
|
|
||||||
{{< /sidenote >}},
|
|
||||||
and less specific things "larger". Coming back to our previous example, we'd
|
|
||||||
write `+ < unknown`, since `+` is more specific. Of course, the exact
|
|
||||||
things we're trying to rank depend on the sort of analysis we're trying to
|
|
||||||
perform. Since I introduced sign analysis, we're ranking signs like `+` and `-`.
|
|
||||||
For other analyses, the elements will be different. The _comparison_, however,
|
|
||||||
will be a permanent fixture.
|
|
||||||
|
|
||||||
Suppose now that we have some program analysis, and we're feeding it some input
|
|
||||||
information. Perhaps we're giving it the signs of variables `x` and `y`, and
|
|
||||||
hoping for it to give us the sign of a third variable `z`. It would be very
|
|
||||||
unfortunate if, when given more specific information, the analysis would return
|
|
||||||
a less specific output! The more you know going in, the more you should know
|
|
||||||
coming out. Similarly, when given less specific / vaguer information, the
|
|
||||||
analysis shouldn't produce a more specific answer -- how could it do that?
|
|
||||||
This leads us to come up with the following rule:
|
|
||||||
|
|
||||||
{{< latex >}}
|
|
||||||
\textbf{if}\ \text{input}_1 \le \text{input}_2,
|
|
||||||
\textbf{then}\ \text{analyze}(\text{input}_1) \le \text{analyze}(\text{input}_2)
|
|
||||||
{{< /latex >}}
|
|
||||||
|
|
||||||
In mathematics, such a property is called _monotonicity_. We say that
|
|
||||||
"analyze" is a [monotonic function](https://en.wikipedia.org/wiki/Monotonic_function).
|
|
||||||
This property gives its name to monotone frameworks. For our purposes, this
|
|
||||||
property means that being more specific "pays off": better information in
|
|
||||||
means better information out. In Agda, we can encode monotonicity as follows:
|
|
||||||
|
|
||||||
{{< codelines "Agda" "agda-spa/Lattice.agda" 17 21 >}}
|
|
||||||
|
|
||||||
Note that above, I defined `Monotonic` on an arbitrary function, whose
|
|
||||||
outputs might be of a different type than its inputs. This will come in handy
|
|
||||||
later.
|
|
||||||
|
|
||||||
The order `<` of our elements and the monotonicity of our analysis are useful
|
|
||||||
to us for another reason: they help gauge and limit, in a roundabout way, how much
|
|
||||||
work might be left for our analysis to do. This matters because we don't want
|
|
||||||
to allow analyses that can take forever to finish -- that's a little too long
|
|
||||||
for a pragmatic tool used by people.
|
|
||||||
|
|
||||||
The key observation -- which I will describe in detail in a later post --
|
|
||||||
is that a monotonic analysis, in a way, "climbs upwards" through an
|
|
||||||
order. As we continue using this analysis to refine information over and over,
|
|
||||||
its results get
|
|
||||||
{{< sidenote "right" "less-specific-note" "less and less specific." >}}
|
|
||||||
It is not a bad thing for our results to get less specific over time, because
|
|
||||||
our initial information is probably incomplete. If you've only seen German
|
|
||||||
shepherds in your life, that might be your picture of what a dog is like.
|
|
||||||
If you then come across a chihuahua, your initial definition of "dog" would
|
|
||||||
certainly not accommodate it. To allow for both German shepherds and chihuahuas,
|
|
||||||
you'd have to loosen the definition of "dog". This new definition would be less
|
|
||||||
specific, but it would be more accurate.
|
|
||||||
{{< /sidenote >}}
|
|
||||||
If we add an additional ingredient, and say that the order has a _fixed height_,
|
|
||||||
we can deduce that the analysis will eventually stop producing additional
|
|
||||||
information: either it will keep "climbing", and reach the top (thus having
|
|
||||||
to stop), or it will stop on its own before reaching the top. This is
|
|
||||||
the essence of the fixed-point algorithm, which in Agda-like pseudocode can
|
|
||||||
be stated as follows:
|
|
||||||
|
|
||||||
```Agda
|
|
||||||
module _ (IsFiniteHeight A ≺)
|
|
||||||
(f : A → A)
|
|
||||||
(Monotonicᶠ : Monotonic _≼_ _≼_ f) where
|
|
||||||
-- There exists a point...
|
|
||||||
aᶠ : A
|
|
||||||
|
|
||||||
-- Such that applying the monotonic function doesn't change the result.
|
|
||||||
aᶠ≈faᶠ : aᶠ ≈ f aᶠ
|
|
||||||
```
|
|
||||||
|
|
||||||
Moreover, the value we'll get out of the fixed point algorithm will be
|
|
||||||
the _least fixed point_. For us, this means that the result will be
|
|
||||||
"the most specific result possible".
|
|
||||||
|
|
||||||
{{< codelines "Agda" "agda-spa/Fixedpoint.agda" 80 80 >}}
|
|
||||||
|
|
||||||
The above explanation omits a lot of details, but it's a start. To get more
|
|
||||||
precise, we must drill down into several aspects of what I've said so far.
|
|
||||||
The first of them is, __how can we compare program information using an order?__
|
|
||||||
|
|
||||||
### Lattices
|
|
||||||
|
|
||||||
Let's start with a question: when it comes to our specificity-based order,
|
|
||||||
is `-` less than, greater than, or equal to `+`? Surely it's not less specific;
|
|
||||||
knowing that a number is negative doesn't give you less information than
|
|
||||||
knowing if that number is positive. Similarly, it's not any more specific, for
|
|
||||||
the same reason. You could consider it equally specific, but that doesn't
|
|
||||||
seem quite right either; the information is different, so comparing specificity
|
|
||||||
feels apples-to-oranges. On the other hand, both `+` and `-` are clearly
|
|
||||||
more specific than `unknown`.
|
|
||||||
|
|
||||||
The solution to this conundrum is to simply refuse to compare certain elements:
|
|
||||||
`+` is neither less than, greater than, nor equal to `-`, but `+ < unknown` and
|
|
||||||
`- < unknown`. Such an ordering is called a [partial order](https://en.wikipedia.org/wiki/Partially_ordered_set).
|
|
||||||
|
|
||||||
Next, another question. Suppose that the user writes code like this:
|
|
||||||
|
|
||||||
```
|
|
||||||
if someCondition {
|
|
||||||
x = exprA;
|
|
||||||
} else {
|
|
||||||
x = exprB;
|
|
||||||
}
|
|
||||||
y = x;
|
|
||||||
```
|
|
||||||
|
|
||||||
If `exprA` has sign `s1`, and `exprB` has sign `s2`, what's the sign of `y`?
|
|
||||||
It's not necessarily `s1` nor `s2`, since they might not match: `s1` could be `+`,
|
|
||||||
and `s2` could be `-`, and using either `+` or `-` for `y` would be incorrect.
|
|
||||||
We're looking for something that can encompass _both_ `s1` and `s2`.
|
|
||||||
Necessarily, it would be either equally specific or less specific than
|
|
||||||
either `s1` or `s2`: there isn't any new information coming in about `x`,
|
|
||||||
and since we don't know which branch is taken, we stand to lose a little
|
|
||||||
bit of info. However, our goal is always to maximize specificity, since
|
|
||||||
more specific signs give us more information about our program.
|
|
||||||
|
|
||||||
This gives us the following constraints. Since the combined sign `s` has to
|
|
||||||
be equally or less specific than either `s1` and `s2`, we have `s1 <= s` and
|
|
||||||
`s2 <= s`. However, we want to pick `s` such that it's more specific than
|
|
||||||
any other "combined sign" candidate. Thus, if there's another sign `t`,
|
|
||||||
with `s1 <= t` and `s2 <= t`, then it must be less specific than `s`: `s <= t`.
|
|
||||||
|
|
||||||
At first, the above constraints might seem quite complicated. We can interpret
|
|
||||||
them in more familiar territory by looking at numbers instead of signs.
|
|
||||||
If we have two numbers `n1` and `n2`, what number is the smallest number
|
|
||||||
that's bigger than either `n1` or `n2`? Why, the maximum of the two, of course!
|
|
||||||
|
|
||||||
There is a reason why I used the constraints above instead of just saying
|
|
||||||
"maximum". For numbers, `max(a,b)` is either `a` or `b`. However, we saw earlier
|
|
||||||
that neither `+` nor `-` works as the sign for `y` in our program. Moreover,
|
|
||||||
we agreed above that our order is _partial_: how can we pick "the bigger of two
|
|
||||||
elements" if neither is bigger than the other? `max` itself doesn't quite work,
|
|
||||||
but what we're looking for is something similar. Instead, we simply require a
|
|
||||||
similar function for our signs. We call this function "[least upper bound](https://en.wikipedia.org/wiki/Least-upper-bound_property)",
|
|
||||||
since it is the "least (most specific)
|
|
||||||
element that's greater (less specific) than either `s1` or `s2`". Conventionally,
|
|
||||||
this function is written as \\(a \\sqcup b\\) (or in our case, \\(s_1 \\sqcup s_2\\)).
|
|
||||||
We can define it for our signs so far using the following [Cayley table](https://en.wikipedia.org/wiki/Cayley_table).
|
|
||||||
|
|
||||||
{{< latex >}}
|
|
||||||
\begin{array}{c|cccc}
|
|
||||||
\sqcup & - & 0 & + & ? \\
|
|
||||||
\hline
|
|
||||||
- & - & ? & ? & ? \\
|
|
||||||
0 & ? & 0 & ? & ? \\
|
|
||||||
+ & ? & ? & + & ? \\
|
|
||||||
? & ? & ? & ? & ? \\
|
|
||||||
\end{array}
|
|
||||||
{{< /latex >}}
|
|
||||||
|
|
||||||
By using the above table, we can see that \\((+\ \\sqcup\ -)\ =\ ?\\) (aka `unknown`).
|
|
||||||
This is correct; given the four signs we're working with, that's the most we can say.
|
|
||||||
Let's explore the analogy to the `max` function a little bit more, by observing
|
|
||||||
that this function has certain properties:
|
|
||||||
|
|
||||||
* `max(a, a) = a`. The maximum of one number is just that number.
|
|
||||||
Mathematically, this property is called _idempotence_. Note that
|
|
||||||
by inspecting the diagonal of the above table, we can confirm that our
|
|
||||||
\\((\\sqcup)\\) function is idempotent.
|
|
||||||
* `max(a, b) = max(b, a)`. If you're taking the maximum of two numbers,
|
|
||||||
it doesn't matter which one you consider first. This property is called
|
|
||||||
_commutativity_. Note that if you mirror the table along the diagonal,
|
|
||||||
it doesn't change; this shows that our \\((\\sqcup)\\) function is
|
|
||||||
commutative.
|
|
||||||
* `max(a, max(b, c)) = max(max(a, b), c)`. When you have three numbers,
|
|
||||||
and you're determining the maximum value, it doesn't matter which pair of
|
|
||||||
numbers you compare first. This property is called _associativity_. You
|
|
||||||
can use the table above to verify the \\((\\sqcup)\\) is associative, too.
|
|
||||||
|
|
||||||
A set that has a binary operation (like `max` or \\((\\sqcup)\\)) that
|
|
||||||
satisfies the above properties is called a [semilattice](https://en.wikipedia.org/wiki/Semilattice). In Agda, we can write this definition roughly as follows:
|
|
||||||
|
|
||||||
```Agda
|
|
||||||
record IsSemilattice {a} (A : Set a) (_⊔_ : A → A → A) : Set a where
|
|
||||||
field
|
|
||||||
⊔-assoc : (x y z : A) → ((x ⊔ y) ⊔ z) ≡ (x ⊔ (y ⊔ z))
|
|
||||||
⊔-comm : (x y : A) → (x ⊔ y) ≡ (y ⊔ x)
|
|
||||||
⊔-idemp : (x : A) → (x ⊔ x) ≡ x
|
|
||||||
```
|
|
||||||
|
|
||||||
Note that this is an example of the ["Is Something" pattern]({{< relref "agda_is_pattern" >}}).
|
|
||||||
It turns out to be convenient, however, to not require definitional equality
|
|
||||||
(`≡`). For instance, we might model sets as lists. Definitional equality
|
|
||||||
would force us to consider lists with the same elements but a different
|
|
||||||
order to be unequal. Instead, we parameterize our definition of `IsSemilattice`
|
|
||||||
by a binary relation `_≈_`, which we ask to be an [equivalence relation](https://en.wikipedia.org/wiki/Equivalence_relation).
|
|
||||||
|
|
||||||
{{< codelines "Agda" "agda-spa/Lattice.agda" 23 39 >}}
|
|
||||||
|
|
||||||
Notice that the above code also provides -- but doesn't require -- `_≼_` and
|
|
||||||
`_≺_`. That's because a least-upper-bound operation encodes an order:
|
|
||||||
intuitively, if `max(a, b) = b`, then `b` must be larger than `a`.
|
|
||||||
Lars Hupel's CRDT series includes [an explanation](https://lars.hupel.info/topics/crdt/03-lattices/#there-) of how the ordering operator
|
|
||||||
and the "least upper bound" function can be constructed from one another.
|
|
||||||
|
|
||||||
As it turns out, the `min` function has very similar properties to `max`:
|
|
||||||
it's idempotent, commutative, and associative. For a partial order like
|
|
||||||
ours, the analog to `min` is "greatest lower bound", or "the largest value
|
|
||||||
that's smaller than both inputs". Such a function is denoted as \\(a\\sqcap b\\).
|
|
||||||
As for what it means, where \\(s_1 \\sqcup s_2\\) means "combine two signs where
|
|
||||||
you don't know which one will be used" (like in an `if`/`else`),
|
|
||||||
\\(s_1 \\sqcap s_2\\) means "combine two signs where you know both of
|
|
||||||
them to be true". For example, \\((+\ \\sqcap\ ?)\ =\ +\\), because a variable
|
|
||||||
that's both "any sign" and "positive" must be positive.
|
|
||||||
|
|
||||||
There's just one hiccup: what's the greatest lower bound of `+` and `-`?
|
|
||||||
it needs to be a value that's less than both of them, but so far, we don't have
|
|
||||||
such a value. Intuitively, this value should be called something like `impossible`,
|
|
||||||
because a number that's both positive and negative doesn't exist. So, let's
|
|
||||||
extend our analyzer to have a new `impossible` value. In fact, it turns
|
|
||||||
out that this "impossible" value is the least element of our set (we added
|
|
||||||
it to be the lower bound of `+` and co., which in turn are less than `unknown`).
|
|
||||||
Similarly, `unknown` is the largest element of our set, since it's greater
|
|
||||||
than `+` and co, and transitively greater than `impossible`. In mathematics,
|
|
||||||
it's not uncommon to define the least element as \\(\\bot\\) (read "bottom"), and the
|
|
||||||
greatest element as \\(\\top\\) (read "top"). With that in mind, the
|
|
||||||
following are the updated Cayley tables for our operations.
|
|
||||||
|
|
||||||
{{< latex >}}
|
|
||||||
\begin{array}{c|ccccc}
|
|
||||||
\sqcup & - & 0 & + & \top & \bot \\
|
|
||||||
\hline
|
|
||||||
- & - & \top & \top & \top & - \\
|
|
||||||
0 & \top & 0 & \top & \top & 0 \\
|
|
||||||
+ & \top & \top & + & \top & + \\
|
|
||||||
\top & \top & \top & \top & \top & \top \\
|
|
||||||
\bot & - & 0 & + & \top & \bot \\
|
|
||||||
\end{array}
|
|
||||||
|
|
||||||
\qquad
|
|
||||||
|
|
||||||
\begin{array}{c|ccccc}
|
|
||||||
\sqcap & - & 0 & + & \top & \bot \\
|
|
||||||
\hline
|
|
||||||
- & - & \bot & \bot & - & \bot \\
|
|
||||||
0 & \bot & 0 & \bot & 0 & \bot \\
|
|
||||||
+ & \bot & \bot & + & + & \bot \\
|
|
||||||
\top & - & 0 & + & \top & \bot \\
|
|
||||||
\bot & \bot & \bot & \bot & \bot & \bot \\
|
|
||||||
\end{array}
|
|
||||||
{{< /latex >}}
|
|
||||||
|
|
||||||
So, it turns out that our set of possible signs is a semilattice in two
|
|
||||||
ways. And if "semi" means "half", does two "semi"s make a whole? Indeed it does!
|
|
||||||
|
|
||||||
A lattice is made up of two semilattices. The operations of these two lattices,
|
|
||||||
however, must satisfy some additional properties. Let's examine the properties
|
|
||||||
in the context of `min` and `max` as we have before. They are usually called
|
|
||||||
the _absorption laws_:
|
|
||||||
|
|
||||||
* `max(a, min(a, b)) = a`. `a` is either less than or bigger than `b`;
|
|
||||||
so if you try to find the maximum __and__ the minimum of `a` and
|
|
||||||
`b`, one of the operations will return `a`.
|
|
||||||
* `min(a, max(a, b)) = a`. The reason for this one is the same as
|
|
||||||
the reason above.
|
|
||||||
|
|
||||||
In Agda, we can therefore write a lattice as follows:
|
|
||||||
|
|
||||||
{{< codelines "Agda" "agda-spa/Lattice.agda" 153 163 >}}
|
|
||||||
|
|
||||||
### Concrete Example: Natural Numbers
|
|
||||||
|
|
||||||
Since we've been talking about `min` and `max` as motivators for properties
|
|
||||||
of \\((\\sqcap)\\) and \\((\\sqcup)\\), it might not be all that surprising
|
|
||||||
that natural numbers form a lattice with `min` and `max` as the two binary
|
|
||||||
operations. In fact, the Agda standard library writes `min` as `_⊓_` and
|
|
||||||
`max` as `_⊔_`! We can make use of the already-proven properties of these
|
|
||||||
operators to easily define `IsLattice` for natural numbers. Notice that
|
|
||||||
since we're not doing anything clever, like considering lists up to reordering,
|
|
||||||
there's no reason not to use definitional equality `≡` for our equivalence relation.
|
|
||||||
|
|
||||||
{{< codelines "Agda" "agda-spa/Lattice/Nat.agda" 1 45 >}}
|
|
||||||
|
|
||||||
The definition for the lattice instance itself is pretty similar; I'll omit
|
|
||||||
it here to avoid taking up a lot of vertical space, but you can find it
|
|
||||||
on lines 47 through 83 of [my `Lattice.Nat` module]({{< codeurl "agda-spa/Lattice/Nat.agda" >}}).
|
|
Loading…
Reference in New Issue
Block a user