Compare commits
	
		
			4 Commits
		
	
	
		
			c6e2ecb996
			...
			6179c86919
		
	
	| Author | SHA1 | Date | |
|---|---|---|---|
| 6179c86919 | |||
| a20fe07a56 | |||
| 2b5dcf12d7 | |||
| 5873c1ca96 | 
| @ -1,109 +0,0 @@ | |||||||
| --- |  | ||||||
| title: "Implementing and Verifying \"Static Program Analysis\" in Agda, Part 1: Lattices" |  | ||||||
| series: "Static Program Analysis in Agda" |  | ||||||
| date: 2024-03-11T14:23:03-07:00 |  | ||||||
| draft: true |  | ||||||
| --- |  | ||||||
| 
 |  | ||||||
| Some years ago, when the Programming Languages research group at Oregon State |  | ||||||
| University was discussing what to read, the [_Static Program Analysis_](https://cs.au.dk/~amoeller/spa/) |  | ||||||
| lecture notes came up. The group didn't end up reading the lecture notes,  |  | ||||||
| but I did. As I was going through them, I noticed that they were quite rigorous: |  | ||||||
| the first several chapters cover a little bit of [lattice theory](https://en.wikipedia.org/wiki/Lattice_(order)), |  | ||||||
| and the subsequent analyses -- and the descriptions thereof -- are quite precise. |  | ||||||
| When I went to implement the algorithms in the textbook, I realized that just |  | ||||||
| writing them down would not be enough. After all, the textbook also proves |  | ||||||
| several properties of the lattice-based analyses, which would be lost in |  | ||||||
| translation if I were to just write C++ or Haskell. |  | ||||||
| 
 |  | ||||||
| At the same time, I noticed that lots of recent papers in programming language |  | ||||||
| theory were formalizing their results in |  | ||||||
| [Agda](https://agda.readthedocs.io/en/latest/getting-started/what-is-agda.html). |  | ||||||
| Having [played]({{< relref "meaningfully-typechecking-a-language-in-idris" >}}) |  | ||||||
| [with]({{< relref "advent-of-code-in-coq" >}}) [dependent]({{< relref "coq_dawn_eval" >}}) |  | ||||||
| [types]({{< relref "coq_palindrome" >}}) before, I was excited to try it out. |  | ||||||
| Thus began my journey to formalize the (first few chapters of) _Static Program Analysis_ |  | ||||||
| in Agda. |  | ||||||
| 
 |  | ||||||
| In this post (and subsequent follow-ups), I hope to describe what I have so far. |  | ||||||
| Although not everything works, formalizing fixed-point algorithms (I'll get back |  | ||||||
| to this) and getting even the most straightforward analysis working was no |  | ||||||
| joke, and I think is worthy of discussion. So, here goes. |  | ||||||
| 
 |  | ||||||
| ### Monotone Frameworks |  | ||||||
| 
 |  | ||||||
| I'll start out as abstractly and vaguely as possible. In general, the sorts of |  | ||||||
| analyses I'll be formalizing are based on _monotone frameworks_. |  | ||||||
| The idea with monotone frameworks is to rank information about program state |  | ||||||
| using some kind of _order_. Intuitively, given two pieces of "information", |  | ||||||
| one is less than another if it's more specific. Thus, "`x` has a positive sign" |  | ||||||
| is less than "`x` has any sign", since the former is more specific than the latter. |  | ||||||
| The sort of information that you are comparing depends on the analysis. In all |  | ||||||
| cases, the analysis itself is implemented as a function that takes the "information |  | ||||||
| so far", and updates it based on the program, producing "updated information so far". |  | ||||||
| Not all such functions are acceptable; it's possible to write an "updater function" |  | ||||||
| that keeps slightly adjusting its answer. Such a function could keep running |  | ||||||
| forever, which is a little too long for a program analyzer. We need something |  | ||||||
| to ensure the analysis ends. |  | ||||||
| 
 |  | ||||||
| There are two secret ingredients to ensure that an analysis terminates. |  | ||||||
| The first is a property called _monotonicity_; a function is monotonic if |  | ||||||
| it preserves the order between its inputs. That is, if you have two pieces of |  | ||||||
| information `x1` and `x2`, with `x1 <= x2`, then `f(x1) <= f(x2)`. The second |  | ||||||
| property is that our "information" has a _finite height_. Roughly, this means |  | ||||||
| that if you tried to arrange pieces of information in a line, from least to |  | ||||||
| greatest, your line could only get so long. Combined, this leads to the |  | ||||||
| following property (I'm being reductive here while I give an overview): |  | ||||||
| 
 |  | ||||||
| _With a monotoninc function and a finite-height order, if you start at the |  | ||||||
| bottom, each invocation of the function moves you up some line. Since the |  | ||||||
| line can only be so long, you're guaranteed to reach the end eventually._ |  | ||||||
| 
 |  | ||||||
| The above three-paragraph explanation omits a lot of details, but it's a start. |  | ||||||
| To get more precise, we must drill down into several aspects of what I've |  | ||||||
| said so far. The first of them is, __how can we compare program states using |  | ||||||
| an order?__ |  | ||||||
| 
 |  | ||||||
| ### Lattices |  | ||||||
| 
 |  | ||||||
| The "information" we'll be talking about will form an algebraic structure |  | ||||||
| called a [lattice](https://en.wikipedia.org/wiki/Lattice_(order)). Algebraically, |  | ||||||
| a lattice is simply a set with two binary operations on it. Unlike the familiar |  | ||||||
| `+`, `-`, and `*` and `/`, the binary operations on a lattice are called |  | ||||||
| "join" and "meet", and are written as `⊔` and `⊓`. Intuitively, they correspond |  | ||||||
| to "take the maximum of two values" and "take the minimum of two values". That |  | ||||||
| may not be all that surprising, since it's the order of values that we care about. |  | ||||||
| Continuing the analogy, let's talk some properties of "minimum" and "maximum",  |  | ||||||
| 
 |  | ||||||
| * \\(\\max(a, a) = \\min(a, a) = a\\). The minimum and maximum of one number is |  | ||||||
|   just that number. Mathematically, this property is called _idempotence_. |  | ||||||
| * \\(\\max(a, b) = \\max(b, a)\\). If you're taking the maximum of two numbers, |  | ||||||
|   it doesn't matter much one you consider first. This property is called |  | ||||||
|   _commutativity_. |  | ||||||
| * \\(\\max(a, \\max(b, c)) = \\max(\\max(a, b), c)\\). When you have three numbers, |  | ||||||
|   and you're determining the maximum value, it doesn't matter which pair of |  | ||||||
|   numbers you compare first. This property is called _associativity_. |  | ||||||
| 
 |  | ||||||
| All of the properties of \\(\\max\\) also hold for \\(\\min\\). There are also |  | ||||||
| a couple of facts about how \\(\\max\\) and \\(\\min\\) interact _with each other_. |  | ||||||
| They are usually called the _absorption laws_: |  | ||||||
| 
 |  | ||||||
| * \\(\\max(a, \\min(a, b)) = a\\). This one is a little less obvious; \\(a\\) |  | ||||||
|   is either less than or bigger than \\(b\\); so if you try to find the maximum |  | ||||||
|   __and__ the minimum of \\(a\\) and \\(b\\), one of the operations will return |  | ||||||
|   \\(a\\). |  | ||||||
| * \\(\\min(a, \\max(a, b)) = a\\). The reason for this one is the same as |  | ||||||
|   the reason above. |  | ||||||
| 
 |  | ||||||
| Lattices model a specific kind of order; their operations are meant to |  | ||||||
| generalize \\(\\min\\) and \\(\\max\\). Thus, to make the operations behave |  | ||||||
| as expected (i.e., the way that \\(\\min\\) and \\(\\max\\) do), they are |  | ||||||
| required to have all of the properties we've listed so far. We can summarize |  | ||||||
| the properties in table. |  | ||||||
| 
 |  | ||||||
| | Property Name | Definition                                           | |  | ||||||
| |---------------|:----------------------------------------------------:| |  | ||||||
| | Idempotence   | {{< latex >}}\forall x. x \sqcup x = x{{< /latex >}}<br>{{< latex >}}\forall x. x \sqcap x = x{{< /latex >}} | |  | ||||||
| | Commutativity | {{< latex >}}\forall x, y. x \sqcup y = y \sqcup x{{< /latex >}}<br>{{< latex >}}\forall x, y. x \sqcap y = y \sqcap x{{< /latex >}} | |  | ||||||
| | Associativity | {{< latex >}}\forall x, y, z. x \sqcup (y \sqcup z) = (x \sqcup y) \sqcup z{{< /latex >}}<br>{{< latex >}}\forall x, y, z. x \sqcap (y \sqcap z) = (x \sqcap y) \sqcap z{{< /latex >}} | |  | ||||||
| | Absorption Laws | {{< latex >}}\forall x, y. x \sqcup (x \sqcap y) = x{{< /latex >}}<br>{{< latex >}}\forall x, y. x \sqcap (x \sqcup y) = x{{< /latex >}} |  | ||||||
							
								
								
									
										89
									
								
								content/blog/00_spa_agda_intro.md
									
									
									
									
									
										Normal file
									
								
							
							
						
						
									
										89
									
								
								content/blog/00_spa_agda_intro.md
									
									
									
									
									
										Normal file
									
								
							| @ -0,0 +1,89 @@ | |||||||
|  | --- | ||||||
|  | title: "Implementing and Verifying \"Static Program Analysis\" in Agda, Part 0: Intro" | ||||||
|  | series: "Static Program Analysis in Agda" | ||||||
|  | description: "In this post, I give a top-level overview of my work on formally verified static analyses" | ||||||
|  | date: 2024-04-12T14:23:03-07:00 | ||||||
|  | draft: true | ||||||
|  | --- | ||||||
|  | 
 | ||||||
|  | Some years ago, when the Programming Languages research group at Oregon State | ||||||
|  | University was discussing what to read, the [_Static Program Analysis_](https://cs.au.dk/~amoeller/spa/) | ||||||
|  | lecture notes came up. The group didn't end up reading the lecture notes, | ||||||
|  | but I did. As I was going through them, I noticed that they were quite rigorous: | ||||||
|  | the first several chapters cover a little bit of [lattice theory](https://en.wikipedia.org/wiki/Lattice_(order)), | ||||||
|  | and the subsequent analyses -- and the descriptions thereof -- are quite precise. | ||||||
|  | When I went to implement the algorithms in the textbook, I realized that just | ||||||
|  | writing them down would not be enough. After all, the textbook also proves | ||||||
|  | several properties of the lattice-based analyses, which would be lost in | ||||||
|  | translation if I were to just write C++ or Haskell. | ||||||
|  | 
 | ||||||
|  | At the same time, I noticed that lots of recent papers in programming language | ||||||
|  | theory were formalizing their results in | ||||||
|  | [Agda](https://agda.readthedocs.io/en/latest/getting-started/what-is-agda.html). | ||||||
|  | Having [played]({{< relref "meaningfully-typechecking-a-language-in-idris" >}}) | ||||||
|  | [with]({{< relref "advent-of-code-in-coq" >}}) [dependent]({{< relref "coq_dawn_eval" >}}) | ||||||
|  | [types]({{< relref "coq_palindrome" >}}) before, I was excited to try it out. | ||||||
|  | Thus began my journey to formalize (the first few chapters of) _Static Program Analysis_ | ||||||
|  | in Agda. | ||||||
|  | 
 | ||||||
|  | In all, I built a framework for static analyses, based on a tool | ||||||
|  | called _motone functions_. This framework can be used to implement and | ||||||
|  | reason about many different analyses (currently only a certain class called | ||||||
|  | _forward analyses_, but that's not hard limitation). Recently, I've proven | ||||||
|  | the _correctness_ of the algorithms my framework produces. Having reached | ||||||
|  | this milestone, I'd like to pause and talk about what I've done. | ||||||
|  | 
 | ||||||
|  | In subsequent posts in this series, will describe what I have so far. | ||||||
|  | It's not perfect, and some work is yet to be done; however, getting to | ||||||
|  | this point was no joke, and I think it's worth discussing. In all, | ||||||
|  | I envision three major topics to cover, each of which is likely going to make | ||||||
|  | for a post or two: | ||||||
|  | 
 | ||||||
|  | * __Lattices__: the analyses I'm reasoning about use an algebraic structure | ||||||
|  |   called a _lattice_. This structure has certain properties that make it | ||||||
|  |   amenable to describing degrees of "knowledge" about a program. In | ||||||
|  |   lattice-based static program analysis, the various elements of the | ||||||
|  |   lattice represent different facts or properties that we know about the | ||||||
|  |   program in question; operations on the lattice help us combine these facts | ||||||
|  |   and reason about them. | ||||||
|  | 
 | ||||||
|  |   Interestingly, lattices can be made by combining other lattices in certain | ||||||
|  |   ways. We can therefore use simpler lattices as building blocks to create | ||||||
|  |   more complex ones, all while preserving the algebraic structure that | ||||||
|  |   we need for program analysis. | ||||||
|  | 
 | ||||||
|  | * __The Fixed-Point Algorithm__: to analyze a program, we use information | ||||||
|  |   that we already know to compute additional information. For instance, | ||||||
|  |   we might use the fact that `1` is positive to compute the fact that | ||||||
|  |   `1+1` is positive as well. Using that information, we can determine the | ||||||
|  |   sign of `(1+1)+1`, and so on. In practice, this is often done by calling | ||||||
|  |   some kind of "analyze" function over and over, each time getting closer to an | ||||||
|  |   accurate characterization of the program's behavior. When the output of "analyze" | ||||||
|  |   stops changing, we know we've found as much as we can find, and stop. | ||||||
|  | 
 | ||||||
|  |   What does it mean for the output to stop changing? Roughly, that's when | ||||||
|  |   the following equation holds: `knownInfo = analyze(knownInfo)`. In mathematics, | ||||||
|  |   this is known as a [fixed point](https://mathworld.wolfram.com/FixedPoint.html). | ||||||
|  |   Crucially, not all functions have fixed points; however, certain types of | ||||||
|  |   functions on lattices do. The fixed-point algorithm is a way to compute these | ||||||
|  |   points, and we will use this to drive our analyses. | ||||||
|  | 
 | ||||||
|  | * __Correctness__: putting together the work on lattices and the fixed-point | ||||||
|  |   algorithm, we can implement a static program analyzer in Agda. However, | ||||||
|  |   it's not hard to write an "analyze" function that has a fixed point but | ||||||
|  |   produces an incorrect result. Thus, the next step is to prove that the results | ||||||
|  |   of our analyzer accurately describe the program in question. | ||||||
|  | 
 | ||||||
|  |   The interesting aspect of this step is that our program analyzer works | ||||||
|  |   on [control-flow graphs](https://en.wikipedia.org/wiki/Control-flow_graph) (CFGs), | ||||||
|  |   which are a relatively compiler-centric representation of programs. On the | ||||||
|  |   other hand, what the language _actually does_ is defined by its | ||||||
|  |   [semantics](https://en.wikipedia.org/wiki/Semantics_(computer_science)), | ||||||
|  |   which is not at all compiler-centric. We need to connect these two, showing | ||||||
|  |   that the CFGs we produce "make sense" for our language, and that given | ||||||
|  |   CFGs that make sense, our analysis produces results that match the language's | ||||||
|  |   execution. | ||||||
|  | 
 | ||||||
|  | {{< todo >}} | ||||||
|  | Once the posts are ready, link them here to add some kind of navigation. | ||||||
|  | {{< /todo >}} | ||||||
							
								
								
									
										340
									
								
								content/blog/01_spa_agda_lattices.md
									
									
									
									
									
										Normal file
									
								
							
							
						
						
									
										340
									
								
								content/blog/01_spa_agda_lattices.md
									
									
									
									
									
										Normal file
									
								
							| @ -0,0 +1,340 @@ | |||||||
|  | --- | ||||||
|  | title: "Implementing and Verifying \"Static Program Analysis\" in Agda, Part 1: Lattices" | ||||||
|  | series: "Static Program Analysis in Agda" | ||||||
|  | date: 2024-04-12T14:23:03-07:00 | ||||||
|  | draft: true | ||||||
|  | --- | ||||||
|  | 
 | ||||||
|  | This is the first post in a series on | ||||||
|  | [static program analysis in Agda]({{< relref "static-program-analysis-in-agda" >}}). | ||||||
|  | See the [introduction]({{< relref "00_spa_agda_intro" >}}) for a little bit | ||||||
|  | more context. | ||||||
|  | 
 | ||||||
|  | The goal of this post is to motivate the algebraic structure called a | ||||||
|  | [lattice](https://en.wikipedia.org/wiki/Lattice_(order)). Lattices have | ||||||
|  | {{< sidenote "right" "crdt-note" "broad applications" >}} | ||||||
|  | See, for instance, Lars Hupel's excellent | ||||||
|  | <a href="https://lars.hupel.info/topics/crdt/01-intro/">introduction to CRDTs</a> | ||||||
|  | which uses lattices for Conflict-Free Replicated Data Types. CRDTs can be | ||||||
|  | used to implement peer-to-peer distributed systems. | ||||||
|  | {{< /sidenote >}} beyond static program analysis, so the work in this post is | ||||||
|  | interesting in its own right. However, for the purposes of this series, I'm | ||||||
|  | most interested in lattices as an encoding of program information when performing | ||||||
|  | analysis. To start motivating lattices in that context, I'll need to start | ||||||
|  | with _monotone frameworks_. | ||||||
|  | 
 | ||||||
|  | ### Monotone Frameworks | ||||||
|  | 
 | ||||||
|  | The key notion for monotone frameworks is the "specificity" of information. | ||||||
|  | Take, for instance, an analyzer that tries to figure out if a variable is | ||||||
|  | positive, negative, or equal to zero (this is called a _sign analysis_, and | ||||||
|  | we'll be using this example a lot). Of course, the variable could be "none | ||||||
|  | of the above" -- perhaps if it was initialized from user input, which would | ||||||
|  | allow both positive and negative numbers. Such an analyzer might return | ||||||
|  | `+`, `-`, `0`, or `unknown` for any given variable. These outputs are not | ||||||
|  | created equal: if a variable has sign `+`, we know more about it than if | ||||||
|  | the sign is `unknown`: we've ruled out negative numbers as possible values! | ||||||
|  | 
 | ||||||
|  | Specificity is important to us because we want our analyses to be as precise | ||||||
|  | as possible. It would be valid for a program analysis to just return | ||||||
|  | `unknown` for everything, but it wouldn't be very useful. Thus, we want to | ||||||
|  | rank possible outputs, and try pick the most specific one. The | ||||||
|  | {{< sidenote "right" "convention-note" "convention" -12 >}} | ||||||
|  | I say convention, because it doesn't actually matter if we represent more | ||||||
|  | specific values as "larger" or "smaller". Given a lattice with a particular | ||||||
|  | order written as <code><</code>, we can flip the sign in all relations | ||||||
|  | (turning <code>a < b</code> into <code>a > b</code>), and get back another | ||||||
|  | lattice. This lattice will have the same properties (more precisely, | ||||||
|  | the properties will be | ||||||
|  | <a href="https://en.wikipedia.org/wiki/Duality_(mathematics)">dual</a>). So | ||||||
|  | we shouldn't fret about picking a direction for "what's less than what". | ||||||
|  | {{< /sidenote >}} | ||||||
|  | seems to be to make | ||||||
|  | {{< sidenote "right" "order-note" "more specific things \"smaller\"" 1 >}} | ||||||
|  | Admittedly, it's a little bit odd to say that something which is "more" than | ||||||
|  | something else is actually smaller. The intuition that I favor is that | ||||||
|  | something that's more specific describes fewer objects: there are less | ||||||
|  | white horses than horses, so "white horse" is more specific than "horse". | ||||||
|  | The direction of <code><</code> can be thought of as comparing the number | ||||||
|  | of objects.<br> | ||||||
|  | <br> | ||||||
|  | Note that this is only an intuition; there are equally many positive and | ||||||
|  | negative numbers, but we will <em>not</em> group them together | ||||||
|  | in our order. | ||||||
|  | {{< /sidenote >}}, | ||||||
|  | and less specific things "larger". Coming back to our previous example, we'd | ||||||
|  | write `+ < unknown`, since `+` is more specific. Of course, the exact | ||||||
|  | things we're trying to rank depend on the sort of analysis we're trying to | ||||||
|  | perform. Since I introduced sign analysis, we're ranking signs like `+` and `-`. | ||||||
|  | For other analyses, the elements will be different. The _comparison_, however, | ||||||
|  | will be a permanent fixture. | ||||||
|  | 
 | ||||||
|  | Suppose now that we have some program analysis, and we're feeding it some input | ||||||
|  | information. Perhaps we're giving it the signs of variables `x` and `y`, and | ||||||
|  | hoping for it to give us the sign of a third variable `z`. It would be very | ||||||
|  | unfortunate if, when given more specific information, the analysis would return | ||||||
|  | a less specific output! The more you know going in, the more you should know | ||||||
|  | coming out. Similarly, when given less specific / vaguer information, the | ||||||
|  | analysis shouldn't produce a more specific answer -- how could it do that? | ||||||
|  | This leads us to come up with the following rule: | ||||||
|  | 
 | ||||||
|  | {{< latex >}} | ||||||
|  | \textbf{if}\ \text{input}_1 \le \text{input}_2, | ||||||
|  | \textbf{then}\ \text{analyze}(\text{input}_1) \le \text{analyze}(\text{input}_2) | ||||||
|  | {{< /latex >}} | ||||||
|  | 
 | ||||||
|  | In mathematics, such a property is called _monotonicity_. We say that | ||||||
|  | "analyze" is a [monotonic function](https://en.wikipedia.org/wiki/Monotonic_function). | ||||||
|  | This property gives its name to monotone frameworks. For our purposes, this | ||||||
|  | property means that being more specific "pays off": better information in | ||||||
|  | means better information out. In Agda, we can encode monotonicity as follows: | ||||||
|  | 
 | ||||||
|  | {{< codelines "Agda" "agda-spa/Lattice.agda" 17 21 >}} | ||||||
|  | 
 | ||||||
|  | Note that above, I defined `Monotonic` on an arbitrary function, whose | ||||||
|  | outputs might be of a different type than its inputs. This will come in handy | ||||||
|  | later. | ||||||
|  | 
 | ||||||
|  | The order `<` of our elements and the monotonicity of our analysis are useful | ||||||
|  | to us for another reason: they help gauge and limit, in a roundabout way, how much | ||||||
|  | work might be left for our analysis to do. This matters because we don't want | ||||||
|  | to allow analyses that can take forever to finish -- that's a little too long | ||||||
|  | for a pragmatic tool used by people. | ||||||
|  | 
 | ||||||
|  | The key observation -- which I will describe in detail in a later post -- | ||||||
|  | is that a monotonic analysis, in a way, "climbs upwards" through an | ||||||
|  | order. As we continue using this analysis to refine information over and over, | ||||||
|  | its results get | ||||||
|  | {{< sidenote "right" "less-specific-note" "less and less specific." >}} | ||||||
|  | It is not a bad thing for our results to get less specific over time, because | ||||||
|  | our initial information is probably incomplete. If you've only seen German | ||||||
|  | shepherds in your life, that might be your picture of what a dog is like. | ||||||
|  | If you then come across a chihuahua, your initial definition of "dog" would | ||||||
|  | certainly not accommodate it. To allow for both German shepherds and chihuahuas, | ||||||
|  | you'd have to loosen the definition of "dog". This new definition would be less | ||||||
|  | specific, but it would be more accurate. | ||||||
|  | {{< /sidenote >}} | ||||||
|  | If we add an additional ingredient, and say that the order has a _fixed height_, | ||||||
|  | we can deduce that the analysis will eventually stop producing additional | ||||||
|  | information: either it will keep "climbing", and reach the top (thus having | ||||||
|  | to stop), or it will stop on its own before reaching the top. This is | ||||||
|  | the essence of the fixed-point algorithm, which in Agda-like pseudocode can | ||||||
|  | be stated as follows: | ||||||
|  | 
 | ||||||
|  | ```Agda | ||||||
|  | module _ (IsFiniteHeight A ≺) | ||||||
|  |          (f : A → A) | ||||||
|  |          (Monotonicᶠ : Monotonic _≼_ _≼_ f) where | ||||||
|  |     -- There exists a point... | ||||||
|  |     aᶠ : A | ||||||
|  | 
 | ||||||
|  |     -- Such that applying the monotonic function doesn't change the result. | ||||||
|  |     aᶠ≈faᶠ : aᶠ ≈ f aᶠ | ||||||
|  | ``` | ||||||
|  | 
 | ||||||
|  | Moreover, the value we'll get out of the fixed point algorithm will be | ||||||
|  | the _least fixed point_. For us, this means that the result will be | ||||||
|  | "the most specific result possible". | ||||||
|  | 
 | ||||||
|  | {{< codelines "Agda" "agda-spa/Fixedpoint.agda" 80 80 >}} | ||||||
|  | 
 | ||||||
|  | The above explanation omits a lot of details, but it's a start. To get more | ||||||
|  | precise, we must drill down into several aspects of what I've said so far. | ||||||
|  | The first of them is, __how can we compare program information using an order?__ | ||||||
|  | 
 | ||||||
|  | ### Lattices | ||||||
|  | 
 | ||||||
|  | Let's start with a question: when it comes to our specificity-based order, | ||||||
|  | is `-` less than, greater than, or equal to `+`? Surely it's not less specific; | ||||||
|  | knowing that a number is negative doesn't give you less information than | ||||||
|  | knowing if that number is positive. Similarly, it's not any more specific, for | ||||||
|  | the same reason. You could consider it equally specific, but that doesn't | ||||||
|  | seem quite right either; the information is different, so comparing specificity | ||||||
|  | feels apples-to-oranges. On the other hand, both `+` and `-` are clearly | ||||||
|  | more specific than `unknown`. | ||||||
|  | 
 | ||||||
|  | The solution to this conundrum is to simply refuse to compare certain elements: | ||||||
|  | `+` is neither less than, greater than, nor equal to `-`, but `+ < unknown` and | ||||||
|  | `- < unknown`. Such an ordering is called a [partial order](https://en.wikipedia.org/wiki/Partially_ordered_set). | ||||||
|  | 
 | ||||||
|  | Next, another question. Suppose that the user writes code like this: | ||||||
|  | 
 | ||||||
|  | ``` | ||||||
|  | if someCondition { | ||||||
|  |   x = exprA; | ||||||
|  | } else { | ||||||
|  |   x = exprB; | ||||||
|  | } | ||||||
|  | y = x; | ||||||
|  | ``` | ||||||
|  | 
 | ||||||
|  | If `exprA` has sign `s1`, and `exprB` has sign `s2`, what's the sign of `y`? | ||||||
|  | It's not necessarily `s1` nor `s2`, since they might not match: `s1` could be `+`, | ||||||
|  | and `s2` could be `-`, and using either `+` or `-` for `y` would be incorrect. | ||||||
|  | We're looking for something that can encompass _both_ `s1` and `s2`. | ||||||
|  | Necessarily, it would be either equally specific or less specific than | ||||||
|  | either `s1` or `s2`: there isn't any new information coming in about `x`, | ||||||
|  | and since we don't know which branch is taken, we stand to lose a little | ||||||
|  | bit of info. However, our goal is always to maximize specificity, since | ||||||
|  | more specific signs give us more information about our program. | ||||||
|  | 
 | ||||||
|  | This gives us the following constraints. Since the combined sign `s` has to | ||||||
|  | be equally or less specific than either `s1` and `s2`, we have `s1 <= s` and | ||||||
|  | `s2 <= s`. However, we want to pick `s` such that it's more specific than | ||||||
|  | any other "combined sign" candidate. Thus, if there's another sign `t`, | ||||||
|  | with `s1 <= t` and `s2 <= t`, then it must be less specific than `s`: `s <= t`. | ||||||
|  | 
 | ||||||
|  | At first, the above constraints might seem quite complicated. We can interpret | ||||||
|  | them in more familiar territory by looking at numbers instead of signs. | ||||||
|  | If we have two numbers `n1` and `n2`, what number is the smallest number | ||||||
|  | that's bigger than either `n1` or `n2`? Why, the maximum of the two, of course! | ||||||
|  | 
 | ||||||
|  | There is a reason why I used the constraints above instead of just saying | ||||||
|  | "maximum". For numbers, `max(a,b)` is either `a` or `b`. However, we saw earlier | ||||||
|  | that neither `+` nor `-` works as the sign for `y` in our program. Moreover, | ||||||
|  | we agreed above that our order is _partial_: how can we pick "the bigger of two | ||||||
|  | elements" if neither is bigger than the other? `max` itself doesn't quite work, | ||||||
|  | but what we're looking for is something similar. Instead, we simply require a | ||||||
|  | similar function for our signs. We call this function "[least upper bound](https://en.wikipedia.org/wiki/Least-upper-bound_property)", | ||||||
|  | since it is the "least (most specific) | ||||||
|  | element that's greater (less specific) than either `s1` or `s2`". Conventionally, | ||||||
|  | this function is written as \\(a \\sqcup b\\) (or in our case, \\(s_1 \\sqcup s_2\\)). | ||||||
|  | We can define it for our signs so far using the following [Cayley table](https://en.wikipedia.org/wiki/Cayley_table). | ||||||
|  | 
 | ||||||
|  | {{< latex >}} | ||||||
|  | \begin{array}{c|cccc} | ||||||
|  |  \sqcup & - & 0 & + & ? \\ | ||||||
|  |  \hline | ||||||
|  |  - & - & ? & ? & ? \\ | ||||||
|  |  0 & ? & 0 & ? & ? \\ | ||||||
|  |  + & ? & ? & + & ? \\ | ||||||
|  |  ? & ? & ? & ? & ? \\ | ||||||
|  | \end{array} | ||||||
|  | {{< /latex >}} | ||||||
|  | 
 | ||||||
|  | By using the above table, we can see that \\((+\ \\sqcup\ -)\ =\ ?\\) (aka `unknown`). | ||||||
|  | This is correct; given the four signs we're working with, that's the most we can say. | ||||||
|  | Let's explore the analogy to the `max` function a little bit more, by observing | ||||||
|  | that this function has certain properties: | ||||||
|  | 
 | ||||||
|  | * `max(a, a) = a`. The maximum of one number is just that number. | ||||||
|  |   Mathematically, this property is called _idempotence_. Note that | ||||||
|  |   by inspecting the diagonal of the above table, we can confirm that our | ||||||
|  |   \\((\\sqcup)\\) function is idempotent. | ||||||
|  | * `max(a, b) = max(b, a)`. If you're taking the maximum of two numbers, | ||||||
|  |   it doesn't matter which one you consider first. This property is called | ||||||
|  |   _commutativity_. Note that if you mirror the table along the diagonal, | ||||||
|  |   it doesn't change; this shows that our \\((\\sqcup)\\) function is | ||||||
|  |   commutative. | ||||||
|  | * `max(a, max(b, c)) = max(max(a, b), c)`. When you have three numbers, | ||||||
|  |   and you're determining the maximum value, it doesn't matter which pair of | ||||||
|  |   numbers you compare first. This property is called _associativity_. You | ||||||
|  |   can use the table above to verify the \\((\\sqcup)\\) is associative, too. | ||||||
|  | 
 | ||||||
|  | A set that has a binary operation (like `max` or \\((\\sqcup)\\)) that | ||||||
|  | satisfies the above properties is called a [semilattice](https://en.wikipedia.org/wiki/Semilattice). In Agda, we can write this definition roughly as follows: | ||||||
|  | 
 | ||||||
|  | ```Agda | ||||||
|  | record IsSemilattice {a} (A : Set a) (_⊔_ : A → A → A) : Set a where | ||||||
|  |     field | ||||||
|  |         ⊔-assoc : (x y z : A) → ((x ⊔ y) ⊔ z) ≡ (x ⊔ (y ⊔ z)) | ||||||
|  |         ⊔-comm : (x y : A) → (x ⊔ y) ≡ (y ⊔ x) | ||||||
|  |         ⊔-idemp : (x : A) → (x ⊔ x) ≡ x | ||||||
|  | ``` | ||||||
|  | 
 | ||||||
|  | Note that this is an example of the ["Is Something" pattern]({{< relref "agda_is_pattern" >}}). | ||||||
|  | It turns out to be convenient, however, to not require definitional equality | ||||||
|  | (`≡`). For instance, we might model sets as lists. Definitional equality | ||||||
|  | would force us to consider lists with the same elements but a different | ||||||
|  | order to be unequal. Instead, we parameterize our definition of `IsSemilattice` | ||||||
|  | by a binary relation `_≈_`, which we ask to be an [equivalence relation](https://en.wikipedia.org/wiki/Equivalence_relation). | ||||||
|  | 
 | ||||||
|  | {{< codelines "Agda" "agda-spa/Lattice.agda" 23 39 >}} | ||||||
|  | 
 | ||||||
|  | Notice that the above code also provides -- but doesn't require -- `_≼_` and | ||||||
|  | `_≺_`. That's because a least-upper-bound operation encodes an order: | ||||||
|  | intuitively, if `max(a, b) = b`, then `b` must be larger than `a`. | ||||||
|  | Lars Hupel's CRDT series includes [an explanation](https://lars.hupel.info/topics/crdt/03-lattices/#there-) of how the ordering operator | ||||||
|  | and the "least upper bound" function can be constructed from one another. | ||||||
|  | 
 | ||||||
|  | As it turns out, the `min` function has very similar properties to `max`: | ||||||
|  | it's idempotent, commutative, and associative. For a partial order like | ||||||
|  | ours, the analog to `min` is "greatest lower bound", or "the largest value | ||||||
|  | that's smaller than both inputs". Such a function is denoted as \\(a\\sqcap b\\). | ||||||
|  | As for what it means, where \\(s_1 \\sqcup s_2\\) means "combine two signs where | ||||||
|  | you don't know which one will be used" (like in an `if`/`else`),  | ||||||
|  | \\(s_1 \\sqcap s_2\\) means "combine two signs where you know both of | ||||||
|  | them to be true". For example, \\((+\ \\sqcap\ ?)\ =\ +\\), because a variable | ||||||
|  | that's both "any sign" and "positive" must be positive. | ||||||
|  | 
 | ||||||
|  | There's just one hiccup: what's the greatest lower bound of `+` and `-`? | ||||||
|  | it needs to be a value that's less than both of them, but so far, we don't have | ||||||
|  | such a value. Intuitively, this value should be called something like `impossible`, | ||||||
|  | because a number that's both positive and negative doesn't exist. So, let's | ||||||
|  | extend our analyzer to have a new `impossible` value. In fact, it turns | ||||||
|  | out that this "impossible" value is the least element of our set (we added | ||||||
|  | it to be the lower bound of `+` and co., which in turn are less than `unknown`). | ||||||
|  | Similarly, `unknown` is the largest element of our set, since it's greater | ||||||
|  | than `+` and co, and transitively greater than `impossible`. In mathematics, | ||||||
|  | it's not uncommon to define the least element as \\(\\bot\\) (read "bottom"), and the | ||||||
|  | greatest element as \\(\\top\\) (read "top"). With that in mind, the | ||||||
|  | following are the updated Cayley tables for our operations. | ||||||
|  | 
 | ||||||
|  | {{< latex >}} | ||||||
|  | \begin{array}{c|ccccc} | ||||||
|  |  \sqcup & - & 0 & + & \top & \bot \\ | ||||||
|  |  \hline | ||||||
|  |  - & - & \top & \top & \top & - \\ | ||||||
|  |  0 & \top & 0 & \top & \top & 0 \\ | ||||||
|  |  + & \top & \top & + & \top & + \\ | ||||||
|  |  \top & \top & \top & \top & \top & \top \\ | ||||||
|  |  \bot & - & 0 & + & \top & \bot \\ | ||||||
|  | \end{array} | ||||||
|  | 
 | ||||||
|  | \qquad | ||||||
|  | 
 | ||||||
|  | \begin{array}{c|ccccc} | ||||||
|  |  \sqcap & - & 0 & + & \top & \bot \\ | ||||||
|  |  \hline | ||||||
|  |  - & - & \bot & \bot & - & \bot \\ | ||||||
|  |  0 & \bot & 0 & \bot & 0 & \bot \\ | ||||||
|  |  + & \bot & \bot & + & + & \bot \\ | ||||||
|  |  \top & - & 0 & + & \top & \bot \\ | ||||||
|  |  \bot & \bot & \bot & \bot & \bot & \bot \\ | ||||||
|  | \end{array} | ||||||
|  | {{< /latex >}} | ||||||
|  | 
 | ||||||
|  | So, it turns out that our set of possible signs is a semilattice in two | ||||||
|  | ways. And if "semi" means "half", does two "semi"s make a whole? Indeed it does! | ||||||
|  | 
 | ||||||
|  | A lattice is made up of two semilattices. The operations of these two lattices, | ||||||
|  | however, must satisfy some additional properties. Let's examine the properties | ||||||
|  | in the context of `min` and `max` as we have before. They are usually called  | ||||||
|  | the _absorption laws_: | ||||||
|  | 
 | ||||||
|  | * `max(a, min(a, b)) = a`. `a` is either less than or bigger than `b`; | ||||||
|  |   so if you try to find the maximum __and__ the minimum of `a` and | ||||||
|  |   `b`, one of the operations will return `a`. | ||||||
|  | * `min(a, max(a, b)) = a`. The reason for this one is the same as | ||||||
|  |   the reason above. | ||||||
|  | 
 | ||||||
|  | In Agda, we can therefore write a lattice as follows: | ||||||
|  | 
 | ||||||
|  | {{< codelines "Agda" "agda-spa/Lattice.agda" 153 163 >}} | ||||||
|  | 
 | ||||||
|  | ### Concrete Example: Natural Numbers | ||||||
|  | 
 | ||||||
|  | Since we've been talking about `min` and `max` as motivators for properties | ||||||
|  | of \\((\\sqcap)\\) and \\((\\sqcup)\\), it might not be all that surprising | ||||||
|  | that natural numbers form a lattice with `min` and `max` as the two binary | ||||||
|  | operations. In fact, the Agda standard library writes `min` as `_⊓_` and | ||||||
|  | `max` as `_⊔_`! We can make use of the already-proven properties of these | ||||||
|  | operators to easily define `IsLattice` for natural numbers. Notice that | ||||||
|  | since we're not doing anything clever, like considering lists up to reordering, | ||||||
|  | there's no reason not to use definitional equality `≡` for our equivalence relation. | ||||||
|  | 
 | ||||||
|  | {{< codelines "Agda" "agda-spa/Lattice/Nat.agda" 1 45 >}} | ||||||
|  | 
 | ||||||
|  | The definition for the lattice instance itself is pretty similar; I'll omit | ||||||
|  | it here to avoid taking up a lot of vertical space, but you can find it | ||||||
|  | on lines 47 through 83 of [my `Lattice.Nat` module]({{< codeurl "agda-spa/Lattice/Nat.agda" >}}). | ||||||
		Loading…
	
		Reference in New Issue
	
	Block a user