From c6e2ecb996fe928ab38f868b0083b40931072e5d Mon Sep 17 00:00:00 2001 From: Danila Fedorin Date: Thu, 14 Mar 2024 22:25:17 -0700 Subject: [PATCH] Add first section of Agda program analysis article Signed-off-by: Danila Fedorin --- content/blog/00_spa_agda.md | 109 ++++++++++++++++++ .../static-program-analysis-in-agda/_index.md | 11 ++ 2 files changed, 120 insertions(+) create mode 100644 content/blog/00_spa_agda.md create mode 100644 content/series/static-program-analysis-in-agda/_index.md diff --git a/content/blog/00_spa_agda.md b/content/blog/00_spa_agda.md new file mode 100644 index 0000000..bbfc27d --- /dev/null +++ b/content/blog/00_spa_agda.md @@ -0,0 +1,109 @@ +--- +title: "Implementing and Verifying \"Static Program Analysis\" in Agda, Part 1: Lattices" +series: "Static Program Analysis in Agda" +date: 2024-03-11T14:23:03-07:00 +draft: true +--- + +Some years ago, when the Programming Languages research group at Oregon State +University was discussing what to read, the [_Static Program Analysis_](https://cs.au.dk/~amoeller/spa/) +lecture notes came up. The group didn't end up reading the lecture notes, +but I did. As I was going through them, I noticed that they were quite rigorous: +the first several chapters cover a little bit of [lattice theory](https://en.wikipedia.org/wiki/Lattice_(order)), +and the subsequent analyses -- and the descriptions thereof -- are quite precise. +When I went to implement the algorithms in the textbook, I realized that just +writing them down would not be enough. After all, the textbook also proves +several properties of the lattice-based analyses, which would be lost in +translation if I were to just write C++ or Haskell. + +At the same time, I noticed that lots of recent papers in programming language +theory were formalizing their results in +[Agda](https://agda.readthedocs.io/en/latest/getting-started/what-is-agda.html). +Having [played]({{< relref "meaningfully-typechecking-a-language-in-idris" >}}) +[with]({{< relref "advent-of-code-in-coq" >}}) [dependent]({{< relref "coq_dawn_eval" >}}) +[types]({{< relref "coq_palindrome" >}}) before, I was excited to try it out. +Thus began my journey to formalize the (first few chapters of) _Static Program Analysis_ +in Agda. + +In this post (and subsequent follow-ups), I hope to describe what I have so far. +Although not everything works, formalizing fixed-point algorithms (I'll get back +to this) and getting even the most straightforward analysis working was no +joke, and I think is worthy of discussion. So, here goes. + +### Monotone Frameworks + +I'll start out as abstractly and vaguely as possible. In general, the sorts of +analyses I'll be formalizing are based on _monotone frameworks_. +The idea with monotone frameworks is to rank information about program state +using some kind of _order_. Intuitively, given two pieces of "information", +one is less than another if it's more specific. Thus, "`x` has a positive sign" +is less than "`x` has any sign", since the former is more specific than the latter. +The sort of information that you are comparing depends on the analysis. In all +cases, the analysis itself is implemented as a function that takes the "information +so far", and updates it based on the program, producing "updated information so far". +Not all such functions are acceptable; it's possible to write an "updater function" +that keeps slightly adjusting its answer. Such a function could keep running +forever, which is a little too long for a program analyzer. We need something +to ensure the analysis ends. + +There are two secret ingredients to ensure that an analysis terminates. +The first is a property called _monotonicity_; a function is monotonic if +it preserves the order between its inputs. That is, if you have two pieces of +information `x1` and `x2`, with `x1 <= x2`, then `f(x1) <= f(x2)`. The second +property is that our "information" has a _finite height_. Roughly, this means +that if you tried to arrange pieces of information in a line, from least to +greatest, your line could only get so long. Combined, this leads to the +following property (I'm being reductive here while I give an overview): + +_With a monotoninc function and a finite-height order, if you start at the +bottom, each invocation of the function moves you up some line. Since the +line can only be so long, you're guaranteed to reach the end eventually._ + +The above three-paragraph explanation omits a lot of details, but it's a start. +To get more precise, we must drill down into several aspects of what I've +said so far. The first of them is, __how can we compare program states using +an order?__ + +### Lattices + +The "information" we'll be talking about will form an algebraic structure +called a [lattice](https://en.wikipedia.org/wiki/Lattice_(order)). Algebraically, +a lattice is simply a set with two binary operations on it. Unlike the familiar +`+`, `-`, and `*` and `/`, the binary operations on a lattice are called +"join" and "meet", and are written as `⊔` and `⊓`. Intuitively, they correspond +to "take the maximum of two values" and "take the minimum of two values". That +may not be all that surprising, since it's the order of values that we care about. +Continuing the analogy, let's talk some properties of "minimum" and "maximum", + +* \\(\\max(a, a) = \\min(a, a) = a\\). The minimum and maximum of one number is + just that number. Mathematically, this property is called _idempotence_. +* \\(\\max(a, b) = \\max(b, a)\\). If you're taking the maximum of two numbers, + it doesn't matter much one you consider first. This property is called + _commutativity_. +* \\(\\max(a, \\max(b, c)) = \\max(\\max(a, b), c)\\). When you have three numbers, + and you're determining the maximum value, it doesn't matter which pair of + numbers you compare first. This property is called _associativity_. + +All of the properties of \\(\\max\\) also hold for \\(\\min\\). There are also +a couple of facts about how \\(\\max\\) and \\(\\min\\) interact _with each other_. +They are usually called the _absorption laws_: + +* \\(\\max(a, \\min(a, b)) = a\\). This one is a little less obvious; \\(a\\) + is either less than or bigger than \\(b\\); so if you try to find the maximum + __and__ the minimum of \\(a\\) and \\(b\\), one of the operations will return + \\(a\\). +* \\(\\min(a, \\max(a, b)) = a\\). The reason for this one is the same as + the reason above. + +Lattices model a specific kind of order; their operations are meant to +generalize \\(\\min\\) and \\(\\max\\). Thus, to make the operations behave +as expected (i.e., the way that \\(\\min\\) and \\(\\max\\) do), they are +required to have all of the properties we've listed so far. We can summarize +the properties in table. + +| Property Name | Definition | +|---------------|:----------------------------------------------------:| +| Idempotence | {{< latex >}}\forall x. x \sqcup x = x{{< /latex >}}
{{< latex >}}\forall x. x \sqcap x = x{{< /latex >}} | +| Commutativity | {{< latex >}}\forall x, y. x \sqcup y = y \sqcup x{{< /latex >}}
{{< latex >}}\forall x, y. x \sqcap y = y \sqcap x{{< /latex >}} | +| Associativity | {{< latex >}}\forall x, y, z. x \sqcup (y \sqcup z) = (x \sqcup y) \sqcup z{{< /latex >}}
{{< latex >}}\forall x, y, z. x \sqcap (y \sqcap z) = (x \sqcap y) \sqcap z{{< /latex >}} | +| Absorption Laws | {{< latex >}}\forall x, y. x \sqcup (x \sqcap y) = x{{< /latex >}}
{{< latex >}}\forall x, y. x \sqcap (x \sqcup y) = x{{< /latex >}} diff --git a/content/series/static-program-analysis-in-agda/_index.md b/content/series/static-program-analysis-in-agda/_index.md new file mode 100644 index 0000000..59cefac --- /dev/null +++ b/content/series/static-program-analysis-in-agda/_index.md @@ -0,0 +1,11 @@ ++++ +title = "Implementing and Verifying \"Static Program Analysis\" in Agda" +summary = """ + In this series, I attempt to formalize the first few chapters of + [Static Program Analysis](https://cs.au.dk/~amoeller/spa/) + in Agda. The goal is to have a formally verified, yet executable, static + analyzer for a simple language. + """ +status = "ongoing" +draft = true ++++