blog-static/content/blog/00_spa_agda_intro.md

---
title: "Implementing and Verifying \"Static Program Analysis\" in Agda, Part 0: Intro"
series: "Static Program Analysis in Agda"
date: 2024-04-12T14:23:03-07:00
draft: true
---

Some years ago, when the Programming Languages research group at Oregon State
University was discussing what to read, the [_Static Program Analysis_](https://cs.au.dk/~amoeller/spa/)
lecture notes came up. The group didn't end up reading the lecture notes,
but I did. As I was going through them, I noticed that they were quite rigorous:
the first several chapters cover a little bit of [lattice theory](https://en.wikipedia.org/wiki/Lattice_(order)),
and the subsequent analyses -- and the descriptions thereof -- are quite precise.
When I went to implement the algorithms in the textbook, I realized that just
writing them down would not be enough. After all, the textbook also proves
several properties of the lattice-based analyses, which would be lost in
translation if I were to just write C++ or Haskell.

At the same time, I noticed that lots of recent papers in programming language
theory were formalizing their results in
[Agda](https://agda.readthedocs.io/en/latest/getting-started/what-is-agda.html).
Having [played]({{< relref "meaningfully-typechecking-a-language-in-idris" >}})
[with]({{< relref "advent-of-code-in-coq" >}}) [dependent]({{< relref "coq_dawn_eval" >}})
[types]({{< relref "coq_palindrome" >}}) before, I was excited to try it out.
Thus began my journey to formalize (the first few chapters of) _Static Program Analysis_
in Agda.

In all, I built a framework for static analyses, based on a tool
called _motone functions_. This framework can be used to implement and
reason about many different analyses (currently only a certain class called
_forward analyses_, but that's not hard limitation). Recently, I've proven
the _correctness_ of the algorithms my framework produces. Having reached
this milestone, I'd like to pause and talk about what I've done.

In subsequent posts in this series, will describe what I have so far.
It's not perfect, and some work is yet to be done; however, getting to
this point was no joke, and I think it's worth discussing. In all,
I envision three major topics to cover, each of which is likely going to make
for a post or two:

* __Lattices__: the analyses I'm reasoning about use an algebraic structure
  called a _lattice_. This structure has certain properties that make it
  amenable to describing degrees of "knowledge" about a program. In
  lattice-based static program analysis, the various elements of the
  lattice represent different facts or properties that we know about the
  program in question; operations on the lattice help us combine these facts
  and reason about them.

  Interestingly, lattices can be made by combining other lattices in certain
  ways. We can therefore use simpler lattices as building blocks to create
  more complex ones, all while preserving the algebraic structure that
  we need for program analysis.

* __The Fixed-Point Algorithm__: to analyze a program, we use information
  that we already know to compute additional information. For instance,
  we might use the fact that `1` is positive to compute the fact that
  `1+1` is positive as well. Using that information, we can determine the
  sign of `(1+1)+1`, and so on. In practice, this is often done by calling
  some kind of "analyze" function over and over, each time getting closer to an
  accurate characterization of the program's behavior. When the output of "analyze"
  stops changing, we know we've found as much as we can find, and stop.

  What does it mean for the output to stop changing? Roughly, that's when
  the following equation holds: `knownInfo = analyze(knownInfo)`. In mathematics,
  this is known as a [fixed point](https://mathworld.wolfram.com/FixedPoint.html).
  Crucially, not all functions have fixed points; however, certain types of
  functions on lattices do. The fixed-point algorithm is a way to compute these
  points, and we will use this to drive our analyses.

* __Correctness__: putting together the work on lattices and the fixed-point
  algorithm, we can implement a static program analyzer in Agda. However,
  it's not hard to write an "analyze" function that has a fixed point but
  produces an incorrect result. Thus, the next step is to prove that the results
  of our analyzer accurately describe the program in question.

  The interesting aspect of this step is that our program analyzer works
  on [control-flow graphs](https://en.wikipedia.org/wiki/Control-flow_graph) (CFGs),
  which are a relatively compiler-centric representation of programs. On the
  other hand, what the language _actually does_ is defined by its
  [semantics](https://en.wikipedia.org/wiki/Semantics_(computer_science)),
  which is not at all compiler-centric. We need to connect these two, showing
  that the CFGs we produce "make sense" for our language, and that given
  CFGs that make sense, our analysis produces results that match the language's
  execution.

{{< todo >}}
Once the posts are ready, link them here to add some kind of navigation.
{{< /todo >}}

### Monotone Frameworks

I'll start out as abstractly and vaguely as possible. In general, the sorts of
analyses I'll be formalizing are based on _monotone frameworks_.
The idea with monotone frameworks is to rank information about program state
using some kind of _order_. Intuitively, given two pieces of "information",
one is less than another if it's more specific. Thus, "`x` has a positive sign"
is less than "`x` has any sign", since the former is more specific than the latter.
The sort of information that you are comparing depends on the analysis. In all
cases, the analysis itself is implemented as a function that takes the "information
so far", and updates it based on the program, producing "updated information so far".
Not all such functions are acceptable; it's possible to write an "updater function"
that keeps slightly adjusting its answer. Such a function could keep running
forever, which is a little too long for a program analyzer. We need something
to ensure the analysis ends.

There are two secret ingredients to ensure that an analysis terminates.
The first is a property called _monotonicity_; a function is monotonic if
it preserves the order between its inputs. That is, if you have two pieces of
information `x1` and `x2`, with `x1 <= x2`, then `f(x1) <= f(x2)`. The second
property is that our "information" has a _finite height_. Roughly, this means
that if you tried to arrange pieces of information in a line, from least to
greatest, your line could only get so long. Combined, this leads to the
following property (I'm being reductive here while I give an overview):

_With a monotoninc function and a finite-height order, if you start at the
bottom, each invocation of the function moves you up some line. Since the
line can only be so long, you're guaranteed to reach the end eventually._

The above three-paragraph explanation omits a lot of details, but it's a start.
To get more precise, we must drill down into several aspects of what I've
said so far. The first of them is, __how can we compare program states using
an order?__

### Lattices

The "information" we'll be talking about will form an algebraic structure
called a [lattice](https://en.wikipedia.org/wiki/Lattice_(order)). Algebraically,
a lattice is simply a set with two binary operations on it. Unlike the familiar
`+`, `-`, and `*` and `/`, the binary operations on a lattice are called
"join" and "meet", and are written as `⊔` and `⊓`. Intuitively, they correspond
to "take the maximum of two values" and "take the minimum of two values". That
may not be all that surprising, since it's the order of values that we care about.
Continuing the analogy, let's talk some properties of "minimum" and "maximum", 

* \\(\\max(a, a) = \\min(a, a) = a\\). The minimum and maximum of one number is
  just that number. Mathematically, this property is called _idempotence_.
* \\(\\max(a, b) = \\max(b, a)\\). If you're taking the maximum of two numbers,
  it doesn't matter much one you consider first. This property is called
  _commutativity_.
* \\(\\max(a, \\max(b, c)) = \\max(\\max(a, b), c)\\). When you have three numbers,
  and you're determining the maximum value, it doesn't matter which pair of
  numbers you compare first. This property is called _associativity_.

All of the properties of \\(\\max\\) also hold for \\(\\min\\). There are also
a couple of facts about how \\(\\max\\) and \\(\\min\\) interact _with each other_.
They are usually called the _absorption laws_:

* \\(\\max(a, \\min(a, b)) = a\\). This one is a little less obvious; \\(a\\)
  is either less than or bigger than \\(b\\); so if you try to find the maximum
  __and__ the minimum of \\(a\\) and \\(b\\), one of the operations will return
  \\(a\\).
* \\(\\min(a, \\max(a, b)) = a\\). The reason for this one is the same as
  the reason above.

Lattices model a specific kind of order; their operations are meant to
generalize \\(\\min\\) and \\(\\max\\). Thus, to make the operations behave
as expected (i.e., the way that \\(\\min\\) and \\(\\max\\) do), they are
required to have all of the properties we've listed so far. We can summarize
the properties in table.

| Property Name | Definition                                           |
|---------------|:----------------------------------------------------:|
| Idempotence   | {{< latex >}}\forall x. x \sqcup x = x{{< /latex >}}<br>{{< latex >}}\forall x. x \sqcap x = x{{< /latex >}} |
| Commutativity | {{< latex >}}\forall x, y. x \sqcup y = y \sqcup x{{< /latex >}}<br>{{< latex >}}\forall x, y. x \sqcap y = y \sqcap x{{< /latex >}} |
| Associativity | {{< latex >}}\forall x, y, z. x \sqcup (y \sqcup z) = (x \sqcup y) \sqcup z{{< /latex >}}<br>{{< latex >}}\forall x, y, z. x \sqcap (y \sqcap z) = (x \sqcap y) \sqcap z{{< /latex >}} |
| Absorption Laws | {{< latex >}}\forall x, y. x \sqcup (x \sqcap y) = x{{< /latex >}}<br>{{< latex >}}\forall x, y. x \sqcap (x \sqcup y) = x{{< /latex >}}
Add first section of Agda program analysis article Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com> 2024-03-14 22:25:17 -07:00			`---`
Write a high-level introduction for the SPA series Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com> 2024-05-12 13:49:21 -07:00			`title: "Implementing and Verifying \"Static Program Analysis\" in Agda, Part 0: Intro"`
Add first section of Agda program analysis article Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com> 2024-03-14 22:25:17 -07:00			`series: "Static Program Analysis in Agda"`
Write a high-level introduction for the SPA series Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com> 2024-05-12 13:49:21 -07:00			`date: 2024-04-12T14:23:03-07:00`
Add first section of Agda program analysis article Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com> 2024-03-14 22:25:17 -07:00			`draft: true`
			`---`

			`Some years ago, when the Programming Languages research group at Oregon State`
			`University was discussing what to read, the [_Static Program Analysis_](https://cs.au.dk/~amoeller/spa/)`
Write a high-level introduction for the SPA series Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com> 2024-05-12 13:49:21 -07:00			`lecture notes came up. The group didn't end up reading the lecture notes,`
Add first section of Agda program analysis article Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com> 2024-03-14 22:25:17 -07:00			`but I did. As I was going through them, I noticed that they were quite rigorous:`
			`the first several chapters cover a little bit of [lattice theory](https://en.wikipedia.org/wiki/Lattice_(order)),`
			`and the subsequent analyses -- and the descriptions thereof -- are quite precise.`
			`When I went to implement the algorithms in the textbook, I realized that just`
			`writing them down would not be enough. After all, the textbook also proves`
			`several properties of the lattice-based analyses, which would be lost in`
			`translation if I were to just write C++ or Haskell.`

			`At the same time, I noticed that lots of recent papers in programming language`
			`theory were formalizing their results in`
			`[Agda](https://agda.readthedocs.io/en/latest/getting-started/what-is-agda.html).`
			`Having [played]({{< relref "meaningfully-typechecking-a-language-in-idris" >}})`
			`[with]({{< relref "advent-of-code-in-coq" >}}) [dependent]({{< relref "coq_dawn_eval" >}})`
			`[types]({{< relref "coq_palindrome" >}}) before, I was excited to try it out.`
Write a high-level introduction for the SPA series Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com> 2024-05-12 13:49:21 -07:00			`Thus began my journey to formalize (the first few chapters of) _Static Program Analysis_`
Add first section of Agda program analysis article Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com> 2024-03-14 22:25:17 -07:00			`in Agda.`

Write a high-level introduction for the SPA series Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com> 2024-05-12 13:49:21 -07:00			`In all, I built a framework for static analyses, based on a tool`
			`called _motone functions_. This framework can be used to implement and`
			`reason about many different analyses (currently only a certain class called`
			`_forward analyses_, but that's not hard limitation). Recently, I've proven`
			`the _correctness_ of the algorithms my framework produces. Having reached`
			`this milestone, I'd like to pause and talk about what I've done.`

			`In subsequent posts in this series, will describe what I have so far.`
			`It's not perfect, and some work is yet to be done; however, getting to`
			`this point was no joke, and I think it's worth discussing. In all,`
			`I envision three major topics to cover, each of which is likely going to make`
			`for a post or two:`

			`* __Lattices__: the analyses I'm reasoning about use an algebraic structure`
			`called a _lattice_. This structure has certain properties that make it`
			`amenable to describing degrees of "knowledge" about a program. In`
			`lattice-based static program analysis, the various elements of the`
			`lattice represent different facts or properties that we know about the`
			`program in question; operations on the lattice help us combine these facts`
			`and reason about them.`

			`Interestingly, lattices can be made by combining other lattices in certain`
			`ways. We can therefore use simpler lattices as building blocks to create`
			`more complex ones, all while preserving the algebraic structure that`
			`we need for program analysis.`

			`* __The Fixed-Point Algorithm__: to analyze a program, we use information`
			`that we already know to compute additional information. For instance,`
			we might use the fact that `1` is positive to compute the fact that
			`1+1` is positive as well. Using that information, we can determine the
			sign of `(1+1)+1`, and so on. In practice, this is often done by calling
			`some kind of "analyze" function over and over, each time getting closer to an`
			`accurate characterization of the program's behavior. When the output of "analyze"`
			`stops changing, we know we've found as much as we can find, and stop.`

			`What does it mean for the output to stop changing? Roughly, that's when`
			the following equation holds: `knownInfo = analyze(knownInfo)`. In mathematics,
			`this is known as a [fixed point](https://mathworld.wolfram.com/FixedPoint.html).`
			`Crucially, not all functions have fixed points; however, certain types of`
			`functions on lattices do. The fixed-point algorithm is a way to compute these`
			`points, and we will use this to drive our analyses.`

			`* __Correctness__: putting together the work on lattices and the fixed-point`
			`algorithm, we can implement a static program analyzer in Agda. However,`
			`it's not hard to write an "analyze" function that has a fixed point but`
			`produces an incorrect result. Thus, the next step is to prove that the results`
			`of our analyzer accurately describe the program in question.`

			`The interesting aspect of this step is that our program analyzer works`
			`on [control-flow graphs](https://en.wikipedia.org/wiki/Control-flow_graph) (CFGs),`
			`which are a relatively compiler-centric representation of programs. On the`
			`other hand, what the language _actually does_ is defined by its`
			`[semantics](https://en.wikipedia.org/wiki/Semantics_(computer_science)),`
			`which is not at all compiler-centric. We need to connect these two, showing`
			`that the CFGs we produce "make sense" for our language, and that given`
			`CFGs that make sense, our analysis produces results that match the language's`
			`execution.`

			`{{< todo >}}`
			`Once the posts are ready, link them here to add some kind of navigation.`
			`{{< /todo >}}`
Add first section of Agda program analysis article Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com> 2024-03-14 22:25:17 -07:00
			`### Monotone Frameworks`

			`I'll start out as abstractly and vaguely as possible. In general, the sorts of`
			`analyses I'll be formalizing are based on _monotone frameworks_.`
			`The idea with monotone frameworks is to rank information about program state`
			`using some kind of _order_. Intuitively, given two pieces of "information",`
			one is less than another if it's more specific. Thus, "`x` has a positive sign"
			is less than "`x` has any sign", since the former is more specific than the latter.
			`The sort of information that you are comparing depends on the analysis. In all`
			`cases, the analysis itself is implemented as a function that takes the "information`
			`so far", and updates it based on the program, producing "updated information so far".`
			`Not all such functions are acceptable; it's possible to write an "updater function"`
			`that keeps slightly adjusting its answer. Such a function could keep running`
			`forever, which is a little too long for a program analyzer. We need something`
			`to ensure the analysis ends.`

			`There are two secret ingredients to ensure that an analysis terminates.`
			`The first is a property called _monotonicity_; a function is monotonic if`
			`it preserves the order between its inputs. That is, if you have two pieces of`
			information `x1` and `x2`, with `x1 <= x2`, then `f(x1) <= f(x2)`. The second
			`property is that our "information" has a _finite height_. Roughly, this means`
			`that if you tried to arrange pieces of information in a line, from least to`
			`greatest, your line could only get so long. Combined, this leads to the`
			`following property (I'm being reductive here while I give an overview):`

			`_With a monotoninc function and a finite-height order, if you start at the`
			`bottom, each invocation of the function moves you up some line. Since the`
			`line can only be so long, you're guaranteed to reach the end eventually._`

			`The above three-paragraph explanation omits a lot of details, but it's a start.`
			`To get more precise, we must drill down into several aspects of what I've`
			`said so far. The first of them is, __how can we compare program states using`
			`an order?__`

			`### Lattices`

			`The "information" we'll be talking about will form an algebraic structure`
			`called a [lattice](https://en.wikipedia.org/wiki/Lattice_(order)). Algebraically,`
			`a lattice is simply a set with two binary operations on it. Unlike the familiar`
			`+`, `-`, and `*` and `/`, the binary operations on a lattice are called
			"join" and "meet", and are written as `⊔` and `⊓`. Intuitively, they correspond
			`to "take the maximum of two values" and "take the minimum of two values". That`
			`may not be all that surprising, since it's the order of values that we care about.`
			`Continuing the analogy, let's talk some properties of "minimum" and "maximum",`

			`* \\(\\max(a, a) = \\min(a, a) = a\\). The minimum and maximum of one number is`
			`just that number. Mathematically, this property is called _idempotence_.`
			`* \\(\\max(a, b) = \\max(b, a)\\). If you're taking the maximum of two numbers,`
			`it doesn't matter much one you consider first. This property is called`
			`_commutativity_.`
			`* \\(\\max(a, \\max(b, c)) = \\max(\\max(a, b), c)\\). When you have three numbers,`
			`and you're determining the maximum value, it doesn't matter which pair of`
			`numbers you compare first. This property is called _associativity_.`

			`All of the properties of \\(\\max\\) also hold for \\(\\min\\). There are also`
			`a couple of facts about how \\(\\max\\) and \\(\\min\\) interact _with each other_.`
			`They are usually called the _absorption laws_:`

			`* \\(\\max(a, \\min(a, b)) = a\\). This one is a little less obvious; \\(a\\)`
			`is either less than or bigger than \\(b\\); so if you try to find the maximum`
			`__and__ the minimum of \\(a\\) and \\(b\\), one of the operations will return`
			`\\(a\\).`
			`* \\(\\min(a, \\max(a, b)) = a\\). The reason for this one is the same as`
			`the reason above.`

			`Lattices model a specific kind of order; their operations are meant to`
			`generalize \\(\\min\\) and \\(\\max\\). Thus, to make the operations behave`
			`as expected (i.e., the way that \\(\\min\\) and \\(\\max\\) do), they are`
			`required to have all of the properties we've listed so far. We can summarize`
			`the properties in table.`

			`\| Property Name \| Definition \|`
			`\|---------------\|:----------------------------------------------------:\|`
			`\| Idempotence \| {{< latex >}}\forall x. x \sqcup x = x{{< /latex >}}<br>{{< latex >}}\forall x. x \sqcap x = x{{< /latex >}} \|`
			`\| Commutativity \| {{< latex >}}\forall x, y. x \sqcup y = y \sqcup x{{< /latex >}}<br>{{< latex >}}\forall x, y. x \sqcap y = y \sqcap x{{< /latex >}} \|`
			`\| Associativity \| {{< latex >}}\forall x, y, z. x \sqcup (y \sqcup z) = (x \sqcup y) \sqcup z{{< /latex >}}<br>{{< latex >}}\forall x, y, z. x \sqcap (y \sqcap z) = (x \sqcap y) \sqcap z{{< /latex >}} \|`
			`\| Absorption Laws \| {{< latex >}}\forall x, y. x \sqcup (x \sqcap y) = x{{< /latex >}}<br>{{< latex >}}\forall x, y. x \sqcap (x \sqcup y) = x{{< /latex >}}`