blog-static/content/blog/00_spa_agda_intro.md

---
title: "Implementing and Verifying \"Static Program Analysis\" in Agda, Part 0: Intro"
series: "Static Program Analysis in Agda"
description: "In this post, I give a top-level overview of my work on formally verified static analyses"
date: 2024-07-06T17:37:42-07:00
tags: ["Agda", "Programming Languages"]
---

Some years ago, when the Programming Languages research group at Oregon State
University was discussing what to read, the [_Static Program Analysis_](https://cs.au.dk/~amoeller/spa/)
lecture notes came up. The group didn't end up reading the lecture notes,
but I did. As I was going through them, I noticed that they were quite rigorous:
the first several chapters cover a little bit of [lattice theory](https://en.wikipedia.org/wiki/Lattice_(order)),
and the subsequent analyses -- and the descriptions thereof -- are quite precise.
When I went to implement the algorithms in the textbook, I realized that just
writing them down would not be enough. After all, the textbook also proves
several properties of the lattice-based analyses, which would be lost in
translation if I were to just write C++ or Haskell.

At the same time, I noticed that lots of recent papers in programming language
theory were formalizing their results in
[Agda](https://agda.readthedocs.io/en/latest/getting-started/what-is-agda.html).
Having [played]({{< relref "meaningfully-typechecking-a-language-in-idris" >}})
[with]({{< relref "advent-of-code-in-coq" >}}) [dependent]({{< relref "coq_dawn_eval" >}})
[types]({{< relref "coq_palindrome" >}}) before, I was excited to try it out.
Thus began my journey to formalize (the first few chapters of) _Static Program Analysis_
in Agda.

In all, I built a framework for static analyses, based on a tool
called _motone functions_. This framework can be used to implement and
reason about many different analyses (currently only a certain class called
_forward analyses_, but that's not hard limitation). Recently, I've proven
the _correctness_ of the algorithms my framework produces. Having reached
this milestone, I'd like to pause and talk about what I've done.

In subsequent posts in this series, will describe what I have so far.
It's not perfect, and some work is yet to be done; however, getting to
this point was no joke, and I think it's worth discussing. In all,
I'd like to cover the following major topics, spending a couple of posts on each:

* __Lattices__: the analyses I'm reasoning about use an algebraic structure
  called a _lattice_. This structure has certain properties that make it
  amenable to describing degrees of "knowledge" about a program. In
  lattice-based static program analysis, the various elements of the
  lattice represent different facts or properties that we know about the
  program in question; operations on the lattice help us combine these facts
  and reason about them. I write about this in {{< draftlink "Part 1: Lattices" "01_spa_agda_lattices" >}}.

  Interestingly, lattices can be made by combining other lattices in certain
  ways. We can therefore use simpler lattices as building blocks to create
  more complex ones, all while preserving the algebraic structure that
  we need for program analysis. I write about this in
  {{< draftlink "Part 2: Combining Lattices" "02_spa_agda_combining_lattices" >}}.

* __The Fixed-Point Algorithm__: to analyze a program, we use information
  that we already know to compute additional information. For instance,
  we might use the fact that `1` is positive to compute the fact that
  `1+1` is positive as well. Using that information, we can determine the
  sign of `(1+1)+1`, and so on. In practice, this is often done by calling
  some kind of "analyze" function over and over, each time getting closer to an
  accurate characterization of the program's behavior. When the output of "analyze"
  stops changing, we know we've found as much as we can find, and stop.

  What does it mean for the output to stop changing? Roughly, that's when
  the following equation holds: `knownInfo = analyze(knownInfo)`. In mathematics,
  this is known as a [fixed point](https://mathworld.wolfram.com/FixedPoint.html).
  To enable computing fixed points, we focus on a specific kind of lattices:
  those with a _fixed height_. I talk about what this means in
  {{< draftlink "Part 3: Lattices of Finite Height" "03_spa_agda_fixed_height" >}}.

  Even if we restrict our attention to lattices of fixed height,
  not all functions have fixed points; however, certain types of functions on
  lattices always do. The _fixed-point algorithm_ is a way to compute these
  points, and we will use this to drive our analyses. I talk
  about this in {{< draftlink "Part 4: The Fixed-Point Algorithm" "04_spa_agda_fixedpoint" >}}.

* __Correctness__: putting together the work on lattices and the fixed-point
  algorithm, we can implement a static program analyzer in Agda. However,
  it's not hard to write an "analyze" function that has a fixed point but
  produces an incorrect result. Thus, the next step is to prove that the results
  of our analyzer accurately describe the program in question.

  The interesting aspect of this step is that our program analyzer works
  on [control-flow graphs](https://en.wikipedia.org/wiki/Control-flow_graph) (CFGs),
  which are a relatively compiler-centric representation of programs. On the
  other hand, what the language _actually does_ is defined by its
  [semantics](https://en.wikipedia.org/wiki/Semantics_(computer_science)),
  which is not at all compiler-centric. We need to connect these two, showing
  that the CFGs we produce "make sense" for our language, and that given
  CFGs that make sense, our analysis produces results that match the language's
  execution. To do so, I write about the language and its semantics
  in {{< draftlink "Part 5: Our Programming Language" "05_spa_agda_semantics" >}},
  then about building control flow graphs for the language in
  {{< draftlink "Part 6: Control Flow Graphs" "06_spa_agda_cfg" >}}.
  I then write about combining these two representations in
  {{< draftlink "Part 7: Connecting Semantics and Control Flow Graphs" "07_spa_agda_semantics_and_cfg" >}}.


### Navigation
Here are the posts that I’ve written so far for this series:

* {{< draftlink "Lattices" "01_spa_agda_lattices" >}}
* {{< draftlink "Combining Lattices" "02_spa_agda_combining_lattices" >}}
* {{< draftlink "Lattices of Finite Height" "03_spa_agda_fixed_height" >}}
* {{< draftlink "The Fixed-Point Algorithm" "04_spa_agda_fixedpoint" >}}
* {{< draftlink "Our Programming Language" "05_spa_agda_semantics" >}}
* {{< draftlink "Control Flow Graphs" "06_spa_agda_cfg" >}}
* {{< draftlink "Connecting Semantics and Control Flow Graphs" "07_spa_agda_semantics_and_cfg" >}}
* {{< draftlink "Forward Analysis" "08_spa_forward" >}}
* {{< draftlink "Verifying the Forward Analysis" "09_spa_verified_forward" >}}
-												Add first section of Agda program analysis article

Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>

											
										
										
											2024-03-14 22:25:17 -07:00
+								---
-												Write a high-level introduction for the SPA series

Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>

											
										
										
											2024-05-12 13:49:21 -07:00
+								title: "Implementing and Verifying \"Static Program Analysis\" in Agda, Part 0: Intro"
-												Add first section of Agda program analysis article

Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>

											
										
										
											2024-03-14 22:25:17 -07:00
+								series: "Static Program Analysis in Agda"
-												Move original 'monotone function' text into new post and heavily rework it

Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>

											
										
										
											2024-05-12 18:28:44 -07:00
+								description: "In this post, I give a top-level overview of my work on formally verified static analyses"
-												Publish introducton to SPA in Agda

Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>

											
										
										
											2024-07-06 17:39:03 -07:00
+								date: 2024-07-06T17:37:42-07:00
-												Add tags to Agda articles

Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>

											
										
										
											2024-05-20 21:46:14 -07:00
+								tags: ["Agda", "Programming Languages"]
-												Add first section of Agda program analysis article

Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>

											
										
										
											2024-03-14 22:25:17 -07:00
+								---
 								Some years ago, when the Programming Languages research group at Oregon State
 								University was discussing what to read, the [_Static Program Analysis_](https://cs.au.dk/~amoeller/spa/)
-												Write a high-level introduction for the SPA series

Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>

											
										
										
											2024-05-12 13:49:21 -07:00
+								lecture notes came up. The group didn't end up reading the lecture notes,
-												Add first section of Agda program analysis article

Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>

											
										
										
											2024-03-14 22:25:17 -07:00
+								but I did. As I was going through them, I noticed that they were quite rigorous:
 								the first several chapters cover a little bit of [lattice theory](https://en.wikipedia.org/wiki/Lattice_(order)),
 								and the subsequent analyses -- and the descriptions thereof -- are quite precise.
 								When I went to implement the algorithms in the textbook, I realized that just
 								writing them down would not be enough. After all, the textbook also proves
 								several properties of the lattice-based analyses, which would be lost in
 								translation if I were to just write C++ or Haskell.
 								At the same time, I noticed that lots of recent papers in programming language
 								theory were formalizing their results in
 								[Agda](https://agda.readthedocs.io/en/latest/getting-started/what-is-agda.html).
 								Having [played]({{< relref "meaningfully-typechecking-a-language-in-idris" >}})
 								[with]({{< relref "advent-of-code-in-coq" >}}) [dependent]({{< relref "coq_dawn_eval" >}})
 								[types]({{< relref "coq_palindrome" >}}) before, I was excited to try it out.
-												Write a high-level introduction for the SPA series

Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>

											
										
										
											2024-05-12 13:49:21 -07:00
+								Thus began my journey to formalize (the first few chapters of) _Static Program Analysis_
-												Add first section of Agda program analysis article

Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>

											
										
										
											2024-03-14 22:25:17 -07:00
+								in Agda.
-												Write a high-level introduction for the SPA series

Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>

											
										
										
											2024-05-12 13:49:21 -07:00
+								In all, I built a framework for static analyses, based on a tool
 								called _motone functions_. This framework can be used to implement and
 								reason about many different analyses (currently only a certain class called
 								_forward analyses_, but that's not hard limitation). Recently, I've proven
 								the _correctness_ of the algorithms my framework produces. Having reached
 								this milestone, I'd like to pause and talk about what I've done.
 								In subsequent posts in this series, will describe what I have so far.
 								It's not perfect, and some work is yet to be done; however, getting to
 								this point was no joke, and I think it's worth discussing. In all,
-												Slightly tweak wording and front matter in Agda SPA posts

Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>

											
										
										
											2024-05-30 20:02:49 -07:00
+								I'd like to cover the following major topics, spending a couple of posts on each:
-												Write a high-level introduction for the SPA series

Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>

											
										
										
											2024-05-12 13:49:21 -07:00
 								* __Lattices__: the analyses I'm reasoning about use an algebraic structure
 								  called a _lattice_. This structure has certain properties that make it
 								  amenable to describing degrees of "knowledge" about a program. In
 								  lattice-based static program analysis, the various elements of the
 								  lattice represent different facts or properties that we know about the
 								  program in question; operations on the lattice help us combine these facts
-												Publish introducton to SPA in Agda

Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>

											
										
										
											2024-07-06 17:39:03 -07:00
+								  and reason about them. I write about this in {{< draftlink "Part 1: Lattices" "01_spa_agda_lattices" >}}.
-												Write a high-level introduction for the SPA series

Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>

											
										
										
											2024-05-12 13:49:21 -07:00
 								  Interestingly, lattices can be made by combining other lattices in certain
 								  ways. We can therefore use simpler lattices as building blocks to create
 								  more complex ones, all while preserving the algebraic structure that
-												Slightly tweak wording and front matter in Agda SPA posts

Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>

											
										
										
											2024-05-30 20:02:49 -07:00
+								  we need for program analysis. I write about this in
-												Publish introducton to SPA in Agda

Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>

											
										
										
											2024-07-06 17:39:03 -07:00
+								  {{< draftlink "Part 2: Combining Lattices" "02_spa_agda_combining_lattices" >}}.
-												Write a high-level introduction for the SPA series

Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>

											
										
										
											2024-05-12 13:49:21 -07:00
 								* __The Fixed-Point Algorithm__: to analyze a program, we use information
 								  that we already know to compute additional information. For instance,
 								  we might use the fact that `1` is positive to compute the fact that
 								  `1+1` is positive as well. Using that information, we can determine the
 								  sign of `(1+1)+1`, and so on. In practice, this is often done by calling
 								  some kind of "analyze" function over and over, each time getting closer to an
 								  accurate characterization of the program's behavior. When the output of "analyze"
 								  stops changing, we know we've found as much as we can find, and stop.
 								  What does it mean for the output to stop changing? Roughly, that's when
 								  the following equation holds: `knownInfo = analyze(knownInfo)`. In mathematics,
 								  this is known as a [fixed point](https://mathworld.wolfram.com/FixedPoint.html).
-												Publish introducton to SPA in Agda

Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>

											
										
										
											2024-07-06 17:39:03 -07:00
+								  To enable computing fixed points, we focus on a specific kind of lattices:
 								  those with a _fixed height_. I talk about what this means in
 								  {{< draftlink "Part 3: Lattices of Finite Height" "03_spa_agda_fixed_height" >}}.
 								  Even if we restrict our attention to lattices of fixed height,
 								  not all functions have fixed points; however, certain types of functions on
 								  lattices always do. The _fixed-point algorithm_ is a way to compute these
 								  points, and we will use this to drive our analyses. I talk
 								  about this in {{< draftlink "Part 4: The Fixed-Point Algorithm" "04_spa_agda_fixedpoint" >}}.
-												Write a high-level introduction for the SPA series

Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>

											
										
										
											2024-05-12 13:49:21 -07:00
 								* __Correctness__: putting together the work on lattices and the fixed-point
 								  algorithm, we can implement a static program analyzer in Agda. However,
 								  it's not hard to write an "analyze" function that has a fixed point but
 								  produces an incorrect result. Thus, the next step is to prove that the results
 								  of our analyzer accurately describe the program in question.
 								  The interesting aspect of this step is that our program analyzer works
 								  on [control-flow graphs](https://en.wikipedia.org/wiki/Control-flow_graph) (CFGs),
 								  which are a relatively compiler-centric representation of programs. On the
 								  other hand, what the language _actually does_ is defined by its
 								  [semantics](https://en.wikipedia.org/wiki/Semantics_(computer_science)),
 								  which is not at all compiler-centric. We need to connect these two, showing
 								  that the CFGs we produce "make sense" for our language, and that given
 								  CFGs that make sense, our analysis produces results that match the language's
-												Reference more future posts in SPA intro

Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>

											
										
										
											2024-08-18 18:05:55 -07:00
+								  execution. To do so, I write about the language and its semantics
 								  in {{< draftlink "Part 5: Our Programming Language" "05_spa_agda_semantics" >}},
 								  then about building control flow graphs for the language in
 								  {{< draftlink "Part 6: Control Flow Graphs" "06_spa_agda_cfg" >}}.
 								  I then write about combining these two representations in
 								  {{< draftlink "Part 7: Connecting Semantics and Control Flow Graphs" "07_spa_agda_semantics_and_cfg" >}}.
-												Write a high-level introduction for the SPA series

Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>

											
										
										
											2024-05-12 13:49:21 -07:00
-												Start on the navigation links in Agda SPA intro

Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>

											
										
										
											2024-05-30 20:52:33 -07:00
 								### Navigation
 								Here are the posts that I’ve written so far for this series:
-												Publish introducton to SPA in Agda

Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>

											
										
										
											2024-07-06 17:39:03 -07:00
+								* {{< draftlink "Lattices" "01_spa_agda_lattices" >}}
 								* {{< draftlink "Combining Lattices" "02_spa_agda_combining_lattices" >}}
 								* {{< draftlink "Lattices of Finite Height" "03_spa_agda_fixed_height" >}}
-												Make sure all links are consistent

Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>

											
										
										
											2024-08-09 06:48:26 -07:00
+								* {{< draftlink "The Fixed-Point Algorithm" "04_spa_agda_fixedpoint" >}}
-												Reference more future posts in SPA intro

Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>

											
										
										
											2024-08-18 18:05:55 -07:00
+								* {{< draftlink "Our Programming Language" "05_spa_agda_semantics" >}}
-												Start working on the control flow graphs post

Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>

											
										
										
											2024-11-13 19:57:08 -08:00
+								* {{< draftlink "Control Flow Graphs" "06_spa_agda_cfg" >}}
 								* {{< draftlink "Connecting Semantics and Control Flow Graphs" "07_spa_agda_semantics_and_cfg" >}}
-												Add a draft post on forward analysis

Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>

											
										
										
											2024-12-01 22:16:02 -08:00
+								* {{< draftlink "Forward Analysis" "08_spa_forward" >}}
 								* {{< draftlink "Verifying the Forward Analysis" "09_spa_verified_forward" >}}