Write a high-level introduction for the SPA series

Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2024-05-12 13:49:21 -07:00
parent c6e2ecb996
commit 5873c1ca96
1 changed files with 65 additions and 8 deletions
--- a/content/blog/00_spa_agda.md
+++ b/content/blog/00_spa_agda.md
@@ -1,13 +1,13 @@
 ---
-title: "Implementing and Verifying \"Static Program Analysis\" in Agda, Part 1: Lattices"
+title: "Implementing and Verifying \"Static Program Analysis\" in Agda, Part 0: Intro"
 series: "Static Program Analysis in Agda"
-date: 2024-03-11T14:23:03-07:00
+date: 2024-04-12T14:23:03-07:00
 draft: true
 ---
 Some years ago, when the Programming Languages research group at Oregon State
 University was discussing what to read, the [_Static Program Analysis_](https://cs.au.dk/~amoeller/spa/)
-lecture notes came up. The group didn't end up reading the lecture notes, 
+lecture notes came up. The group didn't end up reading the lecture notes,
 but I did. As I was going through them, I noticed that they were quite rigorous:
 the first several chapters cover a little bit of [lattice theory](https://en.wikipedia.org/wiki/Lattice_(order)),
 and the subsequent analyses -- and the descriptions thereof -- are quite precise.
@@ -22,13 +22,70 @@ theory were formalizing their results in
 Having [played]({{< relref "meaningfully-typechecking-a-language-in-idris" >}})
 [with]({{< relref "advent-of-code-in-coq" >}}) [dependent]({{< relref "coq_dawn_eval" >}})
 [types]({{< relref "coq_palindrome" >}}) before, I was excited to try it out.
-Thus began my journey to formalize the (first few chapters of) _Static Program Analysis_
+Thus began my journey to formalize (the first few chapters of) _Static Program Analysis_
 in Agda.
-In this post (and subsequent follow-ups), I hope to describe what I have so far.
+In all, I built a framework for static analyses, based on a tool
-Although not everything works, formalizing fixed-point algorithms (I'll get back
+called _motone functions_. This framework can be used to implement and
-to this) and getting even the most straightforward analysis working was no
+reason about many different analyses (currently only a certain class called
-joke, and I think is worthy of discussion. So, here goes.
+_forward analyses_, but that's not hard limitation). Recently, I've proven
 the _correctness_ of the algorithms my framework produces. Having reached
 this milestone, I'd like to pause and talk about what I've done.
 In subsequent posts in this series, will describe what I have so far.
 It's not perfect, and some work is yet to be done; however, getting to
 this point was no joke, and I think it's worth discussing. In all,
 I envision three major topics to cover, each of which is likely going to make
 for a post or two:
 * __Lattices__: the analyses I'm reasoning about use an algebraic structure
  called a _lattice_. This structure has certain properties that make it
  amenable to describing degrees of "knowledge" about a program. In
  lattice-based static program analysis, the various elements of the
  lattice represent different facts or properties that we know about the
  program in question; operations on the lattice help us combine these facts
  and reason about them.
  Interestingly, lattices can be made by combining other lattices in certain
  ways. We can therefore use simpler lattices as building blocks to create
  more complex ones, all while preserving the algebraic structure that
  we need for program analysis.
 * __The Fixed-Point Algorithm__: to analyze a program, we use information
  that we already know to compute additional information. For instance,
  we might use the fact that `1` is positive to compute the fact that
  `1+1` is positive as well. Using that information, we can determine the
  sign of `(1+1)+1`, and so on. In practice, this is often done by calling
  some kind of "analyze" function over and over, each time getting closer to an
  accurate characterization of the program's behavior. When the output of "analyze"
  stops changing, we know we've found as much as we can find, and stop.
  What does it mean for the output to stop changing? Roughly, that's when
  the following equation holds: `knownInfo = analyze(knownInfo)`. In mathematics,
  this is known as a [fixed point](https://mathworld.wolfram.com/FixedPoint.html).
  Crucially, not all functions have fixed points; however, certain types of
  functions on lattices do. The fixed-point algorithm is a way to compute these
  points, and we will use this to drive our analyses.
 * __Correctness__: putting together the work on lattices and the fixed-point
  algorithm, we can implement a static program analyzer in Agda. However,
  it's not hard to write an "analyze" function that has a fixed point but
  produces an incorrect result. Thus, the next step is to prove that the results
  of our analyzer accurately describe the program in question.
  The interesting aspect of this step is that our program analyzer works
  on [control-flow graphs](https://en.wikipedia.org/wiki/Control-flow_graph) (CFGs),
  which are a relatively compiler-centric representation of programs. On the
  other hand, what the language _actually does_ is defined by its
  [semantics](https://en.wikipedia.org/wiki/Semantics_(computer_science)),
  which is not at all compiler-centric. We need to connect these two, showing
  that the CFGs we produce "make sense" for our language, and that given
  CFGs that make sense, our analysis produces results that match the language's
  execution.
 {{< todo >}}
 Once the posts are ready, link them here to add some kind of navigation.
 {{< /todo >}}
 ### Monotone Frameworks