Write a high-level introduction for the SPA series

Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2024-05-12 13:49:21 -07:00
parent c6e2ecb996
commit 5873c1ca96
1 changed files with 65 additions and 8 deletions
--- a/content/blog/00_spa_agda.md
+++ b/content/blog/00_spa_agda.md
@@ -1,13 +1,13 @@
 ---
-title: "Implementing and Verifying \"Static Program Analysis\" in Agda, Part 1: Lattices"
+title: "Implementing and Verifying \"Static Program Analysis\" in Agda, Part 0: Intro"
 series: "Static Program Analysis in Agda"
-date: 2024-03-11T14:23:03-07:00
+date: 2024-04-12T14:23:03-07:00
 draft: true
 ---

 Some years ago, when the Programming Languages research group at Oregon State
 University was discussing what to read, the [_Static Program Analysis_](https://cs.au.dk/~amoeller/spa/)
-lecture notes came up. The group didn't end up reading the lecture notes, 
+lecture notes came up. The group didn't end up reading the lecture notes,
 but I did. As I was going through them, I noticed that they were quite rigorous:
 the first several chapters cover a little bit of [lattice theory](https://en.wikipedia.org/wiki/Lattice_(order)),
 and the subsequent analyses -- and the descriptions thereof -- are quite precise.
@@ -22,13 +22,70 @@ theory were formalizing their results in
 Having [played]({{< relref "meaningfully-typechecking-a-language-in-idris" >}})
 [with]({{< relref "advent-of-code-in-coq" >}}) [dependent]({{< relref "coq_dawn_eval" >}})
 [types]({{< relref "coq_palindrome" >}}) before, I was excited to try it out.
-Thus began my journey to formalize the (first few chapters of) _Static Program Analysis_
+Thus began my journey to formalize (the first few chapters of) _Static Program Analysis_
 in Agda.

-In this post (and subsequent follow-ups), I hope to describe what I have so far.
-Although not everything works, formalizing fixed-point algorithms (I'll get back
-to this) and getting even the most straightforward analysis working was no
-joke, and I think is worthy of discussion. So, here goes.
+In all, I built a framework for static analyses, based on a tool
+called _motone functions_. This framework can be used to implement and
+reason about many different analyses (currently only a certain class called
+_forward analyses_, but that's not hard limitation). Recently, I've proven
+the _correctness_ of the algorithms my framework produces. Having reached
+this milestone, I'd like to pause and talk about what I've done.
+
+In subsequent posts in this series, will describe what I have so far.
+It's not perfect, and some work is yet to be done; however, getting to
+this point was no joke, and I think it's worth discussing. In all,
+I envision three major topics to cover, each of which is likely going to make
+for a post or two:
+
+* __Lattices__: the analyses I'm reasoning about use an algebraic structure
+  called a _lattice_. This structure has certain properties that make it
+  amenable to describing degrees of "knowledge" about a program. In
+  lattice-based static program analysis, the various elements of the
+  lattice represent different facts or properties that we know about the
+  program in question; operations on the lattice help us combine these facts
+  and reason about them.
+
+  Interestingly, lattices can be made by combining other lattices in certain
+  ways. We can therefore use simpler lattices as building blocks to create
+  more complex ones, all while preserving the algebraic structure that
+  we need for program analysis.
+
+* __The Fixed-Point Algorithm__: to analyze a program, we use information
+  that we already know to compute additional information. For instance,
+  we might use the fact that `1` is positive to compute the fact that
+  `1+1` is positive as well. Using that information, we can determine the
+  sign of `(1+1)+1`, and so on. In practice, this is often done by calling
+  some kind of "analyze" function over and over, each time getting closer to an
+  accurate characterization of the program's behavior. When the output of "analyze"
+  stops changing, we know we've found as much as we can find, and stop.
+
+  What does it mean for the output to stop changing? Roughly, that's when
+  the following equation holds: `knownInfo = analyze(knownInfo)`. In mathematics,
+  this is known as a [fixed point](https://mathworld.wolfram.com/FixedPoint.html).
+  Crucially, not all functions have fixed points; however, certain types of
+  functions on lattices do. The fixed-point algorithm is a way to compute these
+  points, and we will use this to drive our analyses.
+
+* __Correctness__: putting together the work on lattices and the fixed-point
+  algorithm, we can implement a static program analyzer in Agda. However,
+  it's not hard to write an "analyze" function that has a fixed point but
+  produces an incorrect result. Thus, the next step is to prove that the results
+  of our analyzer accurately describe the program in question.
+
+  The interesting aspect of this step is that our program analyzer works
+  on [control-flow graphs](https://en.wikipedia.org/wiki/Control-flow_graph) (CFGs),
+  which are a relatively compiler-centric representation of programs. On the
+  other hand, what the language _actually does_ is defined by its
+  [semantics](https://en.wikipedia.org/wiki/Semantics_(computer_science)),
+  which is not at all compiler-centric. We need to connect these two, showing
+  that the CFGs we produce "make sense" for our language, and that given
+  CFGs that make sense, our analysis produces results that match the language's
+  execution.
+
+{{< todo >}}
+Once the posts are ready, link them here to add some kind of navigation.
+{{< /todo >}}

 ### Monotone Frameworks