Write a high-level introduction for the SPA series
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
This commit is contained in:
parent
c6e2ecb996
commit
5873c1ca96
@ -1,13 +1,13 @@
|
||||
---
|
||||
title: "Implementing and Verifying \"Static Program Analysis\" in Agda, Part 1: Lattices"
|
||||
title: "Implementing and Verifying \"Static Program Analysis\" in Agda, Part 0: Intro"
|
||||
series: "Static Program Analysis in Agda"
|
||||
date: 2024-03-11T14:23:03-07:00
|
||||
date: 2024-04-12T14:23:03-07:00
|
||||
draft: true
|
||||
---
|
||||
|
||||
Some years ago, when the Programming Languages research group at Oregon State
|
||||
University was discussing what to read, the [_Static Program Analysis_](https://cs.au.dk/~amoeller/spa/)
|
||||
lecture notes came up. The group didn't end up reading the lecture notes,
|
||||
lecture notes came up. The group didn't end up reading the lecture notes,
|
||||
but I did. As I was going through them, I noticed that they were quite rigorous:
|
||||
the first several chapters cover a little bit of [lattice theory](https://en.wikipedia.org/wiki/Lattice_(order)),
|
||||
and the subsequent analyses -- and the descriptions thereof -- are quite precise.
|
||||
@ -22,13 +22,70 @@ theory were formalizing their results in
|
||||
Having [played]({{< relref "meaningfully-typechecking-a-language-in-idris" >}})
|
||||
[with]({{< relref "advent-of-code-in-coq" >}}) [dependent]({{< relref "coq_dawn_eval" >}})
|
||||
[types]({{< relref "coq_palindrome" >}}) before, I was excited to try it out.
|
||||
Thus began my journey to formalize the (first few chapters of) _Static Program Analysis_
|
||||
Thus began my journey to formalize (the first few chapters of) _Static Program Analysis_
|
||||
in Agda.
|
||||
|
||||
In this post (and subsequent follow-ups), I hope to describe what I have so far.
|
||||
Although not everything works, formalizing fixed-point algorithms (I'll get back
|
||||
to this) and getting even the most straightforward analysis working was no
|
||||
joke, and I think is worthy of discussion. So, here goes.
|
||||
In all, I built a framework for static analyses, based on a tool
|
||||
called _motone functions_. This framework can be used to implement and
|
||||
reason about many different analyses (currently only a certain class called
|
||||
_forward analyses_, but that's not hard limitation). Recently, I've proven
|
||||
the _correctness_ of the algorithms my framework produces. Having reached
|
||||
this milestone, I'd like to pause and talk about what I've done.
|
||||
|
||||
In subsequent posts in this series, will describe what I have so far.
|
||||
It's not perfect, and some work is yet to be done; however, getting to
|
||||
this point was no joke, and I think it's worth discussing. In all,
|
||||
I envision three major topics to cover, each of which is likely going to make
|
||||
for a post or two:
|
||||
|
||||
* __Lattices__: the analyses I'm reasoning about use an algebraic structure
|
||||
called a _lattice_. This structure has certain properties that make it
|
||||
amenable to describing degrees of "knowledge" about a program. In
|
||||
lattice-based static program analysis, the various elements of the
|
||||
lattice represent different facts or properties that we know about the
|
||||
program in question; operations on the lattice help us combine these facts
|
||||
and reason about them.
|
||||
|
||||
Interestingly, lattices can be made by combining other lattices in certain
|
||||
ways. We can therefore use simpler lattices as building blocks to create
|
||||
more complex ones, all while preserving the algebraic structure that
|
||||
we need for program analysis.
|
||||
|
||||
* __The Fixed-Point Algorithm__: to analyze a program, we use information
|
||||
that we already know to compute additional information. For instance,
|
||||
we might use the fact that `1` is positive to compute the fact that
|
||||
`1+1` is positive as well. Using that information, we can determine the
|
||||
sign of `(1+1)+1`, and so on. In practice, this is often done by calling
|
||||
some kind of "analyze" function over and over, each time getting closer to an
|
||||
accurate characterization of the program's behavior. When the output of "analyze"
|
||||
stops changing, we know we've found as much as we can find, and stop.
|
||||
|
||||
What does it mean for the output to stop changing? Roughly, that's when
|
||||
the following equation holds: `knownInfo = analyze(knownInfo)`. In mathematics,
|
||||
this is known as a [fixed point](https://mathworld.wolfram.com/FixedPoint.html).
|
||||
Crucially, not all functions have fixed points; however, certain types of
|
||||
functions on lattices do. The fixed-point algorithm is a way to compute these
|
||||
points, and we will use this to drive our analyses.
|
||||
|
||||
* __Correctness__: putting together the work on lattices and the fixed-point
|
||||
algorithm, we can implement a static program analyzer in Agda. However,
|
||||
it's not hard to write an "analyze" function that has a fixed point but
|
||||
produces an incorrect result. Thus, the next step is to prove that the results
|
||||
of our analyzer accurately describe the program in question.
|
||||
|
||||
The interesting aspect of this step is that our program analyzer works
|
||||
on [control-flow graphs](https://en.wikipedia.org/wiki/Control-flow_graph) (CFGs),
|
||||
which are a relatively compiler-centric representation of programs. On the
|
||||
other hand, what the language _actually does_ is defined by its
|
||||
[semantics](https://en.wikipedia.org/wiki/Semantics_(computer_science)),
|
||||
which is not at all compiler-centric. We need to connect these two, showing
|
||||
that the CFGs we produce "make sense" for our language, and that given
|
||||
CFGs that make sense, our analysis produces results that match the language's
|
||||
execution.
|
||||
|
||||
{{< todo >}}
|
||||
Once the posts are ready, link them here to add some kind of navigation.
|
||||
{{< /todo >}}
|
||||
|
||||
### Monotone Frameworks
|
||||
|
||||
|
Loading…
Reference in New Issue
Block a user