Write a high-level introduction for the SPA series
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
This commit is contained in:
parent
c6e2ecb996
commit
5873c1ca96
|
@ -1,13 +1,13 @@
|
||||||
---
|
---
|
||||||
title: "Implementing and Verifying \"Static Program Analysis\" in Agda, Part 1: Lattices"
|
title: "Implementing and Verifying \"Static Program Analysis\" in Agda, Part 0: Intro"
|
||||||
series: "Static Program Analysis in Agda"
|
series: "Static Program Analysis in Agda"
|
||||||
date: 2024-03-11T14:23:03-07:00
|
date: 2024-04-12T14:23:03-07:00
|
||||||
draft: true
|
draft: true
|
||||||
---
|
---
|
||||||
|
|
||||||
Some years ago, when the Programming Languages research group at Oregon State
|
Some years ago, when the Programming Languages research group at Oregon State
|
||||||
University was discussing what to read, the [_Static Program Analysis_](https://cs.au.dk/~amoeller/spa/)
|
University was discussing what to read, the [_Static Program Analysis_](https://cs.au.dk/~amoeller/spa/)
|
||||||
lecture notes came up. The group didn't end up reading the lecture notes,
|
lecture notes came up. The group didn't end up reading the lecture notes,
|
||||||
but I did. As I was going through them, I noticed that they were quite rigorous:
|
but I did. As I was going through them, I noticed that they were quite rigorous:
|
||||||
the first several chapters cover a little bit of [lattice theory](https://en.wikipedia.org/wiki/Lattice_(order)),
|
the first several chapters cover a little bit of [lattice theory](https://en.wikipedia.org/wiki/Lattice_(order)),
|
||||||
and the subsequent analyses -- and the descriptions thereof -- are quite precise.
|
and the subsequent analyses -- and the descriptions thereof -- are quite precise.
|
||||||
|
@ -22,13 +22,70 @@ theory were formalizing their results in
|
||||||
Having [played]({{< relref "meaningfully-typechecking-a-language-in-idris" >}})
|
Having [played]({{< relref "meaningfully-typechecking-a-language-in-idris" >}})
|
||||||
[with]({{< relref "advent-of-code-in-coq" >}}) [dependent]({{< relref "coq_dawn_eval" >}})
|
[with]({{< relref "advent-of-code-in-coq" >}}) [dependent]({{< relref "coq_dawn_eval" >}})
|
||||||
[types]({{< relref "coq_palindrome" >}}) before, I was excited to try it out.
|
[types]({{< relref "coq_palindrome" >}}) before, I was excited to try it out.
|
||||||
Thus began my journey to formalize the (first few chapters of) _Static Program Analysis_
|
Thus began my journey to formalize (the first few chapters of) _Static Program Analysis_
|
||||||
in Agda.
|
in Agda.
|
||||||
|
|
||||||
In this post (and subsequent follow-ups), I hope to describe what I have so far.
|
In all, I built a framework for static analyses, based on a tool
|
||||||
Although not everything works, formalizing fixed-point algorithms (I'll get back
|
called _motone functions_. This framework can be used to implement and
|
||||||
to this) and getting even the most straightforward analysis working was no
|
reason about many different analyses (currently only a certain class called
|
||||||
joke, and I think is worthy of discussion. So, here goes.
|
_forward analyses_, but that's not hard limitation). Recently, I've proven
|
||||||
|
the _correctness_ of the algorithms my framework produces. Having reached
|
||||||
|
this milestone, I'd like to pause and talk about what I've done.
|
||||||
|
|
||||||
|
In subsequent posts in this series, will describe what I have so far.
|
||||||
|
It's not perfect, and some work is yet to be done; however, getting to
|
||||||
|
this point was no joke, and I think it's worth discussing. In all,
|
||||||
|
I envision three major topics to cover, each of which is likely going to make
|
||||||
|
for a post or two:
|
||||||
|
|
||||||
|
* __Lattices__: the analyses I'm reasoning about use an algebraic structure
|
||||||
|
called a _lattice_. This structure has certain properties that make it
|
||||||
|
amenable to describing degrees of "knowledge" about a program. In
|
||||||
|
lattice-based static program analysis, the various elements of the
|
||||||
|
lattice represent different facts or properties that we know about the
|
||||||
|
program in question; operations on the lattice help us combine these facts
|
||||||
|
and reason about them.
|
||||||
|
|
||||||
|
Interestingly, lattices can be made by combining other lattices in certain
|
||||||
|
ways. We can therefore use simpler lattices as building blocks to create
|
||||||
|
more complex ones, all while preserving the algebraic structure that
|
||||||
|
we need for program analysis.
|
||||||
|
|
||||||
|
* __The Fixed-Point Algorithm__: to analyze a program, we use information
|
||||||
|
that we already know to compute additional information. For instance,
|
||||||
|
we might use the fact that `1` is positive to compute the fact that
|
||||||
|
`1+1` is positive as well. Using that information, we can determine the
|
||||||
|
sign of `(1+1)+1`, and so on. In practice, this is often done by calling
|
||||||
|
some kind of "analyze" function over and over, each time getting closer to an
|
||||||
|
accurate characterization of the program's behavior. When the output of "analyze"
|
||||||
|
stops changing, we know we've found as much as we can find, and stop.
|
||||||
|
|
||||||
|
What does it mean for the output to stop changing? Roughly, that's when
|
||||||
|
the following equation holds: `knownInfo = analyze(knownInfo)`. In mathematics,
|
||||||
|
this is known as a [fixed point](https://mathworld.wolfram.com/FixedPoint.html).
|
||||||
|
Crucially, not all functions have fixed points; however, certain types of
|
||||||
|
functions on lattices do. The fixed-point algorithm is a way to compute these
|
||||||
|
points, and we will use this to drive our analyses.
|
||||||
|
|
||||||
|
* __Correctness__: putting together the work on lattices and the fixed-point
|
||||||
|
algorithm, we can implement a static program analyzer in Agda. However,
|
||||||
|
it's not hard to write an "analyze" function that has a fixed point but
|
||||||
|
produces an incorrect result. Thus, the next step is to prove that the results
|
||||||
|
of our analyzer accurately describe the program in question.
|
||||||
|
|
||||||
|
The interesting aspect of this step is that our program analyzer works
|
||||||
|
on [control-flow graphs](https://en.wikipedia.org/wiki/Control-flow_graph) (CFGs),
|
||||||
|
which are a relatively compiler-centric representation of programs. On the
|
||||||
|
other hand, what the language _actually does_ is defined by its
|
||||||
|
[semantics](https://en.wikipedia.org/wiki/Semantics_(computer_science)),
|
||||||
|
which is not at all compiler-centric. We need to connect these two, showing
|
||||||
|
that the CFGs we produce "make sense" for our language, and that given
|
||||||
|
CFGs that make sense, our analysis produces results that match the language's
|
||||||
|
execution.
|
||||||
|
|
||||||
|
{{< todo >}}
|
||||||
|
Once the posts are ready, link them here to add some kind of navigation.
|
||||||
|
{{< /todo >}}
|
||||||
|
|
||||||
### Monotone Frameworks
|
### Monotone Frameworks
|
||||||
|
|
||||||
|
|
Loading…
Reference in New Issue
Block a user