blog-static/content/blog/03_spa_agda_fixed_height.md
Danila Fedorin 388c23c376 Write more on finite height lattices
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2024-07-05 12:09:29 -07:00

15 KiB
Raw Blame History

title series description date draft tags
Implementing and Verifying "Static Program Analysis" in Agda, Part 3: Lattices of Finite Height Static Program Analysis in Agda In this post, I describe the class of finite-height lattices, and prove that lattices we've alread seen are in that class 2024-05-30T19:37:01-07:00 true
Agda
Programming Languages

In the previous post, I introduced the class of finite-height lattices: lattices where chains made from elements and the less-than operator < can only be so long. As a first example, [natural numbers form a lattice]({{< relref "01_spa_agda_lattices#natural-numbers" >}}), but they are not a finite-height lattice; the following chain can be made infinitely long:

{{< latex >}} 0 < 1 < 2 < ... {{< /latex >}}

There isn't a "biggest natural number"! On the other hand, we've seen that our [sign lattice]({{< relref "01_spa_agda_lattices#sign-lattice" >}}) has a finite height; the longest chain we can make is three elements long; I showed one such chain (there are many chains of three elements) in [the previous post]({{< relref "02_spa_agda_combining_lattices#sign-three-elements" >}}), but here it is again:

{{< latex >}} \bot < + < \top {{< /latex >}}

It's also true that the [Cartesian product lattice (L_1 \times L_2)]({{< relref "02_spa_agda_combining_lattices#the-cartesian-product-lattice" >}}) has a finite height, as long as L_1 and L_2 are themselves finite-height lattices. In the specific case where both L_1 and L_2 are the sign lattice ((L_1 = L_2 = \text{Sign} )) we can observe that the longest chains have five elements. The following is one example:

{{< latex >}} (\bot, \bot) < (\bot, +) < (\bot, \top) < (+, \top) < (\top, \top) {{< /latex >}} {#sign-prod-chain}

The fact that L_1 and L_2 are themselves finite-height lattices is important; if either one of them is not, we can easily construct an infinite chain of the products. If we allowed L_2 to be natural numbers, we'd end up with infinite chains like this one: {#product-both-finite-height}

{{< latex >}} (\bot, 0) < (\bot, 1) < (\bot, 2) < ... {{< /latex >}}

Another lattice that has a finite height under certain conditions is [the map lattice]({{< relref "02_spa_agda_combining_lattices#the-map-lattice" >}}). The "under certain conditions" part is important; we can easily construct an infinite chain of map lattice elements in general:

{{< latex >}} { a : 1 } < { a : 1, b : 1 } < { a: 1, b: 1, c: 1 } < ... {{< /latex >}}

As long as we have infinite keys to choose from, we can always keep adding new keys to make bigger and bigger maps. But if we fix the keys in the map -- say, we use only a and b -- then suddenly our heights are once again fixed. In fact, for the two keys I just picked, one longest chain is remarkably similar to the product chain above.

{{< latex >}} {a: \bot, a: \bot} < {a: \bot, b: +} < {a: \bot, b: \top} < {a: +, b: \top} < {a: \top, b: \top} {{< /latex >}}

The class of finite-height lattices is important for static program analysis, because it ensures that out our analyses don't take infinite time. Though there's an intuitive connection ("finite lattices mean finite execution"), the details of why the former is needed for the latter are nuanced. We'll talk about them in a subsequent post.

In the meantime, let's dig deeper into the notion of finite height, and the Agda proofs of the properties I've introduced thus far.

Formalizing Finite Height

The formalization I settled on is quite similar to the informal description: a lattice has a finite height of length h if the longest chain of elements compared by (<) is exactly h. There's only a slight complication: we allow for equivalent-but-not-equal elements in lattices. For instance, for a map lattice, we don't care about the order of the keys: so long as two maps relate the same set of keys to the same respective values, we will consider them equal. This, however, is beyond the notion of Agda's propositional equality (_≡_). Thus, we we need to generalize the definition of a chain to support equivalences. I parameterize the Chain module in my code by an equivalence relation, as well as the comparison relation R, which we will set to < for our chains. The equivalence relation and R/< are expected to play together nicely (if a < b, and a is equivalent to c, then it should be the case that c < b).

{{< codelines "agda" "agda-spa/Chain.agda" 3 7 >}}

From there, the definition of the Chain data type is much like the definition of a vector, but indexed by the endpoints, and containing witnesses of R/< between its elements. The indexing allows for representing the type of chains between particular lattice elements, and serves to ensure concatenation and other operations don't merge disparate chains.

{{< codelines "agda" "agda-spa/Chain.agda" 19 21 >}}

In the done case, we create a single-element chain, which has no comparisons. In this case, the chain starts and stops at the same element (where "the same" is modulo our equivalence). The step case prepends a new comparison a1 < a2 to an existing chain; once again, we allow for the existing chain to start with a different-but-equivalent element a2'.

With that definition in hand, I define what it means for a type of chains between elements of the lattice A to have a maximum height; simply put, all chains must have length less than or equal to the maximum.

{{< codelines "agda" "agda-spa/Chain.agda" 38 39 >}}

Though Bounded specifies a bound on the length of chains, it doesn't specify the (lowest) bound. Specifically, if the chains can only have length three, they are bounded by both 3, 30, and 300. To claim a lowest bound, we need to show that a chain of that length actually exists (otherwise, we could take the previous natural number, and it would be a bound as well). Thus, I define the Height predicate to require that a chain of the desired height exists, and that this height bounds the length of all other chains.

{{< codelines "agda" "agda-spa/Chain.agda" 47 48 >}}

Finally, for a lattice to have a finite height, the type of chains formed by using its less-than operator needs to have that height (satisfy the Height h predicate). To avoid having to thread through the equivalence relation, congruence proof, and more, I define a specialized predicate for lattices specifically. I do so as a "method" in my IsLattice record.

{{< codelines "agda" "agda-spa/Lattice.agda" 153 180 "hl_lines = 27 28">}}

Thus, bringing the operators and other definitions of IsLattice into scope will also bring in the FixedHeight predicate.

Fixed Height of the "Above-Below" Lattice

We've already seen intuitive evidence that the sign lattice --- which is an instance of the ["above-below" lattice]({{< relref "01_spa_agda_lattices#the-above-below-lattice" >}}) --- has a fixed height. The reason is simple: we extended a set of incomparable elements with a single element that's greater, and a single element that's lower. We can't make chains out of incomparable elements (since we can't compare them using <); thus, we can only have one < from the new least element, and one < from the new greatest element.

The proof is a bit tedious, but not all that complicated. First, a few auxiliary helpers; feel free to read only the type signatures. They specify, respectively:

  1. That the bottom element \bot of the above-below lattice is less than any concrete value from the underlying set. For instance, in the sign lattice case, \bot < +.
  2. That \bot is the only element satisfying the first property; that is, any value strictly less than an element of the underlying set must be \bot.
  3. That the top element \top of the above-below lattice is greater than any concrete value of the underlying set. This is the dual of the first property.
  4. That, much like the bottom element is the only value strictly less than elements of the underlying set, the top element is the only value strictly greater.

{{< codelines "agda" "agda-spa/Lattice/AboveBelow.agda" 315 335 >}}

From there, we can construct an instance of the longest chain. Actually, there's a bit of a hang-up: what if the underlying set is empty? Concretely, what if there were no signs? Then we could only construct a chain with one comparison: \bot < \top. Instead of adding logic to conditionally specify the length, I simply require that the set is populated by requiring a witness

{{< codelines "agda" "agda-spa/Lattice/AboveBelow.agda" 85 85 >}}

I use this witness to construct the two-< chain.

{{< codelines "agda" "agda-spa/Lattice/AboveBelow.agda" 339 340 >}}

The proof that the length of two -- in terms of comparisons -- is the bound of all chains of AboveBelow elements requires systematically rejecting all longer chains. Informally, suppose you have a chain of three or more comparisons.

  1. If it starts with \top, you can't add any more elements since that's the greatest element (contradiction).
  2. If you start with an element of the underlying set, you could add another element, but it has to be the top element; after that, you can't add any more (contradiction).
  3. If you start with \bot, you could arrive at a chain of two comparisons, but you can't go beyond that (in three cases, each leading to contradictions).

{{< codelines "agda" "agda-spa/Lattice/AboveBelow.agda" 342 355 "hl_lines=8-14">}}

Thus, the above-below lattice has a length of two comparisons (or alternatively, three elements).

{{< codelines "agda" "agda-spa/Lattice/AboveBelow.agda" 357 358 >}}

And that's it.

Fixed Height of the Product Lattice

Now, for something less tedious. We saw above that for a product lattice to have a finite height, its constituent lattices must have a finite height. The proof was by contradiction (by constructing an infinitely long product chain given a single infinite lattice). As a result, we'll focus this section on products of two finite lattices A and B. Additionally, for the proofs in this section, I require element equivalence to be decidable.

{{< codelines "agda" "agda-spa/Lattice/Prod.agda" 115 117 >}}

Let's think about how we might go about constructing the longest chain in a product lattice. Let's start with some arbitrary element p_1 = (a_1, b_1). We need to find another value that isn't equal to p_1, because we'rebuilding chains of the less-than operator (<), and not the less-than-or-equal operator (\leq). As a result, we need to change either the first component, the second component, or both. If we're building "to the right" (adding bigger elements), the new components would need to be bigger. Suppose then that we came up with a_2 and b_2, with a_1 < a_2 and b_1 < b_2. We could then create a length-one chain:

{{< latex >}} (a_1, b_1) < (a_2, b_2) {{< /latex >}}

That works, but we can construct an even longer chain by increasing only one element at a time:

{{< latex >}} (a_1, b_1) < (a_1, b_2) < (a_2, b_2) {{< /latex >}}

We can apply this logic every time; the conclusion is that when building up a chain, we need to increase one element at a time. Then, how many times can we increase an element? Well, if lattice A has a height of two (comparisons), then we can take its lowest element, and increase it twice. Similarly, if lattice B has a height of three, starting at its lowest element, we can increase it three times. In all, when building a chain of A × B, we can increase an element five times.

This gives us a recipe for constructing the longest chain in the product lattice: take the longest chains of A and B, and start with the product of their lowest elements. Then, increase the elements one at a time according to the chains. The simplest way to do that might be to increase by all elements of the A chain, and then by all of the elements of the B chain (or the other way around). That's the strategy I took when constructing the (\text{Sign} \times \text{Sign}) chain above.

To formalize this notion, a few lemmas. First, given two chains where one starts with the same element another ends with, we can combine them into one long chain.

{{< codelines "agda" "agda-spa/Chain.agda" 31 33 >}}

More interestingly, given a chain of comparisons in one lattice, we are able to lift it into a chain in another lattice by applying a function to each element. This function must be monotonic, because it must not be able to reverse a < b such that f(b) < f(a). Moreover, this function should be injective, because if f(a) = f(b), then a chain a < b might be collapsed into f(a) \not< f(a), changing its length. Finally, the function needs to produce equivalent outputs when giving equivalent inputs. The result is the following lemma:

{{< codelines "agda" "agda-spa/Lattice.agda" 196 217 >}}

Given this, and two lattices of finite height, we construct the full product chain by lifting the A chain into the product via a \mapsto (a, \bot), lifting the B chain into the product via b \mapsto (\top, b), and concatenating the results. This works because the first chain ends with (\top, \bot), and the second starts with it.

{{< codelines "agda" "agda-spa/Lattice/Prod.agda" 177 179 >}}

This gets us the longest chain; what remains is to prove that this chain's length is the bound of all other changes. To do so, we need to work in the opposite direction; given a chain in the product lattice, we need to somehow reduce it to chains in lattices A and B, and leverage their finite height to complete the proof.

The key idea is that for every two consecutive elements in the product lattice chain, we know that at least one of their components must've increased. This increase had to come either from elements in lattice A or in lattice B. We can thus stick this increase into an A-chain or a B-chain, increasing its length. Since one of the chains grows with every consecutive pair, the number of consecutive pairs can't exceed the length of the A and B chains.

I implement this idea as an unzip function, which takes a product chain and produces two chains made from its increases. By the logic we've described, the length two chains has to bound the main one's. I give the signature below, and will put the implementation in a collapsible detail block. One last detail is that the need to decide which chain to grow --- and thus which element has increased --- is what introduces the need for decidable equality.

{{< codelines "agda" "agda-spa/Lattice/Prod.agda" 158 158 >}}

{{< codelines "agda" "agda-spa/Lattice/Prod.agda" 158 172 "" "(Click here for the implementation of unzip)" >}}

Having decomposed the product chain into constituent chains, we simply combine the facts that they have to be bounded by the height of the A and B lattices, as well as the fact that they bound the combined chain.

{{< codelines "agda" "agda-spa/Lattice/Prod.agda" 174 183 "hl_lines = 8-9" >}}

This completes the proof!

{{< todo >}} The rest of this. {{< /todo >}}