blog-static/content/blog/agda_is_pattern.md

170 lines
8.7 KiB
Markdown
Raw Permalink Normal View History

---
title: "The \"Is Something\" Pattern in Agda"
date: 2023-08-31T22:15:34-07:00
tags: ["Agda"]
description: "In this post, I talk about a pattern I've observed in the Agda standard library."
---
Agda is a functional programming language with a relatively Haskell-like syntax
and feature set, so coming into it, I relied on my past experiences with Haskell
to get things done. However, the languages are sufficiently different to leave
room for useful design patterns in Agda that can't be brought over from Haskell,
because they don't exist there. One such pattern will be the focus of this post;
it's relatively simple, but I came across it by reading the standard library code.
My hope is that by writing it down here, I can save someone the trouble of
recognizing it and understanding its purpose. The pattern is "unique" to Agda
(in the sense that it isn't present in Haskell) because it relies on dependent types.
In my head, I call this the `IsSomething` pattern. Before I introduce it, let
me try to provide some motivation. I should say that this may not be the
only motivation for this pattern; it's just how I arrived at seeing its value.
### Type Classes for Related Operations
Suppose you wanted to define a type class for "a type that has an associative
binary operation". In Haskell, this is the famous `Semigroup` class. Here's
a definition I lifted from the [Haskell docs](https://hackage.haskell.org/package/base-4.18.0.0/docs/src/GHC.Base.html#Semigroup):
```Haskell
class Semigroup a where
(<>) :: a -> a -> a
a <> b = sconcat (a :| [ b ])
```
It says that a type `a` is a semigroup if it has a binary operation, which Haskell
calls `(<>)`. The language isn't expressive enough to encode the associative
property of this binary operation, but we won't hold it against Haskell: not
every language needs dependent types or SMT-backed refinement types. If
we translated this definition into Agda (and encoded the associativity constraint),
we'd end up with something like this:
{{< codelines "Agda" "agda-issomething/example.agda" 9 13 >}}
So far, so good. Now, let's also encode a more specific sort of type-with-binary-operation:
one where the operation is associative as before, but also has an identity element.
In Haskell, we can write this as:
```Haskell
class Semigroup a => Monoid a where
mempty :: a
```
This brings in all the requirements of `Semigroup`, with one additional one:
an element `mempty`, which is intended to be the aforementioned identity element for `(<>)`.
Once again, we can't encode the "identity element" property; I say this only
to explain the lack of any additional code in the preceding snippet.
In Agda, there isn't really a special syntax for "superclass"; we just use a field.
The "transliterated" implementation is as follows:
{{< codelines "Agda" "agda-issomething/example.agda" 15 24 >}}
This code might require a little bit of explanation. Like I said, the base class
is brought in as a field, `semigroup`. Then, every field of `semigroup`
is also made available within `Monoid`, as well as to users of `Monoid`, by
using an `open public` directive. The subsequent fields mimic the Haskell
definition amended with proofs of identity.
We get our first sign of awkwardness here. We can't refer to the binary operation
very easily; it's nested inside of `semigroup`, and we have to access its fields
to get ahold of `(∙)`. It's not too bad at all -- it just cost us an extra line.
However, the bookkeeping of what-operation-is-where gets frustrating quickly.
I will demonstrate the frustrations in one final example. I will admit to it
being contrived: I am trying to avoid introducing too many definitions and concepts
just for the sake of a motivating case. Suppose you are trying to specify
a type in which the binary operation has _two_ properties (e.g. it's a monoid
_and_ something else). Since the only two type classes I have so far are
`Monoid` and `Semigroup`, I will use those; note that in this particular instance,
using both is a contrivance, since one contains the latter.
{{< codelines "Agda" "agda-issomething/example.agda" 26 32 >}}
However, there's a problem: nothing in the above definition ensures that the
binary operations of the two fields are the same! As far as Agda is concerned
(as one would quickly come to realize by trying a few proofs with the code),
the two operations are completely separate. One could perhaps add an equality
constraint:
{{< codelines "Agda" "agda-issomething/example.agda" 26 34 >}}
However, this will get tedious quickly. Proofs will need to leverage rewrites
(via the `rewrite` keyword, or via `cong`) to change one of the binary operations
into the other. As you build up more and more complex algebraic structures,
in which the various operations are related in nontrivial ways, you start to
look for other approaches. That's where the `IsSomething` pattern comes in.
### The `IsSomething` Pattern: Parameterizing By Operations
The pain point of the original approach is data flow. The way it's written,
data (operations, elements, etc.) flows from the fields of a record to the record
itself: `Monoid` has to _read_ the `(∙)` operation from `Semigroup`.
The more fields you add, the more reading and reconciliation you have to do.
It would be better if the data flowed the other direction: from `Monoid` to
`Semigroup`. `Monoid` could say, "here's a binary operation; it must satisfy
these constraints, in addition to having an identity element". To _provide_
the binary operation to a field, we use type application; this would look
something like this:
{{< codelines "Agda" "agda-issomething/example.agda" 42 42 >}}
Here's the part that's not possible in Haskell: we have a `record`, called `IsSemigroup`,
that's parameterized by a _value_ -- the binary operation! This new record
is quite similar to our original `Semigroup`, except that it doesn't need a field
for `(∙)`: it gets that from outside. Note the additional parameter in the
`record` header:
{{< codelines "Agda" "agda-issomething/example.agda" 37 38 >}}
We can define an `IsMonoid` similarly:
{{< codelines "Agda" "agda-issomething/example.agda" 40 47 >}}
We want to make an "is" version for each algebraic property; this way,
if we want to use "monoid" as part of some other structure, we can pass it
the required binary operation the same way we passed it to `IsSemigroup`.
Finally, the contrived motivating example from above becomes:
{{< codelines "Agda" "agda-issomething/example.agda" 49 55 >}}
Since we passed the same operation to both `IsMonoid` and `IsSemigroup`, we
know that we really do have a _single_ operation with _both_ properties,
no strange equality witnesses or anything necessary.
Of course, these new records are not quite equivalent to our original ones. They
need to be passed a binary operation; a "complete" package should include the
binary operation _in addition_ to its properties encoded as `IsSemigroup` or
`IsMonoid`. Such a complete package would be more-or-less equivalent to our
original `Semigroup` and `Monoid` instances. Here's what that would look like:
{{< codelines "Agda" "agda-issomething/example.agda" 57 66 >}}
Agda calls records that include both the operation and its `IsSomething` record
_bundles_ (see [`Algebra.Bundles`](https://agda.github.io/agda-stdlib/Algebra.Bundles.html), for example).
Notice that the bundles don't rely on other bundles to define properties; that
would lead right back to the "bottom-up" data flow in which a parent record has
to access the operations and values stored in its fields. Hower, bundles do
sometimes "contain" (via a definition, not a field) smaller bundles, in case,
for example, you need _only_ a semigroup, but you have a monoid.
### Bonus: Using Parameterized Modules to Avoid Repetitive Arguments
One annoying thing about our definitions above is that we had to accept our
binary operation, and sometimes the zero element, as an argument to each one,
and to thread it through to all the fields that require it. Agda has a nice
mechanism to help alleviate some of this repetition: [parameterized modules](https://agda.readthedocs.io/en/latest/language/module-system.html#parameterised-modules).
We can define a _whole module_ that accepts the binary operation as an argument;
it will be implicitly passed as an argument to all of the definitions within.
Thus, our entire `IsMonoid`, `IsSemigroup`, and `IsContrivedExample` code could
look like this:
{{< codelines "Agda" "agda-issomething/example.agda" 68 87 >}}
The more `IsSomething` records you declare, the more effective this trick becomes.
### Conclusion
That's all I have! The pattern I've described shows up all over the Agda
standard library; the example that made me come across it was
the [`Algebra.Structures` module](https://agda.github.io/agda-stdlib/Algebra.Structures.html).
I hope you find it useful.
Happy (dependently typed) programming!