diff --git a/content/blog/search_polynomials.md b/content/blog/search_polynomials.md index c6ec303..3af9a26 100644 --- a/content/blog/search_polynomials.md +++ b/content/blog/search_polynomials.md @@ -5,40 +5,53 @@ draft: true tags: ["Mathematics"] --- -Suppose that you're trying to get from city A to city B, and then from city B -to city C. Also suppose that your trips are measured in one-hour intervals, and -that trips of equal duration are considered equivalent. -Given possible routes from A to B, and then given more routes from B to C, what -are the possible routes from A to C you can build up? +I read a really neat paper some time ago, and I've been wanting to write about +it ever since. The paper is called [Algebras for Weighted Search](https://dl.acm.org/doi/pdf/10.1145/3473577), +and it is a tad too deep to dive into in a blog article -- readers of ICFP are +rarely the target audience on this site. However, one particular insight I +gleaned from the paper merits additional discussion and demonstration. I'm +going to do that here. -We can try with an example. Maybe there are two routes from A to B that take -two hours each, and one "quick" trip that takes only an hour. On top of this, -there's one three-hour trip from B to C, and one two-hour trip. Given these -building blocks, the list of possible trips from A to C is as follows. +We can start with something concrete. Suppose that you're trying to get from +city A to city B, and then from city B to city C. Also suppose that your +trips are measured in one-hour intervals, and that trips of equal duration are +considered equivalent. Given possible routes from A to B, and then given more +routes from B to C, what are the possible routes from A to C you can build up? 
-{{< latex >}} -\begin{aligned} - \text{two}\ 2h\ \text{trips} \rightarrow \text{one}\ 3h\ \text{trip} = \text{two}\ 5h\ \text{trips}\\ - \text{two}\ 2h\ \text{trips} \rightarrow \text{one}\ 2h\ \text{trip} = \text{two}\ 4h\ \text{trips}\\ - \text{one}\ 1h\ \text{trip} \rightarrow \text{one}\ 3h\ \text{trip} = \text{one}\ 4h\ \text{trips}\\ - \text{one}\ 1h\ \text{trip} \rightarrow \text{one}\ 2h\ \text{trip} = \text{two}\ 3h\ \text{trips}\\ - \textbf{total:}\ \text{two}\ 5h\ \text{trips}, \text{three}\ 4h\ \text{trips}, \text{one}\ 3h\ \text{trip} -\end{aligned} -{{< /latex >}} +In many cases, starting with an example helps build intuition. Maybe there +are two routes from A to B that take two hours each, and one "quick" trip +that takes only an hour. On top of this, there's one three-hour trip from B +to C, and one two-hour trip. Given these building blocks, the list of +possible trips from A to C is as follows. -Does this look a little bit familiar? We're combining every length of trips -of A to B with every length of trips from B to C, and then totaling them up. -In other words, we're multiplying two binomials! +1. Two two-hour trips from A to B, followed up by the three-hour trip from B to +C. +2. Two two-hour trips from A to B, followed by the shorter two-hour trip from B +to C. +3. One one-hour trip from A to B, followed by the three-hour trip from B to C. +4. One one-hour trip from A to B, followed by the shorter two-hour trip from B to C. + +In the above, to figure out the various ways of getting from A to C, we had to +examine all pairings of A-to-B routes with B-to-C routes. But then, multiple +pairings end up having the same total length: the second and third bullet +points both describe trips that take four hours. Thus, to give +our final report, we need to "combine like terms" - add up the trips from +the two matching bullet points, ending up with a total of three four-hour trips. + +Does this feel a little bit familiar? 
To me, this bears a rather striking +resemblance to an operation we've seen in algebra class: we're multiplying +two binomials! Here's the corresponding multiplication: {{< latex >}} \left(2x^2 + x\right)\left(x^3+x^2\right) = 2x^5 + 2x^4 + x^4 + x^3 = \underline{2x^5+3x^4+x^3} {{< /latex >}} -In fact, they don't have to be binomials. We can represent any combination -of trips of various lengths as a polynomial. Each term \\(ax^n\\) represents -\\(a\\) trips of length \\(n\\). As we just saw, multiplying two polynomials -corresponds to "sequencing" the trips they represent -- matching each trip in -one with each of the trips in the other, and totaling them up. +It's not just binomials that correspond to our combining paths between cities. +We can represent any combination of trips of various lengths as a polynomial. +Each term \\(ax^n\\) represents \\(a\\) trips of length \\(n\\). As we just +saw, multiplying two polynomials corresponds to "sequencing" the trips they +represent -- matching each trip in one with each of the trips in the other, +and totaling them up. What about adding polynomials, what does that correspond to? The answer there is actually quite simple: if two polynomials both represent (distinct) lists of @@ -46,10 +59,10 @@ trips from A to B, then adding them just combines the list. If I know one trip that takes two hours (\\(x^2\\)) and someone else knows a shortcut (\\(x\\\)), then we can combine that knowledge (\\(x^2+x\\)). -Well, that's a neat little thing, and pretty quick to demonstrate, too. But -we can push this observation a bit further. To generalize what we've already -seen, however, we'll need to figure out "the bare minimum" of what we need to -make polynomial multiplication work as we'd expect. +Well, that's a neat little thing. But we can push this observation a bit +further. 
To generalize what we've already seen, however, we'll need to +figure out "the bare minimum" of what we need to make polynomial +multiplication work as we'd expect. ### Polynomials over Semirings Let's watch what happens when we multiply two binomials, paying really close @@ -73,11 +86,10 @@ front, and a \\(-x\\) is at the very back. We use the fact that addition is _commutative_ (\\(a+b=b+a\\)) and _associative_ (\\(a+(b+c)=(a+b)+c\\)) to rearrange the equation, grouping the \\(x\\) and its negation together. This gives us \\((1-1)x=0x=0\\). That last step is important: we've used the fact -that multiplication by zero gives zero. We didn't use it in this example, -but another important property we want is for multiplication to be associative, -too. +that multiplication by zero gives zero. Another important property (though +we didn't use it here) is that multiplication has to be associative, too. -So, what if we didn't use numbers, but rather anything _thing_ with two +So, what if we didn't use numbers, but rather any _thing_ with two operations, one kind of like \\((\\times)\\) and one kind of like \\((+)\\)? As long as these operations satisfy the properties we have used so far, we should be able to create polynomials using them, and do this same sort of @@ -197,7 +209,34 @@ them out gives: And that's right; if it's possible to get from A to B in either two hours or one hour, and then from B to C in either three hours or two hours, then it's possible to get from A to C in either five, four, or three hours. 
In a -way, polynomials like this give us _less_ information than our original ones +way, polynomials like this give us +{{< sidenote "right" "homomorphism-note" "less information than our original ones" >}} +In fact, we can construct a semiring homomorphism (kind of like a +ring homomorphism, +but for semirings) from \(\mathbb{N}[x]\) to \(\mathbb{B}[x]\) as follows: + +{{< latex >}} + \sum_{i=0}^n a_ix^i \mapsto \sum_{i=0}^n \text{clamp}(a_i)x^i +{{< /latex >}} + +Where the \(\text{clamp}\) function checks if its argument is non-zero. +In the case of city path search, \(\text{clamp}\) asks the question +"are there any routes at all?". + +{{< latex >}} +\text{clamp}(n) = \begin{cases} + \text{false} & n = 0 \\ + \text{true} & n > 0 +\end{cases} +{{< /latex >}} + +We can't construct the inverse of the above homomorphism (a mapping +that would undo our clamping, and take polynomials in \(\mathbb{B}[x]\) to +\(\mathbb{N}[x]\)). This fact gives us a more "mathematical" confirmation +that we lost information, rather than gained it, by switching to +boolean polynomials: we can always recover a boolean polynomial from the +natural number one, but not the other way around. +{{< /sidenote >}} (which were \\(\\mathbb{N}[x]\\), polynomials over natural numbers \\(\\mathbb{N} = \\{ 0, 1, 2, ... \\}\\)), so it's unclear why we'd prefer them. However, we're just warming up - there are more interesting semirings for us to consider! @@ -228,17 +267,17 @@ the letter \\(\\pi\\) to denote a path, this means the following equation: {{< /latex >}} {{< sidenote "right" "paths-monoid-note" "So those are paths." >}} -In fact, if you clicked through the +Actually, if you clicked through the monoid link earlier, you might be interested to know that paths as defined here form a monoid with concatenation \(\rightarrow\) and the empty path \(\circ\) as a unit. 
{{< /sidenote >}} Paths alone, though, aren't enough for our polynomials; we're tracking -different _ways_ to get from one place to another. This is an excellent +different ways to get from one place to another. This is an excellent use case for sets! -Our next semiring will be that of _sets of paths_. Some elements +Our next semiring will be that of _sets of paths_. Some example elements of this semiring are \\(\\varnothing\\), also known as the empty set, \\(\\{\\circ\\}\\), the set containing only the empty path, and the set containing a path via the highway, and another path via the suburbs: @@ -248,7 +287,8 @@ containing a path via the highway, and another path via the suburbs: {{< /latex >}} So what are the addition and multiplication on sets of paths? Addition -is the easier one: it's just the union of sets: +is the easier one: it's just the union of sets (the "triangle equal sign" +symbol means "defined as"): {{< latex >}} A + B \triangleq A \cup B @@ -283,7 +323,7 @@ A \times (B \times C) & = & \{ a \rightarrow (b \rightarrow c)\ |\ a \in A, b \i {{< /latex >}} What's the multiplicative identity? Well, since multiplication concatenates -all the combination of paths from two sets, we could try making a set of +all the combinations of paths from two sets, we could try making a set of elements that don't do anything when concatenating. Sound familiar? It should, that's \\(\\circ\\), the empty path element! We thus define our multiplicative identity as \\(\\{\\circ\\}\\), and verify that it is indeed the identity: