Edit and improve the polynomial search post

Danila Fedorin 2022-10-26 18:52:24 -07:00
parent 0ade4b2efb
commit c7dc8f0105


@@ -5,40 +5,53 @@ draft: true
tags: ["Mathematics"]
---
Suppose that you're trying to get from city A to city B, and then from city B
to city C. Also suppose that your trips are measured in one-hour intervals, and
that trips of equal duration are considered equivalent.
Given possible routes from A to B, and then given more routes from B to C, what
are the possible routes from A to C you can build up?
I read a really neat paper some time ago, and I've been wanting to write about
it ever since. The paper is called [Algebras for Weighted Search](https://dl.acm.org/doi/pdf/10.1145/3473577),
and it is a tad too deep to dive into in a blog article -- readers of ICFP are
rarely the target audience on this site. However, one particular insight I
gleaned from the paper merits additional discussion and demonstration. I'm
going to do that here.
We can try with an example. Maybe there are two routes from A to B that take
two hours each, and one "quick" trip that takes only an hour. On top of this,
there's one three-hour trip from B to C, and one two-hour trip. Given these
building blocks, the list of possible trips from A to C is as follows.
We can start with something concrete. Suppose that you're trying to get from city A
to city B, and then from city B to city C. Also suppose that your trips are
measured in one-hour intervals, and that trips of equal duration are
considered equivalent. Given possible routes from A to B, and then given more
routes from B to C, what are the possible routes from A to C you can build up?
{{< latex >}}
\begin{aligned}
\text{two}\ 2h\ \text{trips} \rightarrow \text{one}\ 3h\ \text{trip} = \text{two}\ 5h\ \text{trips}\\
\text{two}\ 2h\ \text{trips} \rightarrow \text{one}\ 2h\ \text{trip} = \text{two}\ 4h\ \text{trips}\\
\text{one}\ 1h\ \text{trip} \rightarrow \text{one}\ 3h\ \text{trip} = \text{one}\ 4h\ \text{trip}\\
\text{one}\ 1h\ \text{trip} \rightarrow \text{one}\ 2h\ \text{trip} = \text{one}\ 3h\ \text{trip}\\
\textbf{total:}\ \text{two}\ 5h\ \text{trips}, \text{three}\ 4h\ \text{trips}, \text{one}\ 3h\ \text{trip}
\end{aligned}
{{< /latex >}}
In many cases, starting with an example helps build intuition. Maybe there
are two routes from A to B that take two hours each, and one "quick" trip
that takes only an hour. On top of this, there's one three-hour trip from B
to C, and one two-hour trip. Given these building blocks, the list of
possible trips from A to C is as follows.
Does this look a little bit familiar? We're combining every length of trips
of A to B with every length of trips from B to C, and then totaling them up.
In other words, we're multiplying two binomials!
1. Two two-hour trips from A to B, followed up by the three-hour trip from B to
C.
2. Two two-hour trips from A to B, followed by the shorter two-hour trip from B
to C.
3. One one-hour trip from A to B, followed by the three-hour trip from B to C.
4. One one-hour trip from A to B, followed by the shorter two-hour trip from B to C.
In the above, to figure out the various ways of getting from A to C, we had to
examine all pairings of A-to-B routes with B-to-C routes. But then, multiple
pairings end up having the same total length: the second and third bullet
points both describe trips that take four hours. Thus, to give
our final report, we need to "combine like terms" - add up the trips from
the two matching bullet points, ending up with a total of three four-hour trips.
Does this feel a little bit familiar? To me, this bears a rather striking
resemblance to an operation we've seen in algebra class: we're multiplying
two binomials! Here's the corresponding multiplication:
{{< latex >}}
\left(2x^2 + x\right)\left(x^3+x^2\right) = 2x^5 + 2x^4 + x^4 + x^3 = \underline{2x^5+3x^4+x^3}
{{< /latex >}}
In fact, they don't have to be binomials. We can represent any combination
of trips of various lengths as a polynomial. Each term \\(ax^n\\) represents
\\(a\\) trips of length \\(n\\). As we just saw, multiplying two polynomials
corresponds to "sequencing" the trips they represent -- matching each trip in
one with each of the trips in the other, and totaling them up.
It's not just binomials that correspond to our combining paths between cities.
We can represent any combination of trips of various lengths as a polynomial.
Each term \\(ax^n\\) represents \\(a\\) trips of length \\(n\\). As we just
saw, multiplying two polynomials corresponds to "sequencing" the trips they
represent -- matching each trip in one with each of the trips in the other,
and totaling them up.
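
To make the correspondence concrete, here is a minimal Python sketch of the multiplication above. The dictionary representation (exponent mapped to coefficient) and the names `poly_mul`, `a_to_b`, and `b_to_c` are my own choices for illustration, not something taken from the paper.

```python
from collections import defaultdict

def poly_mul(p, q):
    """Multiply two polynomials stored as {exponent: coefficient} dicts."""
    result = defaultdict(int)
    for n, a in p.items():
        for m, b in q.items():
            # Sequencing an n-hour trip with an m-hour trip takes n + m hours,
            # and a choices followed by b choices give a * b combined trips.
            # Accumulating with += is the "combine like terms" step.
            result[n + m] += a * b
    return dict(result)

a_to_b = {2: 2, 1: 1}  # 2x^2 + x: two two-hour trips, one one-hour trip
b_to_c = {3: 1, 2: 1}  # x^3 + x^2: one three-hour trip, one two-hour trip

a_to_c = poly_mul(a_to_b, b_to_c)
print(a_to_c)  # two 5h trips, three 4h trips, one 3h trip
```

The nested loop is exactly the "examine all pairings" step from the travel example.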
What about adding polynomials, what does that correspond to? The answer there
is actually quite simple: if two polynomials both represent (distinct) lists of
@@ -46,10 +59,10 @@ trips from A to B, then adding them just combines the list. If I know one trip
that takes two hours (\\(x^2\\)) and someone else knows a shortcut (\\(x\\)),
then we can combine that knowledge (\\(x^2+x\\)).
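
Addition can be sketched the same way. As before, the `{exponent: coefficient}` dictionaries and the names `poly_add`, `mine`, and `theirs` are assumptions made for this illustration only.

```python
def poly_add(p, q):
    """Add two polynomials stored as {exponent: coefficient} dicts."""
    result = dict(p)
    for n, b in q.items():
        # Trips of the same length from both lists are pooled together.
        result[n] = result.get(n, 0) + b
    return result

mine = {2: 1}    # x^2: the one two-hour trip I know
theirs = {1: 1}  # x: the one-hour shortcut someone else knows

combined = poly_add(mine, theirs)
print(combined)  # x^2 + x
```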
Well, that's a neat little thing, and pretty quick to demonstrate, too. But
we can push this observation a bit further. To generalize what we've already
seen, however, we'll need to figure out "the bare minimum" of what we need to
make polynomial multiplication work as we'd expect.
Well, that's a neat little thing. But we can push this observation a bit
further. To generalize what we've already seen, however, we'll need to
figure out "the bare minimum" of what we need to make polynomial
multiplication work as we'd expect.
### Polynomials over Semirings
Let's watch what happens when we multiply two binomials, paying really close
@@ -73,11 +86,10 @@ front, and a \\(-x\\) is at the very back. We use the fact that addition is
_commutative_ (\\(a+b=b+a\\)) and _associative_ (\\(a+(b+c)=(a+b)+c\\)) to
rearrange the equation, grouping the \\(x\\) and its negation together. This
gives us \\((1-1)x=0x=0\\). That last step is important: we've used the fact
that multiplication by zero gives zero. We didn't use it in this example,
but another important property we want is for multiplication to be associative,
too.
that multiplication by zero gives zero. Another important property (though
we didn't use it here) is that multiplication has to be associative, too.
So, what if we didn't use numbers, but rather anything _thing_ with two
So, what if we didn't use numbers, but rather any _thing_ with two
operations, one kind of like \\((\\times)\\) and one kind of like \\((+)\\)?
As long as these operations satisfy the properties we have used so far, we
should be able to create polynomials using them, and do this same sort of
@@ -197,7 +209,34 @@ them out gives:
And that's right; if it's possible to get from A to B in either two hours
or one hour, and then from B to C in either three hours or two hours, then
it's possible to get from A to C in either five, four, or three hours. In a
way, polynomials like this give us _less_ information than our original ones
way, polynomials like this give us
{{< sidenote "right" "homomorphism-note" "less information than our original ones" >}}
In fact, we can construct a semiring homomorphism (kind of like a
<a href="https://en.wikipedia.org/wiki/Ring_homomorphism">ring homomorphism</a>,
but for semirings) from \(\mathbb{N}[x]\) to \(\mathbb{B}[x]\) as follows:
{{< latex >}}
\sum_{i=0}^n a_ix^i \mapsto \sum_{i=0}^n \text{clamp}(a_i)x^i
{{< /latex >}}
Where the \(\text{clamp}\) function checks if its argument is non-zero.
In the case of city path search, \(\text{clamp}\) asks the question
"are there any routes at all?".
{{< latex >}}
\text{clamp}(n) = \begin{cases}
\text{false} & n = 0 \\
\text{true} & n > 0
\end{cases}
{{< /latex >}}
We can't construct the inverse of the above homomorphism (a mapping
that would undo our clamping, and take polynomials in \(\mathbb{B}[x]\) to
\(\mathbb{N}[x]\)). This fact gives us a more "mathematical" confirmation
that we lost information, rather than gained it, by switching to
boolean polynomials: we can always recover a boolean polynomial from the
natural number one, but not the other way around.
{{< /sidenote >}}
(which were \\(\\mathbb{N}[x]\\), polynomials over natural numbers \\(\\mathbb{N} = \\{ 0, 1, 2, ... \\}\\)), so it's unclear why we'd prefer them. However,
we're just warming up - there are more interesting semirings for us to
consider!
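
The clamping idea from the sidenote can be checked on our running example with a small Python sketch. It assumes that boolean polynomials use "or" for addition and "and" for multiplication; the helper names (`poly_mul`, `clamp_poly`, `nat`, `boolean`) are hypothetical and exist only for this illustration.

```python
def poly_mul(p, q, add, mul, zero):
    """Multiply {exponent: coefficient} polynomials over a given semiring."""
    result = {}
    for n, a in p.items():
        for m, b in q.items():
            result[n + m] = add(result.get(n + m, zero), mul(a, b))
    return result

def clamp(n):
    """Are there any routes at all?"""
    return n > 0

def clamp_poly(p):
    """Apply clamp to every coefficient of a polynomial."""
    return {n: clamp(a) for n, a in p.items()}

# The two semirings, each given as addition, multiplication, and zero.
nat = dict(add=lambda a, b: a + b, mul=lambda a, b: a * b, zero=0)
boolean = dict(add=lambda a, b: a or b, mul=lambda a, b: a and b, zero=False)

a_to_b = {2: 2, 1: 1}  # 2x^2 + x
b_to_c = {3: 1, 2: 1}  # x^3 + x^2

# Multiply, then clamp ...
clamp_after = clamp_poly(poly_mul(a_to_b, b_to_c, **nat))
# ... versus clamp, then multiply in the boolean semiring.
clamp_before = poly_mul(clamp_poly(a_to_b), clamp_poly(b_to_c), **boolean)

print(clamp_after == clamp_before)

# Clamping forgets counts: two different inputs collapse to the same output.
print(clamp_poly({2: 2}) == clamp_poly({2: 1}))
```

Both orders agree on this example, which is what we'd expect from a homomorphism, and the last line shows why there's no way to undo the clamping.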
@@ -228,17 +267,17 @@ the letter \\(\\pi\\) to denote a path, this means the following equation:
{{< /latex >}}
{{< sidenote "right" "paths-monoid-note" "So those are paths." >}}
In fact, if you clicked through the
Actually, if you clicked through the
<a href="https://mathworld.wolfram.com/Monoid.html">monoid</a>
link earlier, you might be interested to know that paths as defined here
form a monoid with concatenation \(\rightarrow\) and the empty path \(\circ\)
as a unit.
{{< /sidenote >}}
Paths alone, though, aren't enough for our polynomials; we're tracking
different _ways_ to get from one place to another. This is an excellent
different ways to get from one place to another. This is an excellent
use case for sets!
Our next semiring will be that of _sets of paths_. Some elements
Our next semiring will be that of _sets of paths_. Some example elements
of this semiring are \\(\\varnothing\\), also known as the empty set,
\\(\\{\\circ\\}\\), the set containing only the empty path, and the set
containing a path via the highway, and another path via the suburbs:
@@ -248,7 +287,8 @@ containing a path via the highway, and another path via the suburbs:
{{< /latex >}}
So what are the addition and multiplication on sets of paths? Addition
is the easier one: it's just the union of sets:
is the easier one: it's just the union of sets (the "triangle equal sign"
symbol means "defined as"):
{{< latex >}}
A + B \triangleq A \cup B
@@ -283,7 +323,7 @@ A \times (B \times C) & = & \{ a \rightarrow (b \rightarrow c)\ |\ a \in A, b \i
{{< /latex >}}
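
The operations on sets of paths can also be tried out in Python. This is only a sketch under my own assumptions: a path is modeled as a tuple of segment names, concatenation is tuple concatenation, and the segment names other than the highway and the suburbs are made up for the example.

```python
def path_add(A, B):
    """Addition on sets of paths: the union of the two sets."""
    return A | B

def path_mul(A, B):
    """Multiplication: concatenate every path in A with every path in B."""
    return frozenset(a + b for a in A for b in B)

# Hypothetical sets of paths; each path is a tuple of segment names.
A = frozenset({("highway",), ("suburbs",)})
B = frozenset({("bridge",)})
C = frozenset({("tunnel",), ("ferry",)})

left = path_mul(A, path_mul(B, C))   # A x (B x C)
right = path_mul(path_mul(A, B), C)  # (A x B) x C

print(left == right)
print(sorted(left))
```

Associativity here comes for free from tuple concatenation being associative, mirroring the argument about paths in the equation above.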
What's the multiplicative identity? Well, since multiplication concatenates
all the combination of paths from two sets, we could try making a set of
all the combinations of paths from two sets, we could try making a set of
elements that don't do anything when concatenating. Sound familiar? It should,
that's \\(\\circ\\), the empty path element! We thus define our multiplicative
identity as \\(\\{\\circ\\}\\), and verify that it is indeed the identity: