Edit and improve the polynomial search post
parent
0ade4b2efb
commit
c7dc8f0105
|
@ -5,40 +5,53 @@ draft: true
|
|||
tags: ["Mathematics"]
|
||||
---
|
||||
|
||||
Suppose that you're trying to get from city A to city B, and then from city B
|
||||
to city C. Also suppose that your trips are measured in one-hour intervals, and
|
||||
that trips of equal duration are considered equivalent.
|
||||
Given possible routes from A to B, and then given more routes from B to C, what
|
||||
are the possible routes from A to C you can build up?
|
||||
I read a really neat paper some time ago, and I've been wanting to write about
|
||||
it ever since. The paper is called [Algebras for Weighted Search](https://dl.acm.org/doi/pdf/10.1145/3473577),
|
||||
and it is a tad too deep to dive into in a blog article -- readers of ICFP are
|
||||
rarely the target audience on this site. However, one particular insight I
|
||||
gleaned from the paper merits additional discussion and demonstration. I'm
|
||||
going to do that here.
|
||||
|
||||
We can try with an example. Maybe there are two routes from A to B that take
|
||||
two hours each, and one "quick" trip that takes only an hour. On top of this,
|
||||
there's one three-hour trip from B to C, and one two-hour trip. Given these
|
||||
building blocks, the list of possible trips from A to C is as follows.
|
||||
We can start with something concrete. Suppose that you're trying to get from city A
|
||||
to city B, and then from city B to city C. Also suppose that your trips are
|
||||
measured in one-hour intervals, and that trips of equal duration are
|
||||
considered equivalent. Given possible routes from A to B, and then given more
|
||||
routes from B to C, what are the possible routes from A to C you can build up?
|
||||
|
||||
{{< latex >}}
|
||||
\begin{aligned}
|
||||
\text{two}\ 2h\ \text{trips} \rightarrow \text{one}\ 3h\ \text{trip} = \text{two}\ 5h\ \text{trips}\\
|
||||
\text{two}\ 2h\ \text{trips} \rightarrow \text{one}\ 2h\ \text{trip} = \text{two}\ 4h\ \text{trips}\\
|
||||
\text{one}\ 1h\ \text{trip} \rightarrow \text{one}\ 3h\ \text{trip} = \text{one}\ 4h\ \text{trip}\\
|
||||
\text{one}\ 1h\ \text{trip} \rightarrow \text{one}\ 2h\ \text{trip} = \text{one}\ 3h\ \text{trip}\\
|
||||
\textbf{total:}\ \text{two}\ 5h\ \text{trips}, \text{three}\ 4h\ \text{trips}, \text{one}\ 3h\ \text{trip}
|
||||
\end{aligned}
|
||||
{{< /latex >}}
|
||||
In many cases, starting with an example helps build intuition. Maybe there
|
||||
are two routes from A to B that take two hours each, and one "quick" trip
|
||||
that takes only an hour. On top of this, there's one three-hour trip from B
|
||||
to C, and one two-hour trip. Given these building blocks, the list of
|
||||
possible trips from A to C is as follows.
|
||||
|
||||
Does this look a little bit familiar? We're combining every length of trip
|
||||
from A to B with every length of trip from B to C, and then totaling them up.
|
||||
In other words, we're multiplying two binomials!
|
||||
1. Two two-hour trips from A to B, followed up by the three-hour trip from B to
|
||||
C.
|
||||
2. Two two-hour trips from A to B, followed by the shorter two-hour trip from B
|
||||
to C.
|
||||
3. One one-hour trip from A to B, followed by the three-hour trip from B to C.
|
||||
4. One one-hour trip from A to B, followed by the shorter two-hour trip from B to C.
|
||||
|
||||
In the above, to figure out the various ways of getting from A to C, we had to
|
||||
examine all pairings of A-to-B routes with B-to-C routes. But then, multiple
|
||||
pairings end up having the same total length: the second and third bullet
|
||||
points both describe trips that take four hours. Thus, to give
|
||||
our final report, we need to "combine like terms" - add up the trips from
|
||||
the two matching bullet points, ending up with a total of three four-hour trips.
|
||||
|
||||
Does this feel a little bit familiar? To me, this bears a rather striking
|
||||
resemblance to an operation we've seen in algebra class: we're multiplying
|
||||
two binomials! Here's the corresponding multiplication:
|
||||
|
||||
{{< latex >}}
|
||||
\left(2x^2 + x\right)\left(x^3+x^2\right) = 2x^5 + 2x^4 + x^4 + x^3 = \underline{2x^5+3x^4+x^3}
|
||||
{{< /latex >}}
|
||||
|
||||
In fact, they don't have to be binomials. We can represent any combination
|
||||
of trips of various lengths as a polynomial. Each term \\(ax^n\\) represents
|
||||
\\(a\\) trips of length \\(n\\). As we just saw, multiplying two polynomials
|
||||
corresponds to "sequencing" the trips they represent -- matching each trip in
|
||||
one with each of the trips in the other, and totaling them up.
|
||||
It's not just binomials that correspond to our combining paths between cities.
|
||||
We can represent any combination of trips of various lengths as a polynomial.
|
||||
Each term \\(ax^n\\) represents \\(a\\) trips of length \\(n\\). As we just
|
||||
saw, multiplying two polynomials corresponds to "sequencing" the trips they
|
||||
represent -- matching each trip in one with each of the trips in the other,
|
||||
and totaling them up.
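
The "sequencing" described above can be sketched in a few lines of Python. This is only an illustration, not code from the post or the paper: the dictionary representation (exponent mapped to coefficient) and the names `poly_mul`, `a_to_b`, and `b_to_c` are assumptions made for this example.

```python
from collections import defaultdict

def poly_mul(p, q):
    """Multiply two polynomials stored as {exponent: coefficient} dicts."""
    result = defaultdict(int)
    for n, a in p.items():
        for m, b in q.items():
            # Pair every trip in p with every trip in q: durations add,
            # and the number of ways multiplies.
            result[n + m] += a * b
    return dict(result)

# 2x^2 + x: two two-hour trips and one one-hour trip from A to B.
a_to_b = {2: 2, 1: 1}
# x^3 + x^2: one three-hour trip and one two-hour trip from B to C.
b_to_c = {3: 1, 2: 1}

a_to_c = poly_mul(a_to_b, b_to_c)
print(a_to_c)
```

Accumulating into `result[n + m]` is what "combines like terms": both four-hour pairings land on the same key, giving a coefficient of three.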
|
||||
|
||||
What about adding polynomials, what does that correspond to? The answer there
|
||||
is actually quite simple: if two polynomials both represent (distinct) lists of
|
||||
|
@ -46,10 +59,10 @@ trips from A to B, then adding them just combines the list. If I know one trip
|
|||
that takes two hours (\\(x^2\\)) and someone else knows a shortcut (\\(x\\)),
|
||||
then we can combine that knowledge (\\(x^2+x\\)).
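
As a small sketch of that idea, assuming polynomials are stored as dictionaries from exponent to coefficient (a representation chosen for this example, along with the hypothetical names `poly_add`, `mine`, and `theirs`), addition just merges the two lists of trips:

```python
def poly_add(p, q):
    """Add two polynomials stored as {exponent: coefficient} dicts."""
    result = dict(p)
    for n, b in q.items():
        # Trips of the same duration are pooled together.
        result[n] = result.get(n, 0) + b
    return result

mine = {2: 1}    # x^2: the two-hour trip I know about
theirs = {1: 1}  # x: someone else's one-hour shortcut

combined = poly_add(mine, theirs)
print(combined)
```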
|
||||
|
||||
Well, that's a neat little thing, and pretty quick to demonstrate, too. But
|
||||
we can push this observation a bit further. To generalize what we've already
|
||||
seen, however, we'll need to figure out "the bare minimum" of what we need to
|
||||
make polynomial multiplication work as we'd expect.
|
||||
Well, that's a neat little thing. But we can push this observation a bit
|
||||
further. To generalize what we've already seen, however, we'll need to
|
||||
figure out "the bare minimum" of what we need to make polynomial
|
||||
multiplication work as we'd expect.
|
||||
|
||||
### Polynomials over Semirings
|
||||
Let's watch what happens when we multiply two binomials, paying really close
|
||||
|
@ -73,11 +86,10 @@ front, and a \\(-x\\) is at the very back. We use the fact that addition is
|
|||
_commutative_ (\\(a+b=b+a\\)) and _associative_ (\\(a+(b+c)=(a+b)+c\\)) to
|
||||
rearrange the equation, grouping the \\(x\\) and its negation together. This
|
||||
gives us \\((1-1)x=0x=0\\). That last step is important: we've used the fact
|
||||
that multiplication by zero gives zero. We didn't use it in this example,
|
||||
but another important property we want is for multiplication to be associative,
|
||||
too.
|
||||
that multiplication by zero gives zero. Another important property (though
|
||||
we didn't use it here) is that multiplication has to be associative, too.
|
||||
|
||||
So, what if we didn't use numbers, but rather anything _thing_ with two
|
||||
So, what if we didn't use numbers, but rather any _thing_ with two
|
||||
operations, one kind of like \\((\\times)\\) and one kind of like \\((+)\\)?
|
||||
As long as these operations satisfy the properties we have used so far, we
|
||||
should be able to create polynomials using them, and do this same sort of
|
||||
|
@ -197,7 +209,34 @@ them out gives:
|
|||
And that's right; if it's possible to get from A to B in either two hours
|
||||
or one hour, and then from B to C in either three hours or two hours, then
|
||||
it's possible to get from A to C in either five, four, or three hours. In a
|
||||
way, polynomials like this give us _less_ information than our original ones
|
||||
way, polynomials like this give us
|
||||
{{< sidenote "right" "homomorphism-note" "less information than our original ones" >}}
|
||||
In fact, we can construct a semiring homomorphism (kind of like a
|
||||
<a href="https://en.wikipedia.org/wiki/Ring_homomorphism">ring homomorphism</a>,
|
||||
but for semirings) from \(\mathbb{N}[x]\) to \(\mathbb{B}[x]\) as follows:
|
||||
|
||||
{{< latex >}}
|
||||
\sum_{i=0}^n a_ix^i \mapsto \sum_{i=0}^n \text{clamp}(a_i)x^i
|
||||
{{< /latex >}}
|
||||
|
||||
Where the \(\text{clamp}\) function checks if its argument is non-zero.
|
||||
In the case of city path search, \(\text{clamp}\) asks the question
|
||||
"are there any routes at all?".
|
||||
|
||||
{{< latex >}}
|
||||
\text{clamp}(n) = \begin{cases}
|
||||
\text{false} & n = 0 \\
|
||||
\text{true} & n > 0
|
||||
\end{cases}
|
||||
{{< /latex >}}
|
||||
|
||||
We can't construct the inverse of the above homomorphism (a mapping
|
||||
that would undo our clamping, and take polynomials in \(\mathbb{B}[x]\) to
|
||||
\(\mathbb{N}[x]\)). This fact gives us a more "mathematical" confirmation
|
||||
that we lost information, rather than gained it, by switching to
|
||||
boolean polynomials: we can always recover a boolean polynomial from the
|
||||
natural number one, but not the other way around.
|
||||
{{< /sidenote >}}
|
||||
(which were \\(\\mathbb{N}[x]\\), polynomials over natural numbers \\(\\mathbb{N} = \\{ 0, 1, 2, ... \\}\\)), so it's unclear why we'd prefer them. However,
|
||||
we're just warming up - there are more interesting semirings for us to
|
||||
consider!
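
To make the information loss concrete, here is a rough Python sketch of the clamping map from the sidenote. It assumes polynomials are dictionaries from exponent to coefficient, and it passes the semiring operations in explicitly; `poly_mul`, `clamp_poly`, `nat`, and `boolean` are names invented for this example, not anything defined in the post or the paper.

```python
def poly_mul(p, q, add, mul, zero):
    """Multiply {exponent: coefficient} polynomials over a given semiring."""
    result = {}
    for n, a in p.items():
        for m, b in q.items():
            result[n + m] = add(result.get(n + m, zero), mul(a, b))
    return result

def clamp(n):
    """Are there any routes at all?"""
    return n > 0

def clamp_poly(p):
    """Apply clamp to every coefficient, keeping the exponents."""
    return {n: clamp(a) for n, a in p.items()}

# The two semirings: natural numbers with (+, *) and booleans with (or, and).
nat = dict(add=lambda a, b: a + b, mul=lambda a, b: a * b, zero=0)
boolean = dict(add=lambda a, b: a or b, mul=lambda a, b: a and b, zero=False)

p = {2: 2, 1: 1}  # 2x^2 + x
q = {3: 1, 2: 1}  # x^3 + x^2

counted = poly_mul(p, q, **nat)
reachable = poly_mul(clamp_poly(p), clamp_poly(q), **boolean)

# Clamping after multiplying agrees with multiplying after clamping.
print(clamp_poly(counted) == reachable)
# Two different counting polynomials collapse to the same boolean one.
print(clamp_poly({2: 2, 1: 1}) == clamp_poly({2: 1, 1: 1}))
```

The second comparison is the reason no inverse exists: once the counts are clamped, there is no way to tell which natural-number polynomial we started from.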
|
||||
|
@ -228,17 +267,17 @@ the letter \\(\\pi\\) to denote a path, this means the following equation:
|
|||
{{< /latex >}}
|
||||
|
||||
{{< sidenote "right" "paths-monoid-note" "So those are paths." >}}
|
||||
In fact, if you clicked through the
|
||||
Actually, if you clicked through the
|
||||
<a href="https://mathworld.wolfram.com/Monoid.html">monoid</a>
|
||||
link earlier, you might be interested to know that paths as defined here
|
||||
form a monoid with concatenation \(\rightarrow\) and the empty path \(\circ\)
|
||||
as a unit.
|
||||
{{< /sidenote >}}
|
||||
Paths alone, though, aren't enough for our polynomials; we're tracking
|
||||
different _ways_ to get from one place to another. This is an excellent
|
||||
different ways to get from one place to another. This is an excellent
|
||||
use case for sets!
|
||||
|
||||
Our next semiring will be that of _sets of paths_. Some elements
|
||||
Our next semiring will be that of _sets of paths_. Some example elements
|
||||
of this semiring are \\(\\varnothing\\), also known as the empty set,
|
||||
\\(\\{\\circ\\}\\), the set containing only the empty path, and the set
|
||||
containing a path via the highway, and another path via the suburbs:
|
||||
|
@ -248,7 +287,8 @@ containing a path via the highway, and another path via the suburbs:
|
|||
{{< /latex >}}
|
||||
|
||||
So what are the addition and multiplication on sets of paths? Addition
|
||||
is the easier one: it's just the union of sets:
|
||||
is the easier one: it's just the union of sets (the "triangle equal sign"
|
||||
symbol means "defined as"):
|
||||
|
||||
{{< latex >}}
|
||||
A + B \triangleq A \cup B
|
||||
|
@ -283,7 +323,7 @@ A \times (B \times C) & = & \{ a \rightarrow (b \rightarrow c)\ |\ a \in A, b \i
|
|||
{{< /latex >}}
|
||||
|
||||
What's the multiplicative identity? Well, since multiplication concatenates
|
||||
all the combination of paths from two sets, we could try making a set of
|
||||
all the combinations of paths from two sets, we could try making a set of
|
||||
elements that don't do anything when concatenating. Sound familiar? It should:
|
||||
that's \\(\\circ\\), the empty path element! We thus define our multiplicative
|
||||
identity as \\(\\{\\circ\\}\\), and verify that it is indeed the identity:
|
||||
|
|