Edit and improve the polynomial search post
This commit is contained in:
parent
0ade4b2efb
commit
ca3345cb33
|
@ -5,40 +5,53 @@ draft: true
|
||||||
tags: ["Mathematics"]
|
tags: ["Mathematics"]
|
||||||
---
|
---
|
||||||
|
|
||||||
Suppose that you're trying to get from city A to city B, and then from city B
|
I read a really neat paper some time ago, and I've been wanting to write about
|
||||||
to city C. Also suppose that your trips are measured in one-hour intervals, and
|
it ever since. The paper is called [Algebras for Weighted Search](https://dl.acm.org/doi/pdf/10.1145/3473577),
|
||||||
that trips of equal duration are considered equivalent.
|
and it is a tad too deep to dive into in a blog article -- readers of ICFP are
|
||||||
Given possible routes from A to B, and then given more routes from B to C, what
|
rarely the target audience on this site. However, one particular insight I
|
||||||
are the possible routes from A to C you can build up?
|
gleaned from the paper merits additional discussion and demonstration. I'm
|
||||||
|
going to do that here.
|
||||||
|
|
||||||
We can try with an example. Maybe there are two routes from A to B that take
|
We can start with something concrete. Suppose that you're trying to get from
|
||||||
two hours each, and one "quick" trip that takes only an hour. On top of this,
|
city A to city B, and then from city B to city C. Also suppose that your
|
||||||
there's one three-hour trip from B to C, and one two-hour trip. Given these
|
trips are measured in one-hour intervals, and that trips of equal duration are
|
||||||
building blocks, the list of possible trips from A to C is as follows.
|
considered equivalent. Given possible routes from A to B, and then given more
|
||||||
|
routes from B to C, what are the possible routes from A to C you can build up?
|
||||||
|
|
||||||
{{< latex >}}
|
In many cases, starting with an example helps build intuition. Maybe there
|
||||||
\begin{aligned}
|
are two routes from A to B that take two hours each, and one "quick" trip
|
||||||
\text{two}\ 2h\ \text{trips} \rightarrow \text{one}\ 3h\ \text{trip} = \text{two}\ 5h\ \text{trips}\\
|
that takes only an hour. On top of this, there's one three-hour trip from B
|
||||||
\text{two}\ 2h\ \text{trips} \rightarrow \text{one}\ 2h\ \text{trip} = \text{two}\ 4h\ \text{trips}\\
|
to C, and one two-hour trip. Given these building blocks, the list of
|
||||||
\text{one}\ 1h\ \text{trip} \rightarrow \text{one}\ 3h\ \text{trip} = \text{one}\ 4h\ \text{trip}\\
|
possible trips from A to C is as follows.
|
||||||
\text{one}\ 1h\ \text{trip} \rightarrow \text{one}\ 2h\ \text{trip} = \text{one}\ 3h\ \text{trip}\\
|
|
||||||
\textbf{total:}\ \text{two}\ 5h\ \text{trips}, \text{three}\ 4h\ \text{trips}, \text{one}\ 3h\ \text{trip}
|
|
||||||
\end{aligned}
|
|
||||||
{{< /latex >}}
|
|
||||||
|
|
||||||
Does this look a little bit familiar? We're combining every length of trips
|
1. Two two-hour trips from A to B, followed up by the three-hour trip from B to
|
||||||
of A to B with every length of trips from B to C, and then totaling them up.
|
C.
|
||||||
In other words, we're multiplying two binomials!
|
2. Two two-hour trips from A to B, followed by the shorter two-hour trip from B
|
||||||
|
to C.
|
||||||
|
3. One one-hour trip from A to B, followed by the three-hour trip from B to C.
|
||||||
|
4. One one-hour trip from A to B, followed by the shorter two-hour trip from B to C.
|
||||||
|
|
||||||
|
In the above, to figure out the various ways of getting from A to C, we had to
|
||||||
|
examine all pairings of A-to-B routes with B-to-C routes. But then, multiple
|
||||||
|
pairings end up having the same total length: the second and third bullet
|
||||||
|
points both describe trips that take four hours. Thus, to give
|
||||||
|
our final report, we need to "combine like terms" - add up the trips from
|
||||||
|
the two matching bullet points, ending up with a total of three four-hour trips.
|
||||||
|
|
||||||
|
Does this feel a little bit familiar? To me, this bears a rather striking
|
||||||
|
resemblance to an operation we've seen in algebra class: we're multiplying
|
||||||
|
two binomials! Here's the corresponding multiplication:
|
||||||
|
|
||||||
{{< latex >}}
|
{{< latex >}}
|
||||||
\left(2x^2 + x\right)\left(x^3+x^2\right) = 2x^5 + 2x^4 + x^4 + x^3 = \underline{2x^5+3x^4+x^3}
|
\left(2x^2 + x\right)\left(x^3+x^2\right) = 2x^5 + 2x^4 + x^4 + x^3 = \underline{2x^5+3x^4+x^3}
|
||||||
{{< /latex >}}
|
{{< /latex >}}
|
||||||
|
|
||||||
In fact, they don't have to be binomials. We can represent any combination
|
It's not just binomials that correspond to our combining paths between cities.
|
||||||
of trips of various lengths as a polynomial. Each term \\(ax^n\\) represents
|
We can represent any combination of trips of various lengths as a polynomial.
|
||||||
\\(a\\) trips of length \\(n\\). As we just saw, multiplying two polynomials
|
Each term \\(ax^n\\) represents \\(a\\) trips of length \\(n\\). As we just
|
||||||
corresponds to "sequencing" the trips they represent -- matching each trip in
|
saw, multiplying two polynomials corresponds to "sequencing" the trips they
|
||||||
one with each of the trips in the other, and totaling them up.
|
represent -- matching each trip in one with each of the trips in the other,
|
||||||
|
and totaling them up.
|
||||||
|
|
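The correspondence between sequencing trips and multiplying polynomials can be checked mechanically. The following is a minimal Python sketch, not code from the post: it assumes a polynomial is stored as a dictionary from exponent (trip length in hours) to coefficient (number of trips), and the names `multiply`, `add`, `a_to_b`, and `b_to_c` are made up for illustration.

```python
from collections import Counter

def multiply(p, q):
    """Sequence two sets of trips: pair every term of p with every term of q."""
    result = Counter()
    for n, a in p.items():
        for m, b in q.items():
            # Lengths add up; the number of ways multiplies.
            result[n + m] += a * b
    return dict(result)

def add(p, q):
    """Combine two lists of trips between the same pair of cities."""
    result = Counter(p)
    for n, a in q.items():
        result[n] += a
    return dict(result)

a_to_b = {2: 2, 1: 1}  # 2x^2 + x: two two-hour trips, one one-hour trip
b_to_c = {3: 1, 2: 1}  # x^3 + x^2: one three-hour trip, one two-hour trip
a_to_c = multiply(a_to_b, b_to_c)
print(a_to_c)
```

The "combine like terms" step is the `+=` on `result[n + m]`: the two pairings that take four hours end up in the same dictionary entry.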
||||||
What about adding polynomials, what does that correspond to? The answer there
|
What about adding polynomials, what does that correspond to? The answer there
|
||||||
is actually quite simple: if two polynomials both represent (distinct) lists of
|
is actually quite simple: if two polynomials both represent (distinct) lists of
|
||||||
|
@ -46,10 +59,10 @@ trips from A to B, then adding them just combines the list. If I know one trip
|
||||||
that takes two hours (\\(x^2\\)) and someone else knows a shortcut (\\(x\\)),
|
that takes two hours (\\(x^2\\)) and someone else knows a shortcut (\\(x\\)),
|
||||||
then we can combine that knowledge (\\(x^2+x\\)).
|
then we can combine that knowledge (\\(x^2+x\\)).
|
||||||
|
|
||||||
Well, that's a neat little thing, and pretty quick to demonstrate, too. But
|
Well, that's a neat little thing. But we can push this observation a bit
|
||||||
we can push this observation a bit further. To generalize what we've already
|
further. To generalize what we've already seen, however, we'll need to
|
||||||
seen, however, we'll need to figure out "the bare minimum" of what we need to
|
figure out "the bare minimum" of what we need to make polynomial
|
||||||
make polynomial multiplication work as we'd expect.
|
multiplication work as we'd expect.
|
||||||
|
|
||||||
### Polynomials over Semirings
|
### Polynomials over Semirings
|
||||||
Let's watch what happens when we multiply two binomials, paying really close
|
Let's watch what happens when we multiply two binomials, paying really close
|
||||||
|
@ -73,11 +86,10 @@ front, and a \\(-x\\) is at the very back. We use the fact that addition is
|
||||||
_commutative_ (\\(a+b=b+a\\)) and _associative_ (\\(a+(b+c)=(a+b)+c\\)) to
|
_commutative_ (\\(a+b=b+a\\)) and _associative_ (\\(a+(b+c)=(a+b)+c\\)) to
|
||||||
rearrange the equation, grouping the \\(x\\) and its negation together. This
|
rearrange the equation, grouping the \\(x\\) and its negation together. This
|
||||||
gives us \\((1-1)x=0x=0\\). That last step is important: we've used the fact
|
gives us \\((1-1)x=0x=0\\). That last step is important: we've used the fact
|
||||||
that multiplication by zero gives zero. We didn't use it in this example,
|
that multiplication by zero gives zero. Another important property (though
|
||||||
but another important property we want is for multiplication to be associative,
|
we didn't use it here) is that multiplication has to be associative, too.
|
||||||
too.
|
|
||||||
|
|
||||||
So, what if we didn't use numbers, but rather anything _thing_ with two
|
So, what if we didn't use numbers, but rather any _thing_ with two
|
||||||
operations, one kind of like \\((\\times)\\) and one kind of like \\((+)\\)?
|
operations, one kind of like \\((\\times)\\) and one kind of like \\((+)\\)?
|
||||||
As long as these operations satisfy the properties we have used so far, we
|
As long as these operations satisfy the properties we have used so far, we
|
||||||
should be able to create polynomials using them, and do this same sort of
|
should be able to create polynomials using them, and do this same sort of
|
||||||
|
@ -197,7 +209,34 @@ them out gives:
|
||||||
And that's right; if it's possible to get from A to B in either two hours
|
And that's right; if it's possible to get from A to B in either two hours
|
||||||
or one hour, and then from B to C in either three hours or two hours, then
|
or one hour, and then from B to C in either three hours or two hours, then
|
||||||
it's possible to get from A to C in either five, four, or three hours. In a
|
it's possible to get from A to C in either five, four, or three hours. In a
|
||||||
way, polynomials like this give us _less_ information than our original ones
|
way, polynomials like this give us
|
||||||
|
{{< sidenote "right" "homomorphism-note" "less information than our original ones" >}}
|
||||||
|
In fact, we can construct a semiring homomorphism (kind of like a
|
||||||
|
<a href="https://en.wikipedia.org/wiki/Ring_homomorphism">ring homomorphism</a>,
|
||||||
|
but for semirings) from \(\mathbb{N}[x]\) to \(\mathbb{B}[x]\) as follows:
|
||||||
|
|
||||||
|
{{< latex >}}
|
||||||
|
\sum_{i=0}^n a_ix^i \mapsto \sum_{i=0}^n \text{clamp}(a_i)x^i
|
||||||
|
{{< /latex >}}
|
||||||
|
|
||||||
|
Where the \(\text{clamp}\) function checks if its argument is non-zero.
|
||||||
|
In the case of city path search, \(\text{clamp}\) asks the question
|
||||||
|
"are there any routes at all?".
|
||||||
|
|
||||||
|
{{< latex >}}
|
||||||
|
\text{clamp}(n) = \begin{cases}
|
||||||
|
\text{false} & n = 0 \\
|
||||||
|
\text{true} & n > 0
|
||||||
|
\end{cases}
|
||||||
|
{{< /latex >}}
|
||||||
|
|
||||||
|
We can't construct the inverse of the above homomorphism (a mapping
|
||||||
|
that would undo our clamping, and take polynomials in \(\mathbb{B}[x]\) to
|
||||||
|
\(\mathbb{N}[x]\)). This fact gives us a more "mathematical" confirmation
|
||||||
|
that we lost information, rather than gained it, by switching to
|
||||||
|
boolean polynomials: we can always recover a boolean polynomial from the
|
||||||
|
natural number one, but not the other way around.
|
||||||
|
{{< /sidenote >}}
|
||||||
(which were \\(\\mathbb{N}[x]\\), polynomials over natural numbers \\(\\mathbb{N} = \\{ 0, 1, 2, ... \\}\\)), so it's unclear why we'd prefer them. However,
|
(which were \\(\\mathbb{N}[x]\\), polynomials over natural numbers \\(\\mathbb{N} = \\{ 0, 1, 2, ... \\}\\)), so it's unclear why we'd prefer them. However,
|
||||||
we're just warming up - there are more interesting semirings for us to
|
we're just warming up - there are more interesting semirings for us to
|
||||||
consider!
|
consider!
|
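The claim that boolean polynomials carry less information than natural-number ones can be illustrated with a small Python sketch. It is an assumption-laden illustration rather than the post's own code: polynomials are dictionaries from exponent to coefficient, `poly_mul` is a hypothetical helper parameterized by the semiring's operations, and `clamp` follows the definition in the sidenote.

```python
import operator

def poly_mul(p, q, add, mul, zero):
    """Multiply polynomials whose coefficients live in an arbitrary semiring."""
    result = {}
    for n, a in p.items():
        for m, b in q.items():
            result[n + m] = add(result.get(n + m, zero), mul(a, b))
    return result

def nat_mul(p, q):
    # Coefficients in N: ordinary addition and multiplication.
    return poly_mul(p, q, operator.add, operator.mul, 0)

def bool_mul(p, q):
    # Coefficients in B: "or" plays the role of +, "and" the role of *.
    return poly_mul(p, q, operator.or_, operator.and_, False)

def clamp(n):
    """Are there any routes at all?"""
    return n > 0

def clamp_poly(p):
    return {n: clamp(a) for n, a in p.items()}

p = {2: 2, 1: 1}  # 2x^2 + x
q = {3: 1, 2: 1}  # x^3 + x^2

# Clamping after multiplying agrees with multiplying after clamping.
commutes = clamp_poly(nat_mul(p, q)) == bool_mul(clamp_poly(p), clamp_poly(q))

# Two different polynomials over N collapse to the same boolean polynomial,
# so clamping cannot be undone.
collapses = clamp_poly({2: 2, 1: 1}) == clamp_poly({2: 1, 1: 1})
print(commutes, collapses)
```

This only checks the homomorphism property on one example; it is not a proof.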
||||||
|
@ -228,17 +267,17 @@ the letter \\(\\pi\\) to denote a path, this means the following equation:
|
||||||
{{< /latex >}}
|
{{< /latex >}}
|
||||||
|
|
||||||
{{< sidenote "right" "paths-monoid-note" "So those are paths." >}}
|
{{< sidenote "right" "paths-monoid-note" "So those are paths." >}}
|
||||||
In fact, if you clicked through the
|
Actually, if you clicked through the
|
||||||
<a href="https://mathworld.wolfram.com/Monoid.html">monoid</a>
|
<a href="https://mathworld.wolfram.com/Monoid.html">monoid</a>
|
||||||
link earlier, you might be interested to know that paths as defined here
|
link earlier, you might be interested to know that paths as defined here
|
||||||
form a monoid with concatenation \(\rightarrow\) and the empty path \(\circ\)
|
form a monoid with concatenation \(\rightarrow\) and the empty path \(\circ\)
|
||||||
as a unit.
|
as a unit.
|
||||||
{{< /sidenote >}}
|
{{< /sidenote >}}
|
||||||
Paths alone, though, aren't enough for our polynomials; we're tracking
|
Paths alone, though, aren't enough for our polynomials; we're tracking
|
||||||
different _ways_ to get from one place to another. This is an excellent
|
different ways to get from one place to another. This is an excellent
|
||||||
use case for sets!
|
use case for sets!
|
||||||
|
|
||||||
Our next semiring will be that of _sets of paths_. Some elements
|
Our next semiring will be that of _sets of paths_. Some example elements
|
||||||
of this semiring are \\(\\varnothing\\), also known as the empty set,
|
of this semiring are \\(\\varnothing\\), also known as the empty set,
|
||||||
\\(\\{\\circ\\}\\), the set containing only the empty path, and the set
|
\\(\\{\\circ\\}\\), the set containing only the empty path, and the set
|
||||||
containing a path via the highway, and another path via the suburbs:
|
containing a path via the highway, and another path via the suburbs:
|
||||||
|
@ -248,7 +287,8 @@ containing a path via the highway, and another path via the suburbs:
|
||||||
{{< /latex >}}
|
{{< /latex >}}
|
||||||
|
|
||||||
So what are the addition and multiplication on sets of paths? Addition
|
So what are the addition and multiplication on sets of paths? Addition
|
||||||
is the easier one: it's just the union of sets:
|
is the easier one: it's just the union of sets (the "triangle equal sign"
|
||||||
|
symbol means "defined as"):
|
||||||
|
|
||||||
{{< latex >}}
|
{{< latex >}}
|
||||||
A + B \triangleq A \cup B
|
A + B \triangleq A \cup B
|
||||||
|
@ -283,7 +323,7 @@ A \times (B \times C) & = & \{ a \rightarrow (b \rightarrow c)\ |\ a \in A, b \i
|
||||||
{{< /latex >}}
|
{{< /latex >}}
|
||||||
|
|
||||||
What's the multiplicative identity? Well, since multiplication concatenates
|
What's the multiplicative identity? Well, since multiplication concatenates
|
||||||
all the combination of paths from two sets, we could try making a set of
|
all the combinations of paths from two sets, we could try making a set of
|
||||||
elements that don't do anything when concatenating. Sound familiar? It should,
|
elements that don't do anything when concatenating. Sound familiar? It should,
|
||||||
that's \\(\\circ\\), the empty path element! We thus define our multiplicative
|
that's \\(\\circ\\), the empty path element! We thus define our multiplicative
|
||||||
identity as \\(\\{\\circ\\}\\), and verify that it is indeed the identity:
|
identity as \\(\\{\\circ\\}\\), and verify that it is indeed the identity:
|
||||||
|
|
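The same identity check can be run on a small example. Here is a rough Python sketch under stated assumptions: a path is a tuple of segment names (the names `"highway"` and `"suburbs"` are placeholders), the empty path \\(\\circ\\) is the empty tuple, concatenation \\(\\rightarrow\\) is tuple concatenation, and a set of paths is a `frozenset`. None of these representation choices come from the post.

```python
EMPTY_PATH = ()                  # the empty path
ZERO = frozenset()               # additive identity: the empty set
ONE = frozenset({EMPTY_PATH})    # multiplicative identity: only the empty path

def add(a, b):
    """Addition on sets of paths is set union."""
    return a | b

def mul(a, b):
    """Multiplication concatenates every path in a with every path in b."""
    return frozenset(x + y for x in a for y in b)

routes = frozenset({("highway",), ("suburbs",)})

left_identity = mul(ONE, routes) == routes
right_identity = mul(routes, ONE) == routes
annihilates = mul(ZERO, routes) == ZERO
print(left_identity, right_identity, annihilates)
```

Concatenating with the empty tuple leaves each path unchanged, which is why `ONE` acts as the identity; multiplying by the empty set yields no pairs at all, so it gives back the empty set.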