Add draft of first part of polynomials article

This commit is contained in:
Danila Fedorin 2022-10-23 17:27:30 -07:00
parent a785c71c5f
commit 2439a02dbb
1 changed files with 327 additions and 0 deletions

@@ -0,0 +1,327 @@
---
title: "Search as a Polynomial"
date: 2022-10-22T14:51:15-07:00
draft: true
tags: ["Mathematics"]
---
Suppose that you're trying to get from city A to city B, and then from city B
to city C. Also suppose that your trips are measured in one-hour intervals, and
that trips of equal duration are considered equivalent.
Given possible routes from A to B, and then given more routes from B to C, what
are the possible routes from A to C you can build up?
We can try with an example. Maybe there are two routes from A to B that take
two hours each, and one "quick" trip that takes only an hour. On top of this,
there's one three-hour trip from B to C, and one two-hour trip. Given these
building blocks, the list of possible trips from A to C is as follows.
{{< latex >}}
\begin{aligned}
\text{two}\ 2h\ \text{trips} \rightarrow \text{one}\ 3h\ \text{trip} = \text{two}\ 5h\ \text{trips}\\
\text{two}\ 2h\ \text{trips} \rightarrow \text{one}\ 2h\ \text{trip} = \text{two}\ 4h\ \text{trips}\\
\text{one}\ 1h\ \text{trip} \rightarrow \text{one}\ 3h\ \text{trip} = \text{one}\ 4h\ \text{trip}\\
\text{one}\ 1h\ \text{trip} \rightarrow \text{one}\ 2h\ \text{trip} = \text{one}\ 3h\ \text{trip}\\
\textbf{total:}\ \text{two}\ 5h\ \text{trips}, \text{three}\ 4h\ \text{trips}, \text{one}\ 3h\ \text{trip}
\end{aligned}
{{< /latex >}}
Does this look a little bit familiar? We're combining every trip from A to B
with every trip from B to C, and then totaling them up by length.
In other words, we're multiplying two binomials!
{{< latex >}}
\left(2x^2 + x\right)\left(x^3+x^2\right) = 2x^5 + 2x^4 + x^4 + x^3 = \underline{2x^5+3x^4+x^3}
{{< /latex >}}
In fact, they don't have to be binomials. We can represent any combination
of trips of various lengths as a polynomial. Each term \\(ax^n\\) represents
\\(a\\) trips of length \\(n\\). As we just saw, multiplying two polynomials
corresponds to "sequencing" the trips they represent -- matching each trip in
one with each of the trips in the other, and totaling them up.
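
To make the correspondence concrete, here is a minimal Python sketch of that multiplication. It assumes a polynomial is stored as a dictionary from exponent (trip length in hours) to coefficient (number of trips); the representation and the names `poly_mul`, `a_to_b`, and `b_to_c` are illustrative choices, not something fixed by the math.

```python
from collections import defaultdict

def poly_mul(p, q):
    """Multiply two polynomials stored as {exponent: coefficient} dicts."""
    result = defaultdict(int)
    for n, a in p.items():
        for m, b in q.items():
            # a trips of n hours followed by b trips of m hours
            # give a*b combined trips of n+m hours.
            result[n + m] += a * b
    return dict(result)

a_to_b = {2: 2, 1: 1}   # 2x^2 + x
b_to_c = {3: 1, 2: 1}   # x^3 + x^2
a_to_c = poly_mul(a_to_b, b_to_c)
print(a_to_c)           # {5: 2, 4: 3, 3: 1}, i.e. 2x^5 + 3x^4 + x^3
```
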
What about adding polynomials? What does that correspond to? The answer there
is actually quite simple: if two polynomials both represent (distinct) lists of
trips from A to B, then adding them just combines the lists. If I know one trip
that takes two hours (\\(x^2\\)) and someone else knows a shortcut (\\(x\\)),
then we can combine that knowledge (\\(x^2+x\\)).
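
Addition is even shorter to sketch. Assuming the same hypothetical dictionary representation (exponent to coefficient), adding two polynomials just sums the coefficients of matching exponents; the names `poly_add`, `mine`, and `theirs` are made up for the example.

```python
def poly_add(p, q):
    """Add two polynomials stored as {exponent: coefficient} dicts."""
    result = dict(p)
    for n, b in q.items():
        # Trips of the same length are pooled together.
        result[n] = result.get(n, 0) + b
    return result

mine = {2: 1}      # x^2: the two-hour trip I know
theirs = {1: 1}    # x: someone else's one-hour shortcut
combined = poly_add(mine, theirs)
print(combined)    # {2: 1, 1: 1}, i.e. x^2 + x
```
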
Well, that's a neat little thing, and pretty quick to demonstrate, too. But
we can push this observation a bit further. To generalize what we've already
seen, however, we'll need to figure out "the bare minimum" of what we need to
make polynomial multiplication work as we'd expect.
### Polynomials over Semirings
Let's watch what happens when we multiply two binomials, paying really close
attention to the operations we're performing. The following (concrete)
example should do.
{{< latex >}}
\begin{aligned}
& (x+1)(1-x)\\
=\ & (x+1)1+(x+1)(-x)\\
=\ & x+1-x^2-x \\
=\ & x-x+1-x^2 \\
=\ & 1-x^2
\end{aligned}
{{< /latex >}}
The first thing we do is _distribute_ the multiplication over the addition, on
the left. We then do that again, on the right this time. After this, we finally
get some terms, but they aren't properly grouped together; an \\(x\\) is at the
front, and a \\(-x\\) is at the very back. We use the fact that addition is
_commutative_ (\\(a+b=b+a\\)) and _associative_ (\\(a+(b+c)=(a+b)+c\\)) to
rearrange the expression, grouping the \\(x\\) and its negation together. This
gives us \\((1-1)x=0x=0\\). That last step is important: we've used the fact
that multiplication by zero gives zero. We didn't use it in this example,
but another important property we want is for multiplication to be associative,
too.
So, what if we didn't use numbers, but rather any _thing_ with two
operations, one kind of like \\((\\times)\\) and one kind of like \\((+)\\)?
As long as these operations satisfy the properties we have used so far, we
should be able to create polynomials using them, and do this same sort of
"combining paths" we did earlier. Before we get to that, let me just say
that "things with addition and multiplication that work in the way we
described" have an established name in math - they're called semirings.
A __semiring__ is a set equipped with two operations, one called
"multiplicative" (and thus carrying the symbol \\(\\times\\)) and one
called "additive" (and thus written as \\(+\\)). Both of these operations
need to have an "identity element". The identity element for multiplication
is usually
{{< sidenote "right" "written-as-note" "written as \(1\)," >}}
And I do mean "written as": a semiring need not be over numbers. We could
define one over <a href="https://en.wikipedia.org/wiki/Graph">graphs</a>,
sets, and many other things! Nevertheless, because most of us learn the
properties of addition and multiplication much earlier than we learn about
other more "esoteric" things, using numbers to stand for special elements
seems to help us use our intuition.
{{< /sidenote >}}
and the identity element for addition is written
as \\(0\\). Furthermore, a few equations hold. I'll present them in groups.
First, multiplication is associative and multiplying by \\(1\\) does nothing;
in mathematical terms, the set forms a [monoid](https://mathworld.wolfram.com/Monoid.html)
with multiplication and \\(1\\).
{{< latex >}}
\begin{array}{cl}
(a\times b)\times c = a\times(b\times c) & \text{(multiplication associative)}\\
1\times a = a = a \times 1 & \text{(1 is multiplicative identity)}\\
\end{array}
{{< /latex >}}
Similarly, addition is associative and adding \\(0\\) does nothing.
Addition must also be commutative; in other words, the set forms a
commutative monoid with addition and \\(0\\).
{{< latex >}}
\begin{array}{cl}
(a+b)+c = a+(b+c) & \text{(addition associative)}\\
0+a = a = a+0 & \text{(0 is additive identity)}\\
a+b = b+a & \text{(addition is commutative)}\\
\end{array}
{{< /latex >}}
Finally, a few equations determine how addition and multiplication interact.
{{< latex >}}
\begin{array}{cl}
0\times a = 0 = a \times 0 & \text{(annihilation)}\\
a\times(b+c) = a\times b + a\times c & \text{(left distribution)}\\
(a+b)\times c = a\times c + b\times c & \text{(right distribution)}\\
\end{array}
{{< /latex >}}
That's it, we've defined a semiring. First, notice that numbers do indeed
form a semiring; all the equations above should be quite familiar from algebra
class. When using polynomials with numbers to do our city path finding,
we end up tracking how many different ways there are to get from one place to
another in a particular number of hours. There are, however, other semirings
we can use that yield interesting results, even though we continue to add
and multiply polynomials.
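
If you'd like to poke at the definition, the equations above can be spot-checked mechanically. The sketch below is only an illustration: the `Semiring` record and the `satisfies_laws` helper are hypothetical names, and testing a handful of sample values is evidence that the laws hold, not a proof.

```python
from dataclasses import dataclass
from itertools import product
from typing import Any, Callable

@dataclass(frozen=True)
class Semiring:
    zero: Any                       # additive identity
    one: Any                        # multiplicative identity
    add: Callable[[Any, Any], Any]
    mul: Callable[[Any, Any], Any]

def satisfies_laws(s, samples):
    """Check every semiring equation on all triples drawn from `samples`."""
    for a, b, c in product(samples, repeat=3):
        ok = (
            s.mul(s.mul(a, b), c) == s.mul(a, s.mul(b, c))       # mult. associative
            and s.mul(s.one, a) == a == s.mul(a, s.one)          # 1 is identity
            and s.add(s.add(a, b), c) == s.add(a, s.add(b, c))   # add. associative
            and s.add(s.zero, a) == a == s.add(a, s.zero)        # 0 is identity
            and s.add(a, b) == s.add(b, a)                       # add. commutative
            and s.mul(s.zero, a) == s.zero == s.mul(a, s.zero)   # annihilation
            and s.mul(a, s.add(b, c)) == s.add(s.mul(a, b), s.mul(a, c))  # left dist.
            and s.mul(s.add(a, b), c) == s.add(s.mul(a, c), s.mul(b, c))  # right dist.
        )
        if not ok:
            return False
    return True

naturals = Semiring(0, 1, lambda a, b: a + b, lambda a, b: a * b)
print(satisfies_laws(naturals, range(5)))   # True
```
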
One last thing before we look at other semirings: given a semiring \\(R\\),
the polynomials with coefficients in \\(R\\), written in terms of the variable
\\(x\\), are denoted \\(R[x]\\).
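
In code, working in \\(R[x]\\) for an arbitrary \\(R\\) means the multiplication routine has to take the semiring's operations as inputs instead of hard-coding `+` and `*`. Here is one way that might look, assuming polynomials are dictionaries from exponent to coefficient and that a missing exponent stands for a zero coefficient. The function name and its `add`/`mul` parameters are hypothetical.

```python
import operator

def poly_mul(p, q, add, mul):
    """Multiply polynomials in R[x], stored as {exponent: coefficient}.

    `add` and `mul` are the two operations of the coefficient semiring R.
    """
    result = {}
    for n, a in p.items():
        for m, b in q.items():
            term = mul(a, b)   # order matters if R's multiplication isn't commutative
            degree = n + m
            # Combine with any term of the same degree we've already produced.
            result[degree] = add(result[degree], term) if degree in result else term
    return result

# With ordinary numbers we recover the trip-counting example.
counts = poly_mul({2: 2, 1: 1}, {3: 1, 2: 1}, operator.add, operator.mul)
print(counts)   # {5: 2, 4: 3, 3: 1}
```

Swapping in a different pair of operations is then enough to change what the polynomial "tracks", without touching the multiplication routine itself.
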
#### The Semiring of Booleans, \\(\\mathbb{B}\\)
Alright, it's time for our first non-number example. It will be a simple one,
though - booleans (that's right, `true` and `false` from your favorite
programming language!) form a semiring. In this case, addition is the
"or" operation (aka `||`), in which the result is true if either operand
is true, and false otherwise.
{{< latex >}}
\begin{array}{c}
\text{true} + b = \text{true}\\
b + \text{true} = \text{true}\\
\text{false} + \text{false} = \text{false}
\end{array}
{{< /latex >}}
For addition, the identity element -- our \\(0\\) -- is \\(\\text{false}\\).
Correspondingly, multiplication is the "and" operation (aka `&&`), in which the
result is false if either operand is false, and true otherwise.
{{< latex >}}
\begin{array}{c}
\text{false} \times b = \text{false}\\
b \times \text{false} = \text{false}\\
\text{true} \times \text{true} = \text{true}
\end{array}
{{< /latex >}}
For multiplication, the identity element -- the \\(1\\) -- is \\(\\text{true}\\).
It's not hard to see that _both_ operations are commutative - the first and
second equations for addition, for instance, can be combined to get
\\(\\text{true}+b=b+\\text{true}\\), and the third equation clearly shows
commutativity when both operands are false. The other properties are
easy enough to verify by simple case analysis (there are 8 cases to consider).
The set of booleans is usually denoted as \\(\\mathbb{B}\\), which means
polynomials using booleans are denoted by \\(\\mathbb{B}[x]\\).
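
Since there are only two booleans, the case analysis is small enough to hand to a computer. The following sketch runs through all 8 combinations of three booleans and checks each semiring equation; the names `b_add`, `b_mul`, and `laws_hold` are just for illustration.

```python
from itertools import product

BOOLS = (False, True)

def b_add(a, b):
    return a or b     # "or"; its identity (our 0) is False

def b_mul(a, b):
    return a and b    # "and"; its identity (our 1) is True

laws_hold = all(
    b_mul(b_mul(a, b), c) == b_mul(a, b_mul(b, c))
    and b_mul(True, a) == a == b_mul(a, True)
    and b_add(b_add(a, b), c) == b_add(a, b_add(b, c))
    and b_add(False, a) == a == b_add(a, False)
    and b_add(a, b) == b_add(b, a)
    and b_mul(False, a) == False == b_mul(a, False)
    and b_mul(a, b_add(b, c)) == b_add(b_mul(a, b), b_mul(a, c))
    and b_mul(b_add(a, b), c) == b_add(b_mul(a, c), b_mul(b, c))
    for a, b, c in product(BOOLS, repeat=3)
)
print(laws_hold)   # True
```
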
Let's try some examples. We can't count how many ways there are to get from
A to B in a certain number of hours anymore: booleans aren't numbers!
Instead, what we _can_ do is track _whether or not_ there is a way to get
from A to B in a certain number of hours (call it \\(n\\)). If we can,
we write that as \\(\\text{true}\ x^n = 1x^n = x^n\\). If we can't, we write
that as \\(\\text{false}\ x^n = 0x^n = 0\\). The polynomials corresponding
to our introductory problem are \\(x^2+x^1\\) and \\(x^3+x^2\\). Multiplying
them out gives:
{{< latex >}}
(x^2+x^1)(x^3+x^2) = x^5 + x^4 + x^4 + x^3 = x^5 + x^4 + x^3
{{< /latex >}}
And that's right; if it's possible to get from A to B in either two hours
or one hour, and then from B to C in either three hours or two hours, then
it's possible to get from A to C in either five, four, or three hours. In a
way, polynomials like this give us _less_ information than our original ones
(which were elements of \\(\\mathbb{N}[x]\\), polynomials over the natural numbers \\(\\mathbb{N} = \\{ 0, 1, 2, ... \\}\\)), so it's unclear why we'd prefer them. However,
we're just warming up - there are more interesting semirings for us to
consider!
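
Because a boolean coefficient is either there or not, one compact way to sketch \\(\\mathbb{B}[x]\\) in Python is as a plain set of exponents: \\(n\\) is in the set exactly when the coefficient of \\(x^n\\) is true. That representation is an assumption made for this example. With it, addition ("or") is set union, and the coefficient of \\(x^k\\) in a product is true exactly when some pair of exponents \\(n\\) and \\(m\\) adds up to \\(k\\).

```python
def bool_poly_add(p, q):
    """Add two boolean polynomials, each given as a set of exponents."""
    return p | q

def bool_poly_mul(p, q):
    """Multiply two boolean polynomials, each given as a set of exponents."""
    return {n + m for n in p for m in q}

a_to_b = {2, 1}    # x^2 + x
b_to_c = {3, 2}    # x^3 + x^2
a_to_c = bool_poly_mul(a_to_b, b_to_c)
print(sorted(a_to_c, reverse=True))   # [5, 4, 3]
```

Notice that the two ways of making a four-hour trip collapse into a single entry: "true or true" is still just "true".
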
#### Polynomials over Sets of Paths, \\(\\mathcal{P}(\\Pi)\\)
Until now, we explicitly said that "all paths of the same length are
equivalent". If we're giving directions, though, we might benefit
from knowing not just that there _is_ a way, but what roads that
way is made up of!
To this end, we define the set of paths, \\(\\Pi\\). This set will consist
of the empty path (which we will denote \\(\\circ\\), why not?), street
names (e.g. \\(\\text{Mulholland Dr.}\\) or \\(\\text{Sunset Blvd.}\\)), and
concatenations of paths, written using \\(\\rightarrow\\). For instance,
a path that first takes us on \\(\\text{Highway}\\) and then on
\\(\\text{Exit 4b}\\) will be written as:
{{< latex >}}
\text{Highway}\rightarrow\text{Exit 4b}
{{< /latex >}}
Furthermore, it's not too much of a stretch to say that adding an empty path
to the front or the back of another path doesn't change it. If we use
the letter \\(\\pi\\) to denote a path, this means the following equation:
{{< latex >}}
\circ \rightarrow \pi = \pi = \pi \rightarrow \circ
{{< /latex >}}
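
As a small sketch, paths like these can be modeled as Python tuples of street names, with tuple concatenation playing the role of \\(\\rightarrow\\) and the empty tuple playing the role of \\(\\circ\\). This encoding (and the names `EMPTY` and `concat`) is an assumption for illustration, not the only reasonable one.

```python
EMPTY = ()   # the empty path

def concat(p, q):
    """Travel along path p, then along path q."""
    return p + q

highway = ("Highway",)
exit_4b = ("Exit 4b",)
route = concat(highway, exit_4b)
print(" -> ".join(route))   # Highway -> Exit 4b

# Adding the empty path at either end changes nothing.
print(concat(EMPTY, route) == route == concat(route, EMPTY))   # True
```
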
{{< sidenote "right" "paths-monoid-note" "So those are paths." >}}
In fact, if you clicked through the
<a href="https://mathworld.wolfram.com/Monoid.html">monoid</a>
link earlier, you might be interested to know that paths as defined here
form a monoid with concatenation \(\rightarrow\) and the empty path \(\circ\)
as a unit.
{{< /sidenote >}}
Paths alone, though, aren't enough for our polynomials; we're tracking
different _ways_ to get from one place to another. This is an excellent
use case for sets!
Our next semiring will be that of _sets of paths_. Some elements
of this semiring are \\(\\varnothing\\), also known as the empty set,
\\(\\{\\circ\\}\\), the set containing only the empty path, and the set
containing a path via the highway, and another path via the suburbs:
{{< latex >}}
\{\text{Highway}\rightarrow\text{Exit 4b}, \text{Suburb Rd.}\}
{{< /latex >}}
So what are the addition and multiplication on sets of paths? Addition
is the easier one: it's just the union of sets:
{{< latex >}}
A + B \triangleq A \cup B
{{< /latex >}}
It's well known (and not hard to verify) that set union is commutative
and associative. The additive identity \\(0\\) is simply the empty set
\\(\\varnothing\\). Intuitively, adding "no paths" to another set of
paths doesn't add anything, and thus leaves that other set unchanged.
Multiplication is a little bit more interesting, and uses the path
concatenation operation we defined earlier. We will use this
operation to describe path sequencing; given two sets of paths,
\\(A\\) and \\(B\\), we'll create a new set of paths
consisting of each path from \\(A\\) concatenated with each
path from \\(B\\):
{{< latex >}}
A \times B \triangleq \{ a \rightarrow b\ |\ a \in A, b \in B \}
{{< /latex >}}
The fact that this definition of multiplication on sets is associative
relies on the associativity of path concatenation; if path concatenation
weren't associative, the second equality below would not hold.
{{< latex >}}
\begin{array}{rcl}
A \times (B \times C) & = & \{ a \rightarrow (b \rightarrow c)\ |\ a \in A, b \in B, c \in C \} \\
& \stackrel{?}{=} & \{ (a \rightarrow b) \rightarrow c \ |\ a \in A, b \in B, c \in C \} \\
& = & (A \times B) \times C
\end{array}
{{< /latex >}}
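
Both operations are short to write down in Python if we assume paths are tuples of street names (so that `+` on tuples is path concatenation) and sets of paths are ordinary Python sets. The road names `Bridge`, `Main St.`, and `Back Alley` below are made up for the example, as are the function names.

```python
def path_add(A, B):
    """Addition: every path from either set."""
    return A | B

def path_mul(A, B):
    """Multiplication: each path in A followed by each path in B."""
    return {a + b for a in A for b in B}

A = {("Highway", "Exit 4b"), ("Suburb Rd.",)}
B = {("Bridge",)}                         # hypothetical road names
C = {("Main St.",), ("Back Alley",)}

left = path_mul(A, path_mul(B, C))
right = path_mul(path_mul(A, B), C)
print(left == right)   # True
```

The two sides agree here because tuple concatenation is associative, which is the same dependency the derivation above points out.
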
What's the multiplicative identity? Well, since multiplication concatenates
all the combinations of paths from two sets, we could try making a set of
elements that don't do anything when concatenating. Sound familiar? It should,
that's \\(\\circ\\), the empty path element! We thus define our multiplicative
identity as \\(\\{\\circ\\}\\), and verify that it is indeed the identity:
{{< latex >}}
\begin{gathered}
\{\circ\} \times A = \{ \circ \rightarrow a\ |\ a \in A \} = \{ a \ |\ a \in A \} = A \\
A \times \{\circ\}= \{ a\rightarrow \circ \ |\ a \in A \} = \{ a \ |\ a \in A \} = A
\end{gathered}
{{< /latex >}}
It's not too difficult to verify the annihilation and distribution laws for
sets of paths, either; I won't do that here, though. Finally, let's take
a look at an example. Like before, we'll try to make one that corresponds to
our introductory description of paths from A to B and from B to C. Now we need
to be a little bit creative, and come up with names for all these different
roads between our hypothetical cities. Let's say that \\(\\text{Highway A}\\)
and \\(\\text{Highway B}\\) are the two paths from A to B that take two hours
each, and then \\(\\text{Shortcut}\\) is the path that takes one hour. As for
paths from B to C, let's just call them \\(\\text{Long}\\) for the three-hour
path, and \\(\\text{Short}\\) for the two-hour path. Our two polynomials
are then:
{{< latex >}}
\begin{array}{rcl}
P_1 & = & \{\text{Highway A}, \text{Highway B}\}x^2 + \{\text{Shortcut}\}x \\
P_2 & = & \{\text{Long}\}x^3 + \{\text{Short}\}x^2
\end{array}
{{< /latex >}}
Multiplying them gives:
{{< latex >}}
\begin{array}{rl}
& \{\text{Highway A} \rightarrow \text{Long}, \text{Highway B} \rightarrow \text{Long}\}x^5\\
+ & \{\text{Highway A} \rightarrow \text{Short}, \text{Highway B} \rightarrow \text{Short}, \text{Shortcut} \rightarrow \text{Long}\}x^4\\
+ & \{\text{Shortcut} \rightarrow \text{Short}\}x^3
\end{array}
{{< /latex >}}
This resulting polynomial gives us all the paths from city A to city C,
grouped by their length!
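
For the curious, here is a sketch that reproduces this product in Python. It assumes paths are tuples of road names, sets of paths are `frozenset`s, and polynomials are dictionaries from exponent (hours) to coefficient (a set of paths). All function and variable names are illustrative.

```python
def path_add(A, B):
    """Semiring addition: union of two sets of paths."""
    return A | B

def path_mul(A, B):
    """Semiring multiplication: every path in A followed by every path in B."""
    return frozenset(a + b for a in A for b in B)

def poly_mul(p, q):
    """Multiply two polynomials whose coefficients are sets of paths."""
    result = {}
    for n, A in p.items():
        for m, B in q.items():
            # A missing exponent stands for the empty set, our 0.
            so_far = result.get(n + m, frozenset())
            result[n + m] = path_add(so_far, path_mul(A, B))
    return result

p1 = {2: frozenset({("Highway A",), ("Highway B",)}), 1: frozenset({("Shortcut",)})}
p2 = {3: frozenset({("Long",)}), 2: frozenset({("Short",)})}

routes = poly_mul(p1, p2)
for hours in sorted(routes, reverse=True):
    print(hours, sorted(" -> ".join(path) for path in routes[hours]))
# 5 ['Highway A -> Long', 'Highway B -> Long']
# 4 ['Highway A -> Short', 'Highway B -> Short', 'Shortcut -> Long']
# 3 ['Shortcut -> Short']
```
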