Add draft of first part of polynomials article

2022-10-23 17:27:30 -07:00 · 2022-10-23 17:27:30 -07:00 · 2439a02dbb
commit 2439a02dbb
parent a785c71c5f
1 changed files with 327 additions and 0 deletions
--- a/content/blog/search_polynomials.md
+++ b/content/blog/search_polynomials.md
@@ -0,0 +1,327 @@
+---
+title: "Search as a Polynomial"
+date: 2022-10-22T14:51:15-07:00
+draft: true
+tags: ["Mathematics"]
+---
+
+Suppose that you're trying to get from city A to city B, and then from city B
+to city C. Also suppose that your trips are measured in one-hour intervals, and
+that trips of equal duration are considered equivalent.
+Given possible routes from A to B, and then given more routes from B to C, what
+are the possible routes from A to C you can build up?
+
+We can try with an example. Maybe there are two routes from A to B that take
+two hours each, and one "quick" trip that takes only an hour. On top of this,
+there's one three-hour trip from B to C, and one two-hour trip. Given these
+building blocks, the list of possible trips from A to C is as follows.
+
+{{< latex >}}
+\begin{aligned}
+    \text{two}\ 2h\ \text{trips} \rightarrow \text{one}\ 3h\ \text{trip} = \text{two}\ 5h\ \text{trips}\\
+    \text{two}\ 2h\ \text{trips} \rightarrow \text{one}\ 2h\ \text{trip} = \text{two}\ 4h\ \text{trips}\\
+    \text{one}\ 1h\ \text{trip} \rightarrow \text{one}\ 3h\ \text{trip} = \text{one}\ 4h\ \text{trip}\\
+    \text{one}\ 1h\ \text{trip} \rightarrow \text{one}\ 2h\ \text{trip} = \text{one}\ 3h\ \text{trip}\\
+    \textbf{total:}\ \text{two}\ 5h\ \text{trips}, \text{three}\ 4h\ \text{trips}, \text{one}\ 3h\ \text{trip}
+\end{aligned}
+{{< /latex >}}
+
+Does this look a little bit familiar? We're combining every length of trip
+from A to B with every length of trip from B to C, and then totaling them up.
+In other words, we're multiplying two binomials!
+
+{{< latex >}}
+\left(2x^2 + x\right)\left(x^3+x^2\right) = 2x^5 + 2x^4 + x^4 + x^3 = \underline{2x^5+3x^4+x^3}
+{{< /latex >}}
+
+In fact, they don't have to be binomials. We can represent any combination
+of trips of various lengths as a polynomial. Each term \\(ax^n\\) represents
+\\(a\\) trips of length \\(n\\). As we just saw, multiplying two polynomials
+corresponds to "sequencing" the trips they represent -- matching each trip in
+one with each of the trips in the other, and totaling them up.
+
+What about adding polynomials? What does that correspond to? The answer there
+is actually quite simple: if two polynomials both represent (distinct) lists of
+trips from A to B, then adding them just combines the lists. If I know one trip
+that takes two hours (\\(x^2\\)) and someone else knows a shortcut (\\(x\\)),
+then we can combine that knowledge (\\(x^2+x\\)).
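+
+As a small worked example (the split of who knows which route is made up for
+illustration): suppose I know one two-hour route and the shortcut, while a
+friend knows a _different_ two-hour route. Adding the two polynomials
+coefficient by coefficient recovers the A-to-B polynomial from the introduction:
+
+{{< latex >}}
+\left(x^2 + x\right) + x^2 = (1+1)x^2 + x = 2x^2 + x
+{{< /latex >}}
+
+The "distinct" caveat matters here: had we both known the _same_ two-hour
+route, the sum would have counted it twice.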
+
+Well, that's a neat little thing, and pretty quick to demonstrate, too. But
+we can push this observation a bit further. To generalize what we've already
+seen, however, we'll need to figure out "the bare minimum" of what we need to
+make polynomial multiplication work as we'd expect.
+
+### Polynomials over Semirings
+Let's watch what happens when we multiply two binomials, paying really close
+attention to the operations we're performing. The following (concrete)
+example should do.
+
+{{< latex >}}
+\begin{aligned}
+    & (x+1)(1-x)\\
+    =\ & (x+1)1+(x+1)(-x)\\
+    =\ & x+1-x^2-x \\
+    =\ & x-x+1-x^2 \\
+    =\ & 1-x^2
+\end{aligned}
+{{< /latex >}}
+
+The first thing we do is _distribute_ the multiplication over the addition, on
+the left. We then do that again, on the right this time. After this, we finally
+get some terms, but they aren't properly grouped together; an \\(x\\) is at the
+front, and a \\(-x\\) is at the very back. We use the fact that addition is
+_commutative_ (\\(a+b=b+a\\)) and _associative_ (\\(a+(b+c)=(a+b)+c\\)) to
+rearrange the terms, grouping the \\(x\\) and its negation together. This
+gives us \\((1-1)x=0x=0\\). That last step is important: we've used the fact
+that multiplication by zero gives zero. We didn't use it in this example,
+but another important property we want is for multiplication to be associative,
+too.
+
+So, what if we didn't use numbers, but rather any _thing_ with two
+operations, one kind of like \\((\\times)\\) and one kind of like \\((+)\\)?
+As long as these operations satisfy the properties we have used so far, we
+should be able to create polynomials using them, and do this same sort of
+"combining paths" we did earlier. Before we get to that, let me just say
+that "things with addition and multiplication that work in the way we
+described" have an established name in math - they're called semirings.
+
+A __semiring__ is a set equipped with two operations, one called
+"multiplicative" (and thus carrying the symbol \\(\\times\\)) and one
+called "additive" (and thus written as \\(+\\)). Both of these operations
+need to have an "identity element". The identity element for multiplication
+is usually
+{{< sidenote "right" "written-as-note" "written as \(1\)," >}}
+And I do mean "written as": a semiring need not be over numbers. We could
+define one over <a href="https://en.wikipedia.org/wiki/Graph">graphs</a>,
+sets, and many other things! Nevertheless, because most of us learn the
+properties of addition and multiplication much earlier than we learn about
+other more "esoteric" things, using numbers to stand for special elements
+seems to help use intuition.
+{{< /sidenote >}}
+and the identity element for addition is written
+as \\(0\\). Furthermore, a few equations hold. I'll present them in groups.
+First, multiplication is associative and multiplying by \\(1\\) does nothing;
+in mathematical terms, the set forms a [monoid](https://mathworld.wolfram.com/Monoid.html)
+with multiplication and \\(1\\).
+{{< latex >}}
+\begin{array}{cl}
+    (a\times b)\times c = a\times(b\times c) & \text{(multiplication associative)}\\
+    1\times a = a = a \times 1 & \text{(1 is multiplicative identity)}\\
+\end{array}
+{{< /latex >}}
+
+Similarly, addition is associative and adding \\(0\\) does nothing.
+Addition must also be commutative; in other words, the set forms a
+commutative monoid with addition and \\(0\\).
+{{< latex >}}
+\begin{array}{cl}
+    (a+b)+c = a+(b+c) & \text{(addition associative)}\\
+    0+a = a = a+0 & \text{(0 is additive identity)}\\
+    a+b = b+a & \text{(addition is commutative)}\\
+\end{array}
+{{< /latex >}}
+
+Finally, a few equations determine how addition and multiplication interact.
+{{< latex >}}
+\begin{array}{cl}
+    0\times a = 0 = a \times 0 & \text{(annihilation)}\\
+    a\times(b+c) = a\times b + a\times c & \text{(left distribution)}\\
+    (a+b)\times c = a\times c + b\times c & \text{(right distribution)}\\
+\end{array}
+{{< /latex >}}
+
+That's it, we've defined a semiring. First, notice that numbers do indeed
+form a semiring; all the equations above should be quite familiar from algebra
+class. When using polynomials with numbers to do our city path finding,
+we end up tracking how many different ways there are to get from one place to
+another in a particular number of hours. There are, however, other semirings
+we can use that yield interesting results, even though we continue to add
+and multiply polynomials.
+
+One last thing before we look at other semirings: given a semiring \\(R\\),
+the polynomials with coefficients in \\(R\\), written in terms of the variable
+\\(x\\), are denoted as \\(R[x]\\).
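+
+To make the "only semiring operations" idea concrete, here is a sketch of the
+general product in \\(R[x]\\). The coefficient names \\(a_i\\) and \\(b_j\\) are
+introduced just for this illustration, and the sketch assumes (as is standard
+for polynomials) that powers of \\(x\\) combine as \\(x^i x^j = x^{i+j}\\) and
+that \\(x\\) can be moved past coefficients:
+
+{{< latex >}}
+\left(\sum_{i=0}^{m} a_i x^i\right)\left(\sum_{j=0}^{n} b_j x^j\right) = \sum_{k=0}^{m+n} \left(\sum_{i+j=k} a_i \times b_j\right) x^k
+{{< /latex >}}
+
+Each coefficient of the result is built with nothing but the semiring's
+\\(\\times\\) and \\(+\\). Note that the factors are kept in the order
+\\(a_i \\times b_j\\), since the semiring laws above don't require
+multiplication to be commutative.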
+
+
+#### The Semiring of Booleans, \\(\\mathbb{B}\\)
+Alright, it's time for our first non-number example. It will be a simple one,
+though - booleans (that's right, `true` and `false` from your favorite
+programming language!) form a semiring. In this case, addition is the
+"or" operation (aka `||`), in which the result is true if either operand
+is true, and false otherwise.
+
+{{< latex >}}
+\begin{array}{c}
+    \text{true} + b = \text{true}\\
+    b + \text{true} = \text{true}\\
+    \text{false} + \text{false} = \text{false}
+\end{array}
+{{< /latex >}}
+
+For addition, the identity element -- our \\(0\\) -- is \\(\\text{false}\\).
+
+Correspondingly, multiplication is the "and" operation (aka `&&`), in which the
+result is false if either operand is false, and true otherwise.
+
+{{< latex >}}
+\begin{array}{c}
+    \text{false} \times b = \text{false}\\
+    b \times \text{false} = \text{false}\\
+    \text{true} \times \text{true} = \text{true}
+\end{array}
+{{< /latex >}}
+
+For multiplication, the identity element -- the \\(1\\) -- is \\(\\text{true}\\).
+
+It's not hard to see that _both_ operations are commutative - the first and
+second equations for addition, for instance, can be combined to get
+\\(\\text{true}+b=b+\\text{true}\\), and the third equation clearly shows
+commutativity when both operands are false. The other properties are
+easy enough to verify by simple case analysis (there are 8 cases to consider).
+The set of booleans is usually denoted as \\(\\mathbb{B}\\), which means
+polynomials using booleans are denoted by \\(\\mathbb{B}[x]\\).
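+
+As a sketch of what that case analysis can look like, here is left
+distribution, split on the value of \\(a\\) (which covers all eight
+combinations of \\(a\\), \\(b\\), and \\(c\\) at once). It uses only the
+defining equations above and the fact that \\(\\text{true}\\) is the
+multiplicative identity:
+
+{{< latex >}}
+\begin{array}{ll}
+    a = \text{false}: & \text{false}\times(b+c) = \text{false} = \text{false}+\text{false} = \text{false}\times b + \text{false}\times c\\
+    a = \text{true}: & \text{true}\times(b+c) = b+c = \text{true}\times b + \text{true}\times c
+\end{array}
+{{< /latex >}}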
+
+Let's try some examples. We can't count how many ways there are to get from
+A to B in a certain number of hours anymore: booleans aren't numbers!
+Instead, what we _can_ do is track _whether or not_ there is a way to get
+from A to B in a certain number of hours (call it \\(n\\)). If we can,
+we write that as \\(\text{true}\ x^n = 1x^n = x^n\\). If we can't, we write
+that as \\(\\text{false}\ x^n = 0x^n = 0\\). The polynomials corresponding
+to our introductory problem are \\(x^2+x^1\\) and \\(x^3+x^2\\). Multiplying
+them out gives:
+
+{{< latex >}}
+(x^2+x^1)(x^3+x^2) = x^5 + x^4 + x^4 + x^3 = x^5 + x^4 + x^3
+{{< /latex >}}
+
+And that's right; if it's possible to get from A to B in either two hours
+or one hour, and then from B to C in either three hours or two hours, then
+it's possible to get from A to C in either five, four, or three hours. In a
+way, polynomials like this give us _less_ information than our original ones
+(which were \\(\\mathbb{N}[x]\\), polynomials over natural numbers \\(\\mathbb{N} = \\{ 0, 1, 2, ... \\}\\)), so it's unclear why we'd prefer them. However,
+we're just warming up - there are more interesting semirings for us to
+consider!
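+
+The only step in the boolean product above that differs from ordinary
+arithmetic is merging the two \\(x^4\\) terms. Spelled out, with the
+coefficients written explicitly and then collected, it looks like this:
+
+{{< latex >}}
+x^4 + x^4 = \text{true}\,x^4 + \text{true}\,x^4 = (\text{true}+\text{true})x^4 = \text{true}\,x^4 = x^4
+{{< /latex >}}
+
+Over \\(\\mathbb{N}\\), the same step would have produced \\(2x^4\\); over
+\\(\\mathbb{B}\\), the count is forgotten and only reachability remains.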
+
+#### Polynomials over Sets of Paths, \\(\\mathcal{P}(\\Pi)\\)
+Until now, we explicitly said that "all paths of the same length are
+equivalent". If we're giving directions, though, we might benefit
+from knowing not just that there _is_ a way, but what roads that
+way is made up of!
+
+To this end, we define the set of paths, \\(\\Pi\\). This set will consist
+of the empty path (which we will denote \\(\\circ\\), why not?), street
+names (e.g. \\(\\text{Mulholland Dr.}\\) or \\(\\text{Sunset Blvd.}\\)), and
+concatenations of paths, written using \\(\\rightarrow\\). For instance,
+a path that first takes us on \\(\\text{Highway}\\) and then on
+\\(\\text{Exit 4b}\\) will be written as:
+
+{{< latex >}}
+\text{Highway}\rightarrow\text{Exit 4b}
+{{< /latex >}}
+
+Furthermore, it's not too much of a stretch to say that adding an empty path
+to the front or the back of another path doesn't change it. If we use
+the letter \\(\\pi\\) to denote a path, this means the following equation:
+
+{{< latex >}}
+\circ \rightarrow \pi = \pi = \pi \rightarrow \circ
+{{< /latex >}}
+
+{{< sidenote "right" "paths-monoid-note" "So those are paths." >}}
+In fact, if you clicked through the
+<a href="https://mathworld.wolfram.com/Monoid.html">monoid</a>
+link earlier, you might be interested to know that paths as defined here
+form a monoid with concatenation \(\rightarrow\) and the empty path \(\circ\)
+as a unit.
+{{< /sidenote >}}
+Paths alone, though, aren't enough for our polynomials; we're tracking
+different _ways_ to get from one place to another. This is an excellent
+use case for sets!
+
+Our next semiring will be that of _sets of paths_. Some elements
+of this semiring are \\(\\varnothing\\), also known as the empty set,
+\\(\\{\\circ\\}\\), the set containing only the empty path, and the set
+containing a path via the highway, and another path via the suburbs:
+
+{{< latex >}}
+\{\text{Highway}\rightarrow\text{Exit 4b}, \text{Suburb Rd.}\}
+{{< /latex >}}
+
+So what are the addition and multiplication on sets of paths? Addition
+is the easier one: it's just the union of sets:
+
+{{< latex >}}
+A + B \triangleq A \cup B
+{{< /latex >}}
+
+It's well known (and not hard to verify) that set union is commutative
+and associative. The additive identity \\(0\\) is simply the empty set
+\\(\\varnothing\\). Intuitively, adding "no paths" to another set of
+paths doesn't add anything, and thus leaves that other set unchanged.
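+
+For instance, reusing the example paths from above, combining two one-element
+sets gives the two-element set we saw earlier, and adding the empty set
+changes nothing:
+
+{{< latex >}}
+\begin{array}{rcl}
+\{\text{Highway}\rightarrow\text{Exit 4b}\} + \{\text{Suburb Rd.}\} & = & \{\text{Highway}\rightarrow\text{Exit 4b}, \text{Suburb Rd.}\}\\
+\{\text{Suburb Rd.}\} + \varnothing & = & \{\text{Suburb Rd.}\}\\
+\{\text{Suburb Rd.}\} + \{\text{Suburb Rd.}\} & = & \{\text{Suburb Rd.}\}
+\end{array}
+{{< /latex >}}
+
+The last line shows a difference from numbers: like "or" on booleans, union
+is idempotent, so adding a set to itself doesn't double anything.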
+
+Multiplication is a little bit more interesting, and uses the path
+concatenation operation we defined earlier. We will use this
+operation to describe path sequencing; given two sets of paths,
+\\(A\\) and \\(B\\), we'll create a new set of paths
+consisting of each path from \\(A\\) concatenated with each
+path from \\(B\\):
+
+{{< latex >}}
+A \times B = \{ a \rightarrow b\ |\ a \in A, b \in B \}
+{{< /latex >}}
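+
+As a quick illustration (the particular sets are picked just for this
+example), here's a product in which one of the paths on the left is the empty
+path. The first step applies the definition, and the second uses
+\\(\\circ \\rightarrow \\pi = \\pi\\):
+
+{{< latex >}}
+\{\circ, \text{Highway}\} \times \{\text{Exit 4b}\} = \{\circ \rightarrow \text{Exit 4b}, \text{Highway} \rightarrow \text{Exit 4b}\} = \{\text{Exit 4b}, \text{Highway} \rightarrow \text{Exit 4b}\}
+{{< /latex >}}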
+
+The fact that this definition of multiplication on sets is associative
+relies on the associativity of path concatenation; if path concatenation
+weren't associative, the second equality below would not hold.
+
+{{< latex >}}
+\begin{array}{rcl}
+A \times (B \times C) & = & \{ a \rightarrow (b \rightarrow c)\ |\ a \in A, b \in B, c \in C \} \\
+    & \stackrel{?}{=} & \{ (a \rightarrow b) \rightarrow c \ |\ a \in A, b \in B, c \in C \} \\
+    & = & (A \times B) \times C
+\end{array}
+{{< /latex >}}
+
+What's the multiplicative identity? Well, since multiplication concatenates
+all the combinations of paths from two sets, we could try making a set of
+elements that don't do anything when concatenating. Sound familiar? It should,
+that's \\(\\circ\\), the empty path element! We thus define our multiplicative
+identity as \\(\\{\\circ\\}\\), and verify that it is indeed the identity:
+
+{{< latex >}}
+\begin{gathered}
+\{\circ\} \times A = \{ \circ \rightarrow a\ |\ a \in A \} = \{ a \ |\  a \in A \} = A \\
+A \times \{\circ\}= \{ a\rightarrow \circ \ |\ a \in A \} = \{ a \ |\  a \in A \} = A
+\end{gathered}
+{{< /latex >}}
+
+It's not too difficult to verify the annihilation and distribution laws for
+sets of paths, either; I won't do that here, though. Finally, let's take
+a look at an example. Like before, we'll try to make one that corresponds to
+our introductory description of paths from A to B and from B to C. Now we need
+to be a little bit creative, and come up with names for all these different
+roads between our hypothetical cities. Let's say that \\(\\text{Highway A}\\)
+and \\(\\text{Highway B}\\) are the two paths from A to B that take two hours
+each, and then \\(\\text{Shortcut}\\) is the path that takes one hour. As for
+paths from B to C, let's just call them \\(\\text{Long}\\) for the three-hour
+path, and \\(\\text{Short}\\) for the two-hour path. Our two polynomials
+are then:
+
+{{< latex >}}
+\begin{array}{rcl}
+P_1 & = & \{\text{Highway A}, \text{Highway B}\}x^2 + \{\text{Shortcut}\}x \\
+P_2 & = & \{\text{Long}\}x^3 + \{\text{Short}\}x^2
+\end{array}
+{{< /latex >}}
+
+Multiplying them gives:
+{{< latex >}}
+\begin{array}{rl}
+    & \{\text{Highway A} \rightarrow \text{Long}, \text{Highway B} \rightarrow \text{Long}\}x^5\\
+    + & \{\text{Highway A} \rightarrow \text{Short}, \text{Highway B} \rightarrow \text{Short}, \text{Shortcut} \rightarrow \text{Long}\}x^4\\
+    + & \{\text{Shortcut} \rightarrow \text{Short}\}x^3
+\end{array}
+{{< /latex >}}
+
+This resulting polynomial gives us all the paths from city A to city C,
+grouped by their length!
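+
+For reference, here is a sketch of the intermediate step, before the set
+products are expanded. Each coefficient pairs up the coefficients of
+\\(P_1\\) and \\(P_2\\) whose exponents add up to the right power:
+
+{{< latex >}}
+\begin{array}{rl}
+    & \left(\{\text{Highway A}, \text{Highway B}\} \times \{\text{Long}\}\right)x^5\\
+    + & \left(\{\text{Highway A}, \text{Highway B}\} \times \{\text{Short}\} + \{\text{Shortcut}\} \times \{\text{Long}\}\right)x^4\\
+    + & \left(\{\text{Shortcut}\} \times \{\text{Short}\}\right)x^3
+\end{array}
+{{< /latex >}}
+
+Expanding each \\(\\times\\) with the definition of multiplication on sets of
+paths, and each \\(+\\) as a union, gives the result above.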