Make some edits to the polynomial draft

Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
Danila Fedorin 2023-05-22 20:44:50 -07:00
parent 54dccdbc7d
commit 00bec06012
1 changed files with 84 additions and 16 deletions


@ -13,14 +13,24 @@ rarely the target audience on this site. However, one particular insight I
gleaned from the paper merits additional discussion and demonstration. I'm
going to do that here.
We can start with something concrete. Suppose that you're trying to get from
city A to city B, and then from city B to city C. Also suppose that your
trips are measured in one-hour intervals, and that trips of equal duration are
considered equivalent. Given possible routes from A to B, and then given more
routes from B to C, what are the possible routes from A to C you can build up?
In particular, the paper pointed out a connection between polynomials and a
general concept of _search_. In the context of the paper, "search" simply
referred to a way of finding various solutions to some problem, perhaps
like "what are the ways of getting from one place to another?". In this
case, a search would be a computation that explores the space of possible
routes.
In many cases, starting with an example helps build intuition. Maybe there
are two routes from A to B that take two hours each, and one "quick" trip
That all sounds very abstract, so let's start with a concrete example.
Suppose that you're trying to get from city A to city B, and then from city B
to city C. Also suppose that your trips are measured in one-hour intervals
(maybe you round trip lengths, turning 2:45 into 3 hours), and that trips of
equal duration are considered equivalent ("as long as it gets me there!").
Now, I give you a list of possible routes from city A to city B, and
another list of possible routes from city B to city C, grouped by their length.
Given these two lists, what are the possible routes from A to C?
Let's make this even more concrete, and start with some actual lists of routes.
Maybe there are two routes from A to B that take two hours each, and one "quick" trip
that takes only an hour. On top of this, there's one three-hour trip from B
to C, and one two-hour trip. Given these building blocks, the list of
possible trips from A to C is as follows.
@ -40,7 +50,7 @@ our final report, we need to "combine like terms" - add up the trips from
the two matching bullet points, ending up with a total of three four-hour trips.
Does this feel a little bit familiar? To me, this bears a rather striking
resemblance to an operation we've seen in algebra class: we're multiplying
resemblance to an operation we've seen in high school algebra class: we're multiplying
two binomials! Here's the corresponding multiplication:
{{< latex >}}
@ -60,10 +70,38 @@ trips from A to B, then adding them just combines the list. If I know one trip
that takes two hours (\\(x^2\\)) and someone else knows a shortcut (\\(x\\)),
then we can combine that knowledge (\\(x^2+x\\)).
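
To make the correspondence concrete, here is a minimal Python sketch of the route example. The dictionary representation (trip length in hours mapped to number of routes) and the names `add` and `multiply` are assumptions made for illustration; they do not come from the paper.

```python
from collections import defaultdict

def add(p, q):
    """Combine two lists of routes: coefficients of equal powers add up."""
    result = defaultdict(int)
    for poly in (p, q):
        for hours, count in poly.items():
            result[hours] += count
    return dict(result)

def multiply(p, q):
    """Chain every route in p with every route in q.

    Trip lengths (exponents) add, route counts (coefficients) multiply,
    and like terms are combined as they are produced.
    """
    result = defaultdict(int)
    for hours1, count1 in p.items():
        for hours2, count2 in q.items():
            result[hours1 + hours2] += count1 * count2
    return dict(result)

# A to B: two two-hour routes and one one-hour route, i.e. 2x^2 + x.
a_to_b = {2: 2, 1: 1}
# B to C: one three-hour route and one two-hour route, i.e. x^3 + x^2.
b_to_c = {3: 1, 2: 1}

a_to_c = multiply(a_to_b, b_to_c)
print(a_to_c)  # {5: 2, 4: 3, 3: 1}, i.e. 2x^5 + 3x^4 + x^3
```

The `4: 3` entry is the "three four-hour trips" from the bullet list, and `add({2: 1}, {1: 1})` reproduces the \\(x^2+x\\) example of pooling knowledge about routes.
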
Well, that's a neat little thing. But we can push this observation a bit
further. To generalize what we've already seen, however, we'll need to
figure out "the bare minimum" of what we need to make polynomial
multiplication work as we'd expect.
{{< dialog >}}
{{< message "question" "reader" >}}
Wait a moment. Sure, we learned about polynomials in algebra class: they're
functions! You put in a number for \(x\), and get another number out.
But you haven't done that, and in fact you haven't even mentioned
functions at all. What's going on?
{{< /message >}}
{{< message "answer" "Daniel" >}}
In this article (and in the paper it's based on), polynomials are viewed in
a more general way than you might be used to. The point isn't to think of
them as defining functions on numbers, but to make use of their "shape": a sum
of certain powers of \(x\), like \(ax^n+bx^m+...\).
{{< /message >}}
{{< message "question" "reader" >}}
So we won't be plugging numbers in, or trying to graph the polynomials in
this section?
{{< /message >}}
{{< message "answer" "Daniel" >}}
That's right, we won't be. The sort of thing we're doing here is a bit
closer to <a href="https://en.wikipedia.org/wiki/Abstract_algebra">abstract algebra</a>
than to high school math. Don't worry if you're not familiar with the
subject, though: I'm trying to explain everything from first principles.
{{< /message >}}
{{< /dialog >}}
Well, it's a neat little thing that tracking trips corresponds to adding
and multiplying polynomials like that. We can push this observation a bit
further, though. Since our trick relies on multiplying two polynomials,
we'll need to better understand what that multiplication needs in order to
behave as we expect. In particular, we'll need to know what the "bare minimum"
is for working with polynomials: what arithmetic properties must we bring to
the table? Let's take a look at that next.
### Polynomials over Semirings
Let's watch what happens when we multiply two binomials, paying really close
@ -92,7 +130,38 @@ we didn't use it here) is that multiplication has to be associative, too.
So, what if we didn't use numbers, but rather any _thing_ with two
operations, one kind of like \\((\\times)\\) and one kind of like \\((+)\\)?
As long as these operations satisfy the properties we have used so far, we
{{< dialog >}}
{{< message "question" "reader" >}}
Here, it seems like you're saying that in the polynomials we've seen so
far, it's numbers themselves that need to be commutative, associative, etc.
{{< /message >}}
{{< message "answer" "Daniel" >}}
That's right, I am saying that. We need the \((+)\) and \((\times)\)
operations on numbers to follow the laws I laid out above.
{{< /message >}}
{{< message "question" "reader" >}}
Okay, but in your equations above, it's not just numbers that were moved
around using commutativity and associativity: it was variables, like \(x\).
Just earlier you said that we're thinking of the polynomials in terms of
their "shape", and not as functions. If that's the case, why are we allowed to
blur the lines between polynomial and number like that?
{{< /message >}}
{{< message "answer" "Daniel" >}}
Good question. If you want to get really precise, in the abstract view,
adding numbers is not quite the same as adding polynomials. Because of this,
saying that addition commutes for numbers does not <em>immediately</em> tell
us that it commutes for something like \(x\). However, also in the abstract
view, we define how addition and multiplication on polynomials work
<em>using</em> addition and multiplication on numbers. Thus, properties of
numbers make their way into properties of polynomials.
{{< /message >}}
{{< /dialog >}}
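
The last answer in the dialog can be made concrete with a small Python sketch in which the coefficient operations are passed in explicitly, so polynomial addition and multiplication are visibly built out of them. The names `poly_add` and `poly_mul` and the dictionary representation are hypothetical, and the boolean coefficients at the end are only an illustrative stand-in for "something other than numbers".

```python
import operator

def poly_add(p, q, add):
    """Add polynomials stored as {exponent: coefficient} dicts.

    `add` is the coefficient addition; it is the only way that
    coefficients are ever combined here.
    """
    result = dict(p)
    for exp, coeff in q.items():
        result[exp] = add(result[exp], coeff) if exp in result else coeff
    return result

def poly_mul(p, q, add, mul):
    """Multiply polynomials using only the supplied `add` and `mul`.

    Exponents are ordinary integers and simply add; everything about
    the coefficients is delegated to the two operations passed in.
    """
    result = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            exp, coeff = e1 + e2, mul(c1, c2)
            result[exp] = add(result[exp], coeff) if exp in result else coeff
    return result

# Ordinary numbers: (2x^2 + x)(x^3 + x^2)
counts = poly_mul({2: 2, 1: 1}, {3: 1, 2: 1}, operator.add, operator.mul)

# Assumed example: booleans, with `or` playing the role of (+)
# and `and` playing the role of (x).
flags = poly_mul({2: True, 1: True}, {3: True, 2: True},
                 operator.or_, operator.and_)

print(counts)  # {5: 2, 4: 3, 3: 1}
print(flags)   # {5: True, 4: True, 3: True}
```

Because `poly_add` and `poly_mul` never touch a coefficient except through `add` and `mul`, any law those two operations obey (or fail to obey) carries straight over to the polynomials built from them.
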
As I was saying, what if we used some kind of _thing_ other than
numbers, together with notions of what it means to "add" and "multiply"
this _thing_? As long as these operations satisfy the properties we have used so far, we
should be able to create polynomials using them, and do this same sort of
"combining paths" we did earlier. Before we get to that, let me just say
that "things with addition and multiplication that work in the way we
@ -368,9 +437,8 @@ This resulting polynomial gives us all the paths from city A to city C,
grouped by their length!
#### The Tropical Semiring, \\(\\mathbb{R}\\)
I only have one last semiring left to show you before we move on to something
other than paths between cities. It's a fun semiring though, as even its name
might suggest: we'll take a look at a _tropical semiring_.
I only have one last semiring left to show you. It's a fun semiring though,
as even its name might suggest: we'll take a look at a _tropical semiring_.
In this semiring, we go back to numbers; particularly, real numbers (e.g.,
\\(1.34\\), \\(163\\), \\(e\\), that kind of thing). We even use addition --