diff --git a/content/blog/search_polynomials.md b/content/blog/search_polynomials.md index eb9889f..3370121 100644 --- a/content/blog/search_polynomials.md +++ b/content/blog/search_polynomials.md @@ -13,14 +13,24 @@ rarely the target audience on this site. However, one particular insight I gleaned from the paper merits additional discussion and demonstration. I'm going to do that here. -We can start with something concrete. Suppose that you're trying to get from -city A to city B, and then from city B to city C. Also suppose that your -trips are measured in one-hour intervals, and that trips of equal duration are -considered equivalent. Given possible routes from A to B, and then given more -routes from B to C, what are the possible routes from A to C you can build up? +In particular, the paper pointed out a connection between polynomials and a +general concept of _search_. In the context of the paper, "search" simply +referred to a way of finding various solutions to some problem, perhaps +like "what are the ways of getting from one place to another?". In this +case, a search would be a computation that explores the space of possible +routes. -In many cases, starting with an example helps build intuition. Maybe there -are two routes from A to B that take two hours each, and one "quick" trip +That all sounds very abstract, so let's start with a concrete example. +Suppose that you're trying to get from city A to city B, and then from city B +to city C. Also suppose that your trips are measured in one-hour intervals +(maybe you round trip lengths, turning 2:45 into 3 hours), and that trips of +equal duration are considered equivalent ("as long as it gets me there!"). +Now, I give you a list of possible routes from city A to city B, and +another list of possible routes from city B to city C, grouped by their length. +Given these two lists, what are the possible routes from A to C? + +Let's make this even more concrete, and start with some actual lists of routes. 
+Maybe there are two routes from A to B that take two hours each, and one "quick" trip that takes only an hour. On top of this, there's one three-hour trip from B to C, and one two-hour trip. Given these building blocks, the list of possible trips from A to C is as follows. @@ -40,7 +50,7 @@ our final report, we need to "combine like terms" - add up the trips from the two matching bullet points, ending up with total of three four-hour trips. Does this feel a little bit familiar? To me, this bears a rather striking -resemblance to an operation we've seen in algebra class: we're multiplying +resemblance to an operation we've seen in high school algebra class: we're multiplying two binomials! Here's the corresponding multiplication: {{< latex >}} @@ -60,10 +70,38 @@ trips from A to B, then adding them just combines the list. If I know one trip that takes two hours (\\(x^2\\)) and someone else knows a shortcut (\\(x\\\)), then we can combine that knowledge (\\(x^2+x\\)). -Well, that's a neat little thing. But we can push this observation a bit -further. To generalize what we've already seen, however, we'll need to -figure out "the bare minimum" of what we need to make polynomial -multiplication work as we'd expect. +{{< dialog >}} +{{< message "question" "reader" >}} +Wait a moment. Sure, we learned about polynomials in algebra class: they're +functions! You put in a number for \(x\), and get another number out. +But you haven't done that, and in fact you haven't even mentioned +functions at all. What's going on? +{{< /message >}} +{{< message "answer" "Daniel" >}} +In this article (and in the paper it's based on), polynomials are viewed in +a more general way than you might be used to. 
The point isn't to think of +them as defining functions on numbers, but to make use of their "shape": a sum +of certain powers of \(x\), like \(ax^n+bx^m+...\) +{{< /message >}} +{{< message "question" "reader" >}} +So we won't be plugging numbers in, or trying to graph the polynomials in +this section? +{{< /message >}} +{{< message "answer" "Daniel" >}} +That's right, we won't be. The sort of thing we're doing here is a bit +closer to abstract algebra +than to high school math. Don't worry if you're not familiar with the +subject, though: I'm trying to explain everything from first principles. +{{< /message >}} +{{< /dialog >}} + +Well, it's a neat little thing that tracking trips corresponds to adding +and multiplying polynomials like that. We can push this observation a bit +further, though. Since our trick relies on multiplying two polynomials, +we'll need to better understand what that multiplication needs in order to behave as we +expect. In particular, we'll need to know what the "bare minimum" is for +working with polynomials: what arithmetic properties must we bring to the table? +Let's take a look at that next. ### Polynomials over Semirings Let's watch what happens when we multiply two binomials, paying really close @@ -92,7 +130,38 @@ we didn't use it here) is that multiplication has to be associative, too. So, what if we didn't use numbers, but rather any _thing_ with two operations, one kind of like \\((\\times)\\) and one kind of like \\((+)\\)? -As long as these operations satisfy the properties we have used so far, we + +{{< dialog >}} +{{< message "question" "reader" >}} +Here, it seems like you're saying that in the polynomials we've seen so +far, it's numbers themselves that need to be commutative, associative, etc. +{{< /message >}} +{{< message "answer" "Daniel" >}} +That's right, I am saying that. We need the \((+)\) and \((\times)\) +operations on numbers to follow the laws I laid out above. 
+{{< /message >}} +{{< message "question" "reader" >}} +Okay, but in your equations above, it's not just numbers that were moved +around using commutativity and associativity: it was variables, like \(x\). +Just earlier you said that we're thinking of the polynomials in terms of +their "shape", and not as functions. If that's the case, why are we allowed to +blur the lines between polynomial and number like that? +{{< /message >}} +{{< message "answer" "Daniel" >}} +Good question. If you want to get really precise, in the abstract view, +adding numbers is not quite the same as adding polynomials. Because of this, +saying that addition commutes for numbers does not immediately tell +us that it commutes for something like \(x\). However, also in the abstract +view, we define how addition and multiplication on polynomials work +using addition and multiplication on numbers. Thus, properties of +numbers make their way into properties of polynomials. +{{< /message >}} +{{< /dialog >}} + + +As I was saying, what if we used some kind of _thing_ other than +numbers, together with notions of what it means to "add" and "multiply" +this _thing_? As long as these operations satisfy the properties we have used so far, we should be able to create polynomials using them, and do this same sort of "combining paths" we did earlier. Before we get to that, let me just say that "things with addition and multiplication that work in the way we @@ -368,9 +437,8 @@ This resulting polynomial gives us all the paths from city A to city C, grouped by their length! #### The Tropical Semiring, \\(\\mathbb{R}\\) -I only have one last semiring left to show you before we move on to something -other than paths between cities. It's a fun semiring though, as even its name -might suggest: we'll take a look at a _tropical semiring_. +I only have one last semiring left to show you. It's a fun semiring though, +as even its name might suggest: we'll take a look at a _tropical semiring_. 
In this semiring, we go back to numbers; particularly, real numbers (e.g., \\(1.34\\), \\(163\\), \\(e\\), that kind of thing). We even use addition --