406 lines
		
	
	
		
			17 KiB
		
	
	
	
		
			Markdown
		
	
	
	
	
	
			
		
		
	
	
			406 lines
		
	
	
		
			17 KiB
		
	
	
	
		
			Markdown
		
	
	
	
	
	
| ---
 | |
| title: "Time Traveling In Haskell: How It Works And How To Use It"
 | |
| date: 2020-05-03T20:05:29-07:00
 | |
| tags: ["Haskell"]
 | |
| draft: true
 | |
| ---
 | |
| 
 | |
| <style>
 | |
| img, figure.small img { max-height: 20rem; }
 | |
| figure.medium img { max-height: 30rem; }
 | |
| </style>
 | |
| 
 | |
| I recently got to use a very curious Haskell technique
 | |
| {{< sidenote "right" "production-note" "in production:" >}}
 | |
| As production as research code gets, anyway!
 | |
| {{< /sidenote >}} time traveling. I say this with
 | |
| the utmost seriousness. This technique worked like
 | |
| magic for the problem I was trying to solve (which isn't
 | |
| interesting enough to be presented here in itself), and so
 | |
| I thought I'd share what I learned. In addition
 | |
| to the technique and its workings, I will also explain how 
 | |
| time traveling can be misused, yielding computations that
 | |
| never terminate.
 | |
| 
 | |
| ### Time Traveling
 | |
| Some time ago, I read [this post](https://kcsongor.github.io/time-travel-in-haskell-for-dummies/) by Csongor Kiss about time traveling in Haskell. It's
 | |
| really cool, and makes a lot of sense if you have wrapped your head around
 | |
| lazy evaluation. I'm going to present my take on it here, but please check out
 | |
| Csongor's original post if you are interested.
 | |
| 
 | |
| Say that you have a list of integers, like `[3,2,6]`. Next, suppose that
 | |
| you want to find the maximum value in the list. You can implement such
 | |
| behavior quite simply with pattern matching:
 | |
| 
 | |
| ```Haskell
 | |
| myMax :: [Int] -> Int
 | |
| myMax [] = error "Being total sucks"
 | |
| myMax (x:xs) = max x $ myMax xs
 | |
| ```
 | |
| 
 | |
| You could even get fancy with a `fold`:
 | |
| 
 | |
| ```Haskell
 | |
| myMax :: [Int] -> Int
 | |
| myMax = foldr1 max
 | |
| ```
 | |
| 
 | |
| All is well, and this is rather elementary Haskell. But now let's look at
 | |
| something that Csongor calls the `repMax` problem:
 | |
| 
 | |
| > Imagine you had a list, and you wanted to replace all the elements of the
 | |
| > list with the largest element, by only passing the list once.
 | |
| 
 | |
| How can we possibly do this in one pass? First, we need to find the maximum
 | |
| element, and only then can we have something to replace the other numbers
 | |
| with! It turns out, though, that we can just expect to have the future
 | |
| value, and all will be well. Csongor provides the following example:
 | |
| 
 | |
| ```Haskell
 | |
| repMax :: [Int] -> Int -> (Int, [Int])
 | |
| repMax [] rep = (rep, [])
 | |
| repMax [x] rep = (x, [rep])
 | |
| repMax (l : ls) rep = (m', rep : ls')
 | |
|   where (m, ls') = repMax ls rep
 | |
|         m' = max m l
 | |
| ```
 | |
| In this example, `repMax` takes the list of integers,
 | |
| each of which it must replace with their maximum element.
 | |
| It also takes __as an argument__ the maximum element,
 | |
| as if it had already been computed. It does, however,
 | |
| still compute the intermediate maximum element,
 | |
| in the form of `m'`. Otherwise, where would the future
 | |
| value even come from?
 | |
| 
 | |
| Thus far, nothing too magical has happened. It's a little
 | |
| strange to expect the result of the computation to be
 | |
| given to us; however, thus far, it looks like wishful
 | |
| thinking. The real magic happens in Csongor's `doRepMax`
 | |
| function:
 | |
| 
 | |
| ```Haskell
 | |
| doRepMax :: [Int] -> [Int]
 | |
| doRepMax xs = xs'
 | |
|   where (largest, xs') = repMax xs largest
 | |
| ```
 | |
| 
 | |
| Look, in particular, on the line with the `where` clause.
 | |
| We see that `repMax` returns the maximum element of the
 | |
| list, `largest`, and the resulting list `xs'` consisting
 | |
| only of `largest` repeated as many times as `xs` had elements.
 | |
| But what's curious is the call to `repMax` itself. It takes
 | |
| as input `xs`, the list we're supposed to process... and
 | |
| `largest`, the value that _it itself returns_.
 | |
| 
 | |
| This works because Haskell's evaluation model is, effectively,
 | |
| [lazy graph reduction](https://en.wikipedia.org/wiki/Graph_reduction). That is,
 | |
| you can think of Haskell as manipulating your code as
 | |
| {{< sidenote "right" "tree-note" "a syntax tree," >}}
 | |
| Why is it called graph reduction, you may be wondering, if the runtime is
 | |
| manipulating syntax trees? To save on work, if a program refers to the
 | |
| same value twice, Haskell has both of those references point to the
 | |
| exact same graph. This violates the tree's property of having only one path
 | |
| from the root to any node, and makes our program a graph. Graphs that
 | |
| refer to themselves also violate the properties of a tree.
 | |
| {{< /sidenote >}} performing
 | |
| substitutions and simplifications as necessary until it reaches a final answer.
 | |
| What the lazy part means is that parts of the syntax tree that are not yet
 | |
| needed to compute the final answer can exist, unsimplified, in the tree.
 | |
| Why don't we draw a few graphs to get familiar with the idea?
 | |
| 
 | |
| ### Visualizing Graphs and Their Reduction
 | |
| Let's start with something that doesn't have anything fancy. We can
 | |
| take a look at the graph of the expression:
 | |
| 
 | |
| ```Haskell
 | |
| length [1]
 | |
| ```
 | |
| 
 | |
| Stripping away Haskell's syntax sugar for lists, we can write this expression as:
 | |
| 
 | |
| ```Haskell
 | |
| length (1:[])
 | |
| ```
 | |
| 
 | |
| Then, recalling that `(:)`, or 'cons', is just a binary function, we rewrite:
 | |
| 
 | |
| ```Haskell
 | |
| length ((:) 1 [])
 | |
| ```
 | |
| 
 | |
| We're now ready to draw the graph; in this case, it's pretty much identical
 | |
| to the syntax tree of the last form of our expression:
 | |
| 
 | |
| {{< figure src="length_1.png" caption="The initial graph of `length [1]`." >}}
 | |
| 
 | |
| In this image, the `@` nodes represent function application. The
 | |
| root node is an application of the function `length` to the graph that
 | |
| represents the list `[1]`. The list itself is represented using two
 | |
| application nodes: `(:)` takes two arguments, the head and tail of the
 | |
| list, and function applications in Haskell are
 | |
| [curried](https://en.wikipedia.org/wiki/Currying). Eventually,
 | |
| in the process of evaluation, the body of `length` will be reached,
 | |
| and leave us with the following graph:
 | |
| 
 | |
| {{< figure src="length_2.png" caption="The graph of `length [1]` after the body of `length` is expanded." >}}
 | |
| 
 | |
| Conceptually, we only took one reduction step, and thus, we haven't yet gotten
 | |
| to evaluating the recursive call to `length`. Since `(+)`
 | |
| is also a binary function, `1+length xs` is represented in this
 | |
| new graph as two applications of `(+)`, first to `1`, and then
 | |
| to `length []`.
 | |
| 
 | |
| But what is that box at the root? This box _used to be_ the root of the
 | |
| first graph, which was an application node. However, it is now a
 | |
| an _indirection_. Conceptually, reducing
 | |
| this indirection amounts to reducing the graph
 | |
| it points to. But why have we {{< sidenote "right" "altered-note" "altered the graph" >}}
 | |
| This is a key aspect of implementing functional languages.
 | |
| The language itself may be pure, while the runtime
 | |
| can be, and usually is, impure and stateful. After all,
 | |
| computers are impure and stateful, too!
 | |
| {{< /sidenote >}} in this manner? Because Haskell is a pure language,
 | |
| of course! If we know that a particular graph reduces to some value,
 | |
| there's no reason to reduce it again. However, as we will
 | |
| soon see, it may be _used_ again, so we want to preserve its value.
 | |
| Thus, when we're done reducing a graph, we replace its root node with
 | |
| an indirection that points to its result. 
 | |
| 
 | |
| When can a graph be used more than once? Well, how about this:
 | |
| 
 | |
| ```Haskell
 | |
| let x = square 5 in x + x
 | |
| ```
 | |
| 
 | |
| Here, the initial graph looks as follows:
 | |
| 
 | |
| {{< figure src="square_1.png" caption="The initial graph of `let x = square 5 in x + x`." >}}
 | |
| 
 | |
| As you can see, this _is_ a graph, but not a tree! Since both
 | |
| variables `x` refer to the same expression, `square 5`, they
 | |
| are represented by the same subgraph. Then, when we evaluate `square 5`
 | |
| for the first time, and replace its root node with an indirection,
 | |
| we end up with the following:
 | |
| 
 | |
| {{< figure src="square_2.png" caption="The graph of `let x = square 5 in x + x` after `square 5` is reduced." >}}
 | |
| 
 | |
| There are two `25`s in the tree, and no more `square`s! We only
 | |
| had to evaluate `square 5` exactly once, even though `(+)`
 | |
| will use it twice (once for the left argument, and once for the right).
 | |
| 
 | |
| Our graphs can also include cycles.
 | |
| A simple, perhaps _the most_ simple example of this in practice is Haskell's
 | |
| `fix` function. It computes a function's fixed point,
 | |
| {{< sidenote "right" "fixpoint-note" "and can be used to write recursive functions." >}}
 | |
| In fact, in the lambda calculus, <code>fix</code> is pretty much <em>the only</em>
 | |
| way to write recursive functions. In the untyped lambda calculus, it can
 | |
| be written as: $$\lambda f . (\lambda x . f (x \ x)) \  (\lambda x . f (x \ x))$$
 | |
| In the simply typed lambda calculus, it cannot be written in any way, and
 | |
| needs to be added as an extension, typically written as \(\textbf{fix}\).
 | |
| {{< /sidenote >}}
 | |
| It's implemented as follows:
 | |
| 
 | |
| ```Haskell
 | |
| fix f = let x = f x in x
 | |
| ```
 | |
| 
 | |
| See how the definition of `x` refers to itself? This is what
 | |
| it looks like in graph form:
 | |
| 
 | |
| {{< figure src="fixpoint_1.png" caption="The initial graph of `let x = f x in x`." >}}
 | |
| 
 | |
| I think it's useful to take a look at how this graph is processed. Let's
 | |
| pick `f = (1:)`. That is, `f` is a function that takes a list,
 | |
| and prepends `1` to it. Then, after constructing the graph of `f x`,
 | |
| we end up with the following:
 | |
| 
 | |
| {{< figure src="fixpoint_2.png" caption="The graph of `fix (1:)` after it's been reduced." >}}
 | |
| 
 | |
| We see the body of `f`, which is the application of `(:)` first to the
 | |
| constant `1`, and then to `f`'s argument (`x`, in this case). As
 | |
| before, once we evaluated `f x`, we replaced the application with
 | |
| an indirection; in the image, this indirection is the top box. But the
 | |
| argument, `x`, is itself an indirection which points to the root of `f x`,
 | |
| thereby creating a cycle in our graph. 
 | |
| 
 | |
| Almost there! A node can refer to itself, and, when evaluated, it
 | |
| is replaced with its own value. Thus, a node can effectively reference
 | |
| its own value! The last piece of the puzzle is how a node can access
 | |
| _parts_ of its own value: recall that `doRepMax` calls `repMax`
 | |
| with only `largest`, while `repMax` returns `(largest, xs')`.
 | |
| I have to admit, I don't know the internals of GHC, but I suspect
 | |
| this is done by translating the code into something like:
 | |
| 
 | |
| ```Haskell
 | |
| doRepMax :: [Int] -> [Int]
 | |
| doRepMax xs = snd t
 | |
|   where t = repMax xs (fst t)
 | |
| ```
 | |
| 
 | |
| ### Detailed Example: Reducing `doRepMax`
 | |
| 
 | |
| If the above examples haven't elucidated how `doRepMax` works,
 | |
| stick around in this section and we will go through it step-by-step.
 | |
| This is a rather long and detailed example, so feel free to skip
 | |
| this section to read more about actually using time traveling.
 | |
| 
 | |
| If you're sticking around, why don't we watch how the graph of `doRepMax [1, 2]` unfolds.
 | |
| This example will be more complex than the ones we've seen
 | |
| so far; to avoid overwhelming ourselves with notation,
 | |
| let's adopt a different convention of writing functions. Instead
 | |
| of using application nodes `@`, let's draw an application of a
 | |
| function `f` to arguments `x1` through `xn` as a subgraph with root `f`
 | |
| and children `x`s. The below figure demonstrates what I mean:
 | |
| 
 | |
| {{< figure src="notation.png" caption="The new visual notation used in this section." >}}
 | |
| 
 | |
| Now, let's write the initial graph for `doRepMax [1,2]`:
 | |
| 
 | |
| {{< figure src="repmax_1.png" caption="The initial graph of `doRepMax [1,2]`." >}}
 | |
| 
 | |
| Other than our new notation, there's nothing too surprising here.
 | |
| At a high level, all we want is the second element of the tuple
 | |
| returned by `repMax`, which contains the output list. To get
 | |
| the tuple, we apply `repMax` to the list `[1,2]`, which itself
 | |
| consists of two uses of the `(:)` function.
 | |
| 
 | |
| The first step
 | |
| of our hypothetical reduction would replace the application of `doRepMax` with its
 | |
| body, and create our graph's first cycle:
 | |
| 
 | |
| {{< figure src="repmax_2.png" caption="The first step of reducing `doRepMax [1,2]`." >}}
 | |
| 
 | |
| Next, we would do the same for the body of `repMax`. In
 | |
| the following diagram, to avoid drawing a noisy amount of
 | |
| crossing lines, I marked the application of `fst` with
 | |
| a star, and replaced the two edges to `fst` with
 | |
| edges to similar looking stars. This is merely
 | |
| a visual trick; an edge leading to a little star is
 | |
| actually an edge leading to `fst`. Take a look:
 | |
| 
 | |
| {{< figure src="repmax_3.png" caption="The second step of reducing `doRepMax [1,2]`." class="medium" >}}
 | |
| 
 | |
| Since `(,)` is a constructor, let's say that it doesn't
 | |
| need to be evaluated, and that its
 | |
| {{< sidenote "right" "normal-note" "graph cannot be reduced further" >}}
 | |
| A graph that can't be reduced further is said to be in <em>normal form</em>,
 | |
| by the way.
 | |
| {{< /sidenote >}} (in practice, other things like
 | |
| packing may occur here, but they are irrelevant to us).
 | |
| If `(,)` can't be reduced, we can move on to evaluating `snd`. Given a pair, `snd`
 | |
| simply returns the second element, which in our
 | |
| case is the subgraph starting at `(:)`. We
 | |
| thus replace the application of `snd` with an
 | |
| indirection to this subgraph. This leaves us
 | |
| with the following:
 | |
| 
 | |
| {{< figure src="repmax_4.png" caption="The third step of reducing `doRepMax [1,2]`." class="medium" >}}
 | |
| 
 | |
| Since it's becoming hard to keep track of what part of the graph
 | |
| is actually being evaluated, I marked the former root of `doRepMax [1,2]` with
 | |
| a blue star. If our original expression occured at the top level,
 | |
| the graph reduction would probably stop here.  After all,
 | |
| we're evaluating our graphs using call-by-need, and there
 | |
| doesn't seem to be a need for knowing the what the arguments of `(:)` are.
 | |
| However, stopping at `(:)` wouldn't be very interesting,
 | |
| and we wouldn't learn much from doing so. So instead, let's assume
 | |
| that _something_ is trying to read the elements of our list;
 | |
| perhaps we are trying to print this list to the screen in GHCi.
 | |
| 
 | |
| In this case, our mysterious external force starts unpacking and
 | |
| inspecting the arguments to `(:)`. The first argument to `(:)` is
 | |
| the list's head, which is the subgraph starting with the starred application
 | |
| of `fst`. We evaluate it in a similar manner to `snd`. That is,
 | |
| we replace this `fst` with an indirection to the first element
 | |
| of the argument tuple, which happens to be the subgraph starting with `max`:
 | |
| 
 | |
| {{< figure src="repmax_5.png" caption="The fourth step of reducing `doRepMax [1,2]`." class="medium" >}}
 | |
| 
 | |
| Phew! Next, we need to evaluate the body of `max`. Let's make one more
 | |
| simplification here: rather than substitututing `max` for its body
 | |
| here, let's just reason about what evaluating `max` would entail.
 | |
| We would need to evaluate its two arguments, compare them,
 | |
| and return the larger one. The argument `1` can't be reduced
 | |
| any more (it's just a number!), but the second argument,
 | |
| a call to `fst`, needs to be processed. To do so, we need to
 | |
| evaluate the call to `repMax`. We thus replace `repMax`
 | |
| with its body:
 | |
| 
 | |
| {{< figure src="repmax_6.png" caption="The fifth step of reducing `doRepMax [1,2]`." class="medium" >}}
 | |
| 
 | |
| We've reached one of the base cases here, and there
 | |
| are no more calls to `max` or `repMax`. The whole reason
 | |
| we're here is to evaluate the call to `fst` that's one
 | |
| of the arguments to `max`. Given this graph, doing so is easy.
 | |
| We can clearly see that `2` is the first element of the tuple
 | |
| returned by `repMax [2]`. We thus replace `fst` with
 | |
| an indirection to this node:
 | |
| 
 | |
| {{< figure src="repmax_7.png" caption="The sixth step of reducing `doRepMax [1,2]`." class="medium" >}}
 | |
| 
 | |
| This concludes our task of evaluating the arguments to `max`.
 | |
| Comparing them, we see that `2` is greater than `1`, and thus,
 | |
| we replace `max` with an indirection to `2`:
 | |
| 
 | |
| {{< figure src="repmax_8.png" caption="The seventh step of reducing `doRepMax [1,2]`." class="medium" >}}
 | |
| 
 | |
| The node that we starred in our graph is now an indirection (the
 | |
| one that used to be the call to `fst`) which points to
 | |
| another indirection (formerly the call to `max`), which
 | |
| points to `2`. Thus, any edge pointing to a star now
 | |
| points to the value 2. 
 | |
| 
 | |
| By finding the value of the starred node, we have found the first
 | |
| argument of `(:)`, and returned it to our mysterious external force.
 | |
| If we were printing to GHCi, the number `2` would appear on the screen
 | |
| right about now. The force then moves on to the second argument of `(:)`,
 | |
| which is the call to `snd`. This `snd` is applied to an instance of `(,)`, which
 | |
| can't be reduced any further. Thus, all we have to do is take the second
 | |
| element of the tuple, and replace `snd` with an indirection to it:
 | |
| 
 | |
| {{< figure src="repmax_9.png" caption="The eighth step of reducing `doRepMax [1,2]`." class="medium" >}}
 | |
| 
 | |
| The second element of the tuple was a call to `(:)`, and that's what the mysterious
 | |
| force is processing now. Just like it did before, it starts by looking at the
 | |
| first argument of this list, which is head. This argument is a reference to
 | |
| the starred node, which, as we've established, eventually points to `2`.
 | |
| Another `2` pops up on the console.
 | |
| 
 | |
| Finally, the mysterious force reaches the second argument of the `(:)`,
 | |
| which is the empty list. The empty list also cannot be evaluated any
 | |
| further, so that's what the mysterious force receives. Just like that,
 | |
| there's nothing left to print to the console. The mysterious force ceases.
 | |
| After removing the unused nodes, we are left with the following graph:
 | |
| 
 | |
| {{< figure src="repmax_10.png" caption="The result of reducing `doRepMax [1,2]`." >}}
 | |
| 
 | |
| As we would have expected, two `2`s are printed to the console.
 | |
| 
 | |
| ### Using Time Traveling
 | |
| 
 | |
| {{< todo >}}This whole section {{< /todo >}}
 | |
| 
 | |
| ### Beware The Strictness
 | |
| 
 | |
| {{< todo >}}This whole section, too. {{< /todo >}}
 | |
| 
 | |
| ### Leftovers
 | |
| 
 | |
| This is
 | |
| what allows us to write the code above: the graph of `repMax xs largest`
 | |
| effectively refers to itself. While traversing the list, it places references
 | |
| to itself in place of each of the elements, and thanks to laziness, these
 | |
| references are not evaluated.
 | |
| 
 | |
| Let's try a more complicated example. How about instead of creating a new list,
 | |
| we return a `Map` containing the number of times each number occured, but only
 | |
| when those numbers were a factor of the maximum numbers. Our expected output
 | |
| will be:
 | |
| 
 | |
| ```
 | |
| >>> countMaxFactors [1,3,3,9]
 | |
| 
 | |
| fromList [(1, 1), (3, 2), (9, 1)]
 | |
| ```
 | |
| 
 |