Update lazy evaluation post with images and more.
| @ -7,6 +7,7 @@ draft: true | ||||
| 
 | ||||
| <style> | ||||
| img, figure.small img { max-height: 20rem; } | ||||
| figure.tiny img { max-height: 15rem; } | ||||
| figure.medium img { max-height: 30rem; } | ||||
| </style> | ||||
| 
 | ||||
| @ -15,8 +16,7 @@ I recently got to use a very curious Haskell technique | ||||
| As production as research code gets, anyway! | ||||
| {{< /sidenote >}} time traveling. I say this with | ||||
| the utmost seriousness. This technique worked like | ||||
| magic for the problem I was trying to solve (which isn't | ||||
| interesting enough to be presented here in itself), and so | ||||
| magic for the problem I was trying to solve, and so | ||||
| I thought I'd share what I learned. In addition | ||||
| to the technique and its workings, I will also explain how  | ||||
| time traveling can be misused, yielding computations that | ||||
| @ -74,7 +74,7 @@ value even come from? | ||||
| 
 | ||||
| Thus far, nothing too magical has happened. It's a little | ||||
| strange to expect the result of the computation to be | ||||
| given to us; however, thus far, it looks like wishful | ||||
| given to us; it just looks like wishful | ||||
| thinking. The real magic happens in Csongor's `doRepMax` | ||||
| function: | ||||
| 
 | ||||
| @ -100,8 +100,9 @@ Why is it called graph reduction, you may be wondering, if the runtime is | ||||
| manipulating syntax trees? To save on work, if a program refers to the | ||||
| same value twice, Haskell has both of those references point to the | ||||
| exact same graph. This violates the tree's property of having only one path | ||||
| from the root to any node, and makes our program a graph. Graphs that | ||||
| refer to themselves also violate the properties of a tree. | ||||
| from the root to any node, and makes our program a DAG (at least). Graph nodes that | ||||
| refer to themselves (which are possible in the model) also violate the properties of | ||||
| a DAG, and thus, in general, we are working with graphs. | ||||
| {{< /sidenote >}} performing | ||||
| substitutions and simplifications as necessary until it reaches a final answer. | ||||
| What the lazy part means is that parts of the syntax tree that are not yet | ||||
| @ -184,7 +185,7 @@ we end up with the following: | ||||
| 
 | ||||
| {{< figure src="square_2.png" caption="The graph of `let x = square 5 in x + x` after `square 5` is reduced." >}} | ||||
| 
 | ||||
| There are two `25`s in the tree, and no more `square`s! We only | ||||
| There are two `25`s in the graph, and no more `square`s! We only | ||||
| had to evaluate `square 5` exactly once, even though `(+)` | ||||
| will use it twice (once for the left argument, and once for the right). | ||||
| 
 | ||||
| @ -207,7 +208,7 @@ fix f = let x = f x in x | ||||
| See how the definition of `x` refers to itself? This is what | ||||
| it looks like in graph form: | ||||
| 
 | ||||
| {{< figure src="fixpoint_1.png" caption="The initial graph of `let x = f x in x`." >}} | ||||
| {{< figure src="fixpoint_1.png" caption="The initial graph of `let x = f x in x`." class="tiny" >}} | ||||
| 
 | ||||
| I think it's useful to take a look at how this graph is processed. Let's | ||||
| pick `f = (1:)`. That is, `f` is a function that takes a list, | ||||
| @ -221,7 +222,8 @@ constant `1`, and then to `f`'s argument (`x`, in this case). As | ||||
| before, once we evaluated `f x`, we replaced the application with | ||||
| an indirection; in the image, this indirection is the top box. But the | ||||
| argument, `x`, is itself an indirection which points to the root of `f x`, | ||||
| thereby creating a cycle in our graph.  | ||||
| thereby creating a cycle in our graph. Traversing this graph looks like | ||||
| traversing an infinite list of `1`s. | ||||
| 
 | ||||
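The cyclic graph above can be observed directly from ordinary Haskell. This is a minimal sketch; the names `ones` and `firstThree` are made up for illustration and do not appear in the post's code. Only `fix` from `Data.Function` is a real library function.

```haskell
import Data.Function (fix)

-- `fix (1:)` builds a single cons cell whose tail points back at itself.
ones :: [Int]
ones = fix (1:)

-- Traversing the cycle looks like walking an infinite list of 1s;
-- laziness means we only follow the back edge as many times as we ask.
firstThree :: [Int]
firstThree = take 3 ones
```

Evaluating `firstThree` in GHCi yields `[1,1,1]`, even though `ones` has no end.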
| Almost there! A node can refer to itself, and, when evaluated, it | ||||
| is replaced with its own value. Thus, a node can effectively reference | ||||
| @ -259,18 +261,16 @@ Now, let's write the initial graph for `doRepMax [1,2]`: | ||||
| {{< figure src="repmax_1.png" caption="The initial graph of `doRepMax [1,2]`." >}} | ||||
| 
 | ||||
| Other than our new notation, there's nothing too surprising here. | ||||
| At a high level, all we want is the second element of the tuple | ||||
| The first step of our hypothetical reduction would replace the application of `doRepMax` with its | ||||
| body, and create our graph's first cycle. At a high level, all we want is the second element of the tuple | ||||
| returned by `repMax`, which contains the output list. To get | ||||
| the tuple, we apply `repMax` to the list `[1,2]`, which itself | ||||
| the tuple, we apply `repMax` to the list `[1,2]` and the first element | ||||
| of its result. The list `[1,2]` itself | ||||
| consists of two uses of the `(:)` function. | ||||
| 
 | ||||
| The first step | ||||
| of our hypothetical reduction would replace the application of `doRepMax` with its | ||||
| body, and create our graph's first cycle: | ||||
| 
 | ||||
| {{< figure src="repmax_2.png" caption="The first step of reducing `doRepMax [1,2]`." >}} | ||||
| 
 | ||||
| Next, we would do the same for the body of `repMax`. In | ||||
| Next, we would also expand the body of `repMax`. In | ||||
| the following diagram, to avoid drawing a noisy amount of | ||||
| crossing lines, I marked the application of `fst` with | ||||
| a star, and replaced the two edges to `fst` with | ||||
| @ -362,7 +362,7 @@ element of the tuple, and replace `snd` with an indirection to it: | ||||
| 
 | ||||
| The second element of the tuple was a call to `(:)`, and that's what the mysterious | ||||
| force is processing now. Just like it did before, it starts by looking at the | ||||
| first argument of this list, which is head. This argument is a reference to | ||||
| first argument of this list, which is the list's head. This argument is a reference to | ||||
| the starred node, which, as we've established, eventually points to `2`. | ||||
| Another `2` pops up on the console. | ||||
| 
 | ||||
| @ -374,32 +374,197 @@ After removing the unused nodes, we are left with the following graph: | ||||
| 
 | ||||
| {{< figure src="repmax_10.png" caption="The result of reducing `doRepMax [1,2]`." >}} | ||||
| 
 | ||||
| As we would have expected, two `2`s are printed to the console. | ||||
| As we would have expected, two `2`s were printed to the console, and our | ||||
| final graph represents the list `[2,2]`. | ||||
| 
 | ||||
| ### Using Time Traveling | ||||
| Is time traveling even useful? I would argue yes, especially | ||||
| in cases where Haskell's purity can make certain things | ||||
| difficult. | ||||
| 
 | ||||
| {{< todo >}}This whole section {{< /todo >}} | ||||
| As a first example, Csongor provides an assembler that works | ||||
| in a single pass. The challenge in this case is to resolve | ||||
| jumps to code segments occurring _after_ the jump itself; | ||||
| in essence, the address of the target code segment needs to be | ||||
| known before the segment itself is processed. Csongor's | ||||
| code uses the [Tardis monad](https://hackage.haskell.org/package/tardis-0.4.1.0/docs/Control-Monad-Tardis.html), | ||||
| which combines regular state, to which you can write and then | ||||
| later read from, and future state, from which you can | ||||
| read values before you write them. Check out | ||||
| [his complete example](https://kcsongor.github.io/time-travel-in-haskell-for-dummies/#a-single-pass-assembler-an-example). | ||||
| 
 | ||||
| Alternatively, here's an example from my research. I'll be fairly | ||||
| vague, since all of this is still in progress. The gist is that | ||||
| we have some kind of data structure (say, a list or a tree), | ||||
| and we want to associate with each element in this data | ||||
| structure a 'score' of how useful it is. There are many possible | ||||
| heuristics for picking 'scores'; a very simple one is  | ||||
| to make the score inversely proportional to the number of times | ||||
| an element occurs. To be more concrete, suppose | ||||
| we have some element type `Element`: | ||||
| 
 | ||||
| {{< codelines "Haskell" "time-traveling/ValueScore.hs" 5 6 >}} | ||||
| 
 | ||||
| Suppose also that our data structure is a binary tree: | ||||
| 
 | ||||
| {{< codelines "Haskell" "time-traveling/ValueScore.hs" 14 16 >}} | ||||
| 
 | ||||
| We then want to transform an input `ElementTree`, such as: | ||||
| 
 | ||||
| ```Haskell | ||||
| Node A (Node A Empty Empty) Empty | ||||
| ``` | ||||
| 
 | ||||
| Into a scored tree, like: | ||||
| 
 | ||||
| ```Haskell | ||||
| Node (A,0.5) (Node (A,0.5) Empty Empty) Empty | ||||
| ``` | ||||
| 
 | ||||
| Since `A` occurred twice, its score is `1/2 = 0.5`.  | ||||
| 
 | ||||
| Let's define some utility functions before we get to the | ||||
| meat of the implementation: | ||||
| 
 | ||||
| {{< codelines "Haskell" "time-traveling/ValueScore.hs" 8 12 >}} | ||||
| 
 | ||||
| The `addElement` function simply increments the counter for a particular | ||||
| element in the map, adding the number `1` if it doesn't exist. The `getScore` | ||||
| function computes the score of a particular element, defaulting to `1.0` if | ||||
| it's not found in the map. | ||||
| 
 | ||||
| Just as before -- noticing that passing around the future values is getting awfully | ||||
| bothersome -- we write our scoring function as though we have | ||||
| a 'future value'. | ||||
| 
 | ||||
| {{< codelines "Haskell" "time-traveling/ValueScore.hs" 18 24 >}} | ||||
| 
 | ||||
| The actual `doAssignScores` function is pretty much identical to | ||||
| `doRepMax`: | ||||
| 
 | ||||
| {{< codelines "Haskell" "time-traveling/ValueScore.hs" 26 28 >}} | ||||
| 
 | ||||
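Since the contents of `ValueScore.hs` are not reproduced inline, here is a self-contained sketch of how the pieces described above might fit together. It is an assumption-laden reconstruction, not the file itself: the constructors of `Element`, the `BinTree` type, the use of `Data.Map` (from the `containers` package that ships with GHC), and the exact shape of `assignScores` are all guesses that merely match the prose and the example trees.

```haskell
import qualified Data.Map as Map

-- Hypothetical element type; the real one is defined in ValueScore.hs.
data Element = A | B | C | D deriving (Eq, Ord, Show)

-- Hypothetical tree type, parameterized so it can hold scored elements too.
data BinTree a = Empty | Node a (BinTree a) (BinTree a) deriving (Eq, Show)
type ElementTree = BinTree Element

-- Increment the counter for an element, starting at 1 if it is absent.
addElement :: Element -> Map.Map Element Int -> Map.Map Element Int
addElement e = Map.insertWith (+) e 1

-- Score is 1 / count, defaulting to 1.0 for elements not in the map.
getScore :: Element -> Map.Map Element Int -> Float
getScore e m = maybe 1.0 ((1.0 /) . fromIntegral) (Map.lookup e m)

-- Takes the 'future' map of final counts, returns the counts for this
-- subtree along with the scored subtree. The future map is never inspected
-- while the result is being constructed; it only ends up inside thunks.
assignScores :: ElementTree -> Map.Map Element Int
             -> (Map.Map Element Int, BinTree (Element, Float))
assignScores Empty _ = (Map.empty, Empty)
assignScores (Node e l r) future =
    (addElement e counts, Node (e, getScore e future) l' r')
  where
    (lCounts, l') = assignScores l future
    (rCounts, r') = assignScores r future
    counts = Map.unionWith (+) lCounts rCounts

-- Tie the knot: feed the final counts back in as the 'future' argument.
doAssignScores :: ElementTree -> BinTree (Element, Float)
doAssignScores t = scored
  where (counts, scored) = assignScores t counts
```

With these definitions, `doAssignScores (Node A (Node A Empty Empty) Empty)` evaluates to `Node (A,0.5) (Node (A,0.5) Empty Empty) Empty`. This works because computing `counts` only depends on the shape and contents of the input tree, never on the scores.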
| There's quite a bit of repetition here, especially in the handling | ||||
| of future values - all of our functions now accept an extra | ||||
| future argument, and return a work-in-progress future value. | ||||
| This is what the `Tardis` monad, and its corresponding | ||||
| `TardisT` monad transformer, aim to address. Just like the | ||||
| `State` monad helps us avoid writing plumbing code for | ||||
| forward-traveling values, `Tardis` helps us do the same | ||||
| for backward-traveling ones. | ||||
| 
 | ||||
| #### Cycles in Monadic Bind | ||||
| 
 | ||||
| We've seen that we're able to write code like the following: | ||||
| 
 | ||||
| ```Haskell | ||||
| (a, b) = f a c | ||||
| ``` | ||||
| 
 | ||||
| That is, we were able to write function calls that referenced | ||||
| their own return values. What if we try doing this inside | ||||
| a `do` block? Say, for example, we want to sprinkle some time | ||||
| traveling into our program, but don't want to add a whole new | ||||
| transformer into our monad stack. We could write code as follows: | ||||
| 
 | ||||
| ```Haskell | ||||
| do | ||||
|     (a, b) <- f a c | ||||
|     return b | ||||
| ``` | ||||
| 
 | ||||
| Unfortunately, this doesn't work: in an ordinary `do` block, `a` is | ||||
| not in scope on the right-hand side of its own binding. However, it's entirely | ||||
| possible to enable this using the `RecursiveDo` language | ||||
| extension: | ||||
| 
 | ||||
| ```Haskell | ||||
| {-# LANGUAGE RecursiveDo #-} | ||||
| ``` | ||||
| 
 | ||||
| Then, we can write the above as follows: | ||||
| 
 | ||||
| ```Haskell | ||||
| do | ||||
|     rec (a, b) <- f a c | ||||
|     return b | ||||
| ```  | ||||
| 
 | ||||
| This power, however, comes at a price. It's not as straightforward | ||||
| to build graphs from recursive monadic computations; in fact, | ||||
| it's not possible in general. The translation of the above | ||||
| code uses `MonadFix`. A monad that satisfies `MonadFix` has | ||||
| an operation `mfix`, which is the monadic version of the `fix` | ||||
| function we saw earlier: | ||||
| 
 | ||||
| ```Haskell | ||||
| mfix :: MonadFix m => (a -> m a) -> m a | ||||
| -- Regular fix, for comparison | ||||
| fix :: (a -> a) -> a | ||||
| ``` | ||||
| 
 | ||||
| To really understand how the translation works, check out the | ||||
| [paper on recursive do notation](http://leventerkok.github.io/papers/recdo.pdf). | ||||
| 
 | ||||
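To make the `rec` syntax concrete, here is a small sketch that reruns the `repMax` idea inside a monad. The choice of `Identity` is an assumption made purely to keep the example dependency-free (it has a `MonadFix` instance in `base`); the names `repMaxM` and `doRepMaxM` are hypothetical and do not come from the post or from Csongor's article.

```haskell
{-# LANGUAGE RecursiveDo #-}

import Data.Functor.Identity (Identity, runIdentity)

-- Monadic variant of repMax: returns the maximum it saw, plus a list
-- in which every element is replaced by the (future) replacement value.
repMaxM :: [Int] -> Int -> Identity (Int, [Int])
repMaxM [] rep = return (rep, [])
repMaxM [x] rep = return (x, [rep])
repMaxM (l : ls) rep = do
    (m, ls') <- repMaxM ls rep
    return (max m l, rep : ls')

-- `rec` lets `largest` appear on the right-hand side of its own binding;
-- GHC desugars the block into a call to `mfix`.
doRepMaxM :: [Int] -> Identity [Int]
doRepMaxM xs = do
    rec (largest, xs') <- repMaxM xs largest
    return xs'
```

Running `runIdentity (doRepMaxM [1,2])` gives `[2,2]`, the same result as the pure version. Other monads may behave differently: whether a `rec` block terminates depends on how lazy the monad's `mfix` and `>>=` are.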
| ### Beware The Strictness | ||||
| Though Csongor points out other problems with the | ||||
| time traveling approach, I think he doesn't mention | ||||
| an important idea: you have to be _very_ careful about introducing | ||||
| strictness into your programs when running time-traveling code. | ||||
| For example, suppose we wanted to write a function, | ||||
| `takeUntilMax`, which would return the input list, | ||||
| cut off after the first occurrence of the maximum number. | ||||
| Following the same strategy, we come up with: | ||||
| 
 | ||||
| {{< todo >}}This whole section, too. {{< /todo >}} | ||||
| {{< codelines "Haskell" "time-traveling/TakeMax.hs" 1 12 >}} | ||||
| 
 | ||||
| ### Leftovers | ||||
| In short, if we encounter our maximum number, we just return | ||||
| a list of that maximum number, since we do not want to recurse | ||||
| further. On the other hand, if we encounter a number that's | ||||
| _not_ the maximum, we continue our recursion. | ||||
| 
 | ||||
| This is | ||||
| what allows us to write the code above: the graph of `repMax xs largest` | ||||
| effectively refers to itself. While traversing the list, it places references | ||||
| to itself in place of each of the elements, and thanks to laziness, these | ||||
| references are not evaluated. | ||||
| Unfortunately, this doesn't work; our program never terminates. | ||||
| You may be thinking: | ||||
| 
 | ||||
| Let's try a more complicated example. How about instead of creating a new list, | ||||
| we return a `Map` containing the number of times each number occured, but only | ||||
| when those numbers were a factor of the maximum numbers. Our expected output | ||||
| will be: | ||||
| > Well, obviously this doesn't work! We didn't actually | ||||
| compute the maximum number properly, since we stopped | ||||
| recursing too early. We need to traverse the whole list, | ||||
| and not just the part before the maximum number. | ||||
| 
 | ||||
| ``` | ||||
| >>> countMaxFactors [1,3,3,9] | ||||
| To address this, we can reformulate our `takeUntilMax` | ||||
| function as follows: | ||||
| 
 | ||||
| fromList [(1, 1), (3, 2), (9, 1)] | ||||
| ``` | ||||
| {{< codelines "Haskell" "time-traveling/TakeMax.hs" 14 21 >}} | ||||
| 
 | ||||
| Now we definitely compute the maximum correctly! Alas, | ||||
| this doesn't work either. The issue lies on lines 5 and 18, | ||||
| more specifically in the comparison `x == m`. Here, we  | ||||
| are trying to base the decision of what branch to take | ||||
| on a future value. This is simply impossible; to compute | ||||
| the value, we need to know the value! | ||||
| 
 | ||||
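For contrast, here is a sketch of the same function written without time traveling. This is not one of the versions in `TakeMax.hs`; it swaps the technique for a plain two-pass approach, and the name `takeUntilMaxTwoPass` is made up for illustration. Because the maximum is fully computed before any comparison happens, `y == m` compares against an ordinary value rather than a future one.

```haskell
-- Two passes: `maximum` walks the whole list first, then `go` walks it
-- again and stops right after the first occurrence of the maximum.
takeUntilMaxTwoPass :: [Int] -> [Int]
takeUntilMaxTwoPass [] = []
takeUntilMaxTwoPass xs = go xs
  where
    m = maximum xs
    go [] = []
    go (y : ys)
        | y == m = [y]
        | otherwise = y : go ys
```

For example, `takeUntilMaxTwoPass [1,3,2,3,0]` evaluates to `[1,3]`. The cost is a second traversal, which is exactly what the time-traveling formulation was trying to avoid.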
| This is no 'silly mistake', either! In complicated programs | ||||
| that use time traveling, strictness lurks behind every corner. | ||||
| In my research work, I was at one point inserting a data structure into | ||||
| a set; however, deep in the structure was a data type containing | ||||
| a 'future' value, and using the default `Eq` instance! | ||||
| Adding the data structure to a set ended up invoking `(==)` (or perhaps | ||||
| some function from the `Ord` typeclass), | ||||
| which, in turn, tried to compare the lazily evaluated values. | ||||
| My code therefore didn't terminate, much like `takeUntilMax`. | ||||
| 
 | ||||
| Debugging time traveling code is, in general, | ||||
| a pain. This is especially true since future values don't look any different | ||||
| from regular values. You can see it in the type signatures | ||||
| of `repMax` and `takeUntilMax`: the maximum number is just an `Int`! | ||||
| And yet, trying to see what its value is will kill the entire program. | ||||
| As always, remember Brian W. Kernighan's wise words: | ||||
| 
 | ||||
| > Debugging is twice as hard as writing the code in the first place. | ||||
| Therefore, if you write the code as cleverly as possible, you are, | ||||
| by definition, not smart enough to debug it. | ||||
| 
 | ||||
| ### Conclusion | ||||
| This is about it! In a way, time traveling can make code performing | ||||
| certain operations more expressive. Furthermore, even if it's not groundbreaking, | ||||
| thinking about time traveling is a good exercise to get familiar | ||||
| with lazy evaluation in general. I hope you found this useful! | ||||
|  | ||||