Update lazy evaluation post with images and more.

2020-07-30 00:49:35 -07:00 · 2020-07-30 00:49:35 -07:00 · 58e6ad9e79
commit 58e6ad9e79
parent 3aa2a6783e
5 changed files with 198 additions and 33 deletions
--- a/content/blog/haskell_lazy_evaluation/fixpoint_1.png
+++ b/content/blog/haskell_lazy_evaluation/fixpoint_1.png
--- a/content/blog/haskell_lazy_evaluation/fixpoint_2.png
+++ b/content/blog/haskell_lazy_evaluation/fixpoint_2.png
--- a/content/blog/haskell_lazy_evaluation/index.md
+++ b/content/blog/haskell_lazy_evaluation/index.md
@ -7,6 +7,7 @@ draft: true
 <style>
 img, figure.small img { max-height: 20rem; }
 figure.tiny img { max-height: 15rem; }
 figure.medium img { max-height: 30rem; }
 </style>
@ -15,8 +16,7 @@ I recently got to use a very curious Haskell technique
 As production as research code gets, anyway!
 {{< /sidenote >}} time traveling. I say this with
 the utmost seriousness. This technique worked like
-magic for the problem I was trying to solve (which isn't
+magic for the problem I was trying to solve, and so
 interesting enough to be presented here in itself), and so
 I thought I'd share what I learned. In addition
 to the technique and its workings, I will also explain how 
 time traveling can be misused, yielding computations that
@ -74,7 +74,7 @@ value even come from?
 Thus far, nothing too magical has happened. It's a little
 strange to expect the result of the computation to be
-given to us; however, thus far, it looks like wishful
+given to us; it just looks like wishful
 thinking. The real magic happens in Csongor's `doRepMax`
 function:
@ -100,8 +100,9 @@ Why is it called graph reduction, you may be wondering, if the runtime is
 manipulating syntax trees? To save on work, if a program refers to the
 same value twice, Haskell has both of those references point to the
 exact same graph. This violates the tree's property of having only one path
-from the root to any node, and makes our program a graph. Graphs that
+from the root to any node, and makes our program a DAG (at least). Graph nodes that
-refer to themselves also violate the properties of a tree.
+refer to themselves (which are also possible in the model) also violate the properties of
 a DAG, and thus, in general, we are working with graphs.
 {{< /sidenote >}} performing
 substitutions and simplifications as necessary until it reaches a final answer.
 What the lazy part means is that parts of the syntax tree that are not yet
@ -184,7 +185,7 @@ we end up with the following:
 {{< figure src="square_2.png" caption="The graph of `let x = square 5 in x + x` after `square 5` is reduced." >}}
-There are two `25`s in the tree, and no more `square`s! We only
+There are two `25`s in the graph, and no more `square`s! We only
 had to evaluate `square 5` exactly once, even though `(+)`
 will use it twice (once for the left argument, and once for the right).
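 This sharing can be observed directly. The snippet below is a minimal sketch rather than
 code from the post: the `trace` call and the name `sharedSum` are assumptions added for
 illustration, and `square` is assumed to be plain integer squaring.
 ```Haskell
 import Debug.Trace (trace)
 -- Assumed definition of square; trace reports each time the body is actually evaluated.
 square :: Int -> Int
 square n = trace "square evaluated" (n * n)
 -- Both operands of (+) refer to the same graph node for x.
 sharedSum :: Int
 sharedSum = let x = square 5 in x + x
 ```
 Evaluating `sharedSum` in GHCi should report the trace message only once, which is
 consistent with `square 5` being reduced a single time and then shared.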
@ -207,7 +208,7 @@ fix f = let x = f x in x
 See how the definition of `x` refers to itself? This is what
 it looks like in graph form:
-{{< figure src="fixpoint_1.png" caption="The initial graph of `let x = f x in x`." >}}
+{{< figure src="fixpoint_1.png" caption="The initial graph of `let x = f x in x`." class="tiny" >}}
 I think it's useful to take a look at how this graph is processed. Let's
 pick `f = (1:)`. That is, `f` is a function that takes a list,
@ -221,7 +222,8 @@ constant `1`, and then to `f`'s argument (`x`, in this case). As
 before, once we evaluated `f x`, we replaced the application with
 an indirection; in the image, this indirection is the top box. But the
 argument, `x`, is itself an indirection which points to the root of `f x`,
-thereby creating a cycle in our graph. 
+thereby creating a cycle in our graph. Traversing this graph looks like
 traversing an infinite list of `1`s.
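 The cyclic graph can be checked with a couple of lines. This is only a sketch: it uses
 `fix` from `Data.Function`, and the names `ones` and `firstThree` are made up for the
 example.
 ```Haskell
 import Data.Function (fix)
 -- The cyclic graph from the figure: a single (:) cell whose tail points back at itself.
 ones :: [Int]
 ones = fix (1:)
 -- Only the requested prefix of the cycle is ever traversed.
 firstThree :: [Int]
 firstThree = take 3 ones
 ```
 Asking for `firstThree` walks the cycle three times and stops, so the program terminates
 even though `ones` has no end.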
 Almost there! A node can refer to itself, and, when evaluated, it
 is replaced with its own value. Thus, a node can effectively reference
@ -259,18 +261,16 @@ Now, let's write the initial graph for `doRepMax [1,2]`:
 {{< figure src="repmax_1.png" caption="The initial graph of `doRepMax [1,2]`." >}}
 Other than our new notation, there's nothing too surprising here.
-At a high level, all we want is the second element of the tuple
+The first step of our hypothetical reduction would replace the application of `doRepMax` with its
 body, and create our graph's first cycle. At a high level, all we want is the second element of the tuple
 returned by `repMax`, which contains the output list. To get
-the tuple, we apply `repMax` to the list `[1,2]`, which itself
+the tuple, we apply `repMax` to the list `[1,2]` and the first element
 of its result. The list `[1,2]` itself
 consists of two uses of the `(:)` function.
 The first step
 of our hypothetical reduction would replace the application of `doRepMax` with its
 body, and create our graph's first cycle:
 {{< figure src="repmax_2.png" caption="The first step of reducing `doRepMax [1,2]`." >}}
-Next, we would do the same for the body of `repMax`. In
+Next, we would also expand the body of `repMax`. In
 the following diagram, to avoid drawing a noisy amount of
 crossing lines, I marked the application of `fst` with
 a star, and replaced the two edges to `fst` with
@ -362,7 +362,7 @@ element of the tuple, and replace `snd` with an indirection to it:
 The second element of the tuple was a call to `(:)`, and that's what the mysterious
 force is processing now. Just like it did before, it starts by looking at the
-first argument of this list, which is head. This argument is a reference to
+first argument of this list, which is the list's head. This argument is a reference to
 the starred node, which, as we've established, eventually points to `2`.
 Another `2` pops up on the console.
@ -374,32 +374,197 @@ After removing the unused nodes, we are left with the following graph:
 {{< figure src="repmax_10.png" caption="The result of reducing `doRepMax [1,2]`." >}}
-As we would have expected, two `2`s are printed to the console.
+As we would have expected, two `2`s were printed to the console, and our
 final graph represents the list `[2,2]`.
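 For reference, the whole reduction can be reproduced with a few lines of code. This is
 a sketch written from the descriptions in this post, not a copy of Csongor's code, so
 the exact clauses (in particular the empty-list case) are assumptions.
 ```Haskell
 -- Returns the largest element seen, and the input with every element replaced by rep.
 repMax :: [Int] -> Int -> (Int, [Int])
 repMax [] rep = (rep, []) -- assumption: with nothing to compare, pass rep through
 repMax [x] rep = (x, [rep])
 repMax (x:xs) rep = (max m x, rep : xs')
   where (m, xs') = repMax xs rep
 -- Ties the knot: the first component of the result is fed back in as rep.
 doRepMax :: [Int] -> [Int]
 doRepMax xs = xs'
   where (largest, xs') = repMax xs largest
 ```
 Note that `repMax` only ever stores `rep` in the output list and never inspects it,
 which is what lets the cycle resolve.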
 ### Using Time Traveling
 Is time traveling even useful? I would argue yes, especially
 in cases where Haskell's purity can make certain things
 difficult.
-{{< todo >}}This whole section {{< /todo >}}
+As a first example, Csongor provides an assembler that works
 in a single pass. The challenge in this case is to resolve
 jumps to code segments occurring _after_ the jump itself;
 in essence, the address of the target code segment needs to be
 known before the segment itself is processed. Csongor's
 code uses the [Tardis monad](https://hackage.haskell.org/package/tardis-0.4.1.0/docs/Control-Monad-Tardis.html),
 which combines regular state, to which you can write and then
 later read from, and future state, from which you can
 read values before you write them. Check out
 [his complete example](https://kcsongor.github.io/time-travel-in-haskell-for-dummies/#a-single-pass-assembler-an-example).
 Alternatively, here's an example from my research. I'll be fairly
 vague, since all of this is still in progress. The gist is that
 we have some kind of data structure (say, a list or a tree),
 and we want to associate with each element in this data
 structure a 'score' of how useful it is. There are many possible
 heuristics for picking 'scores'; a very simple one is
 to make it inversely proportional to the number of times
 an element occurs. To be more concrete, suppose
 we have some element type `Element`:
 {{< codelines "Haskell" "time-traveling/ValueScore.hs" 5 6 >}}
 Suppose also that our data structure is a binary tree:
 {{< codelines "Haskell" "time-traveling/ValueScore.hs" 14 16 >}}
 We then want to transform an input `ElementTree`, such as:
 ```Haskell
 Node A (Node A Empty Empty) Empty
 ```
 Into a scored tree, like:
 ```Haskell
 Node (A,0.5) (Node (A,0.5) Empty Empty) Empty
 ```
 Since `A` occurred twice, its score is `1/2 = 0.5`.
 Let's define some utility functions before we get to the
 meat of the implementation:
 {{< codelines "Haskell" "time-traveling/ValueScore.hs" 8 12 >}}
 The `addElement` function simply increments the counter for a particular
 element in the map, adding the number `1` if it doesn't exist. The `getScore`
 function computes the score of a particular element, defaulting to `1.0` if
 it's not found in the map.
 Just as before -- noticing that passing around the future values is getting awfully
 bothersome -- we write our scoring function as though we have
 a 'future value'.
 {{< codelines "Haskell" "time-traveling/ValueScore.hs" 18 24 >}}
 The actual `doAssignScores` function is pretty much identical to
 `doRepMax`:
 {{< codelines "Haskell" "time-traveling/ValueScore.hs" 26 28 >}}
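 Since `ValueScore.hs` itself is not reproduced in this view, here is a self-contained
 sketch of what the pieces described above might look like. Everything in it is an
 assumption reconstructed from the prose: the `BinaryTree` type, the constructors other
 than `A`, the choice of `Double` for scores, and the exact bodies of `addElement`,
 `getScore`, `assignScores`, and `doAssignScores`. It relies on `Data.Map` from the
 `containers` package.
 ```Haskell
 import qualified Data.Map as Map
 data Element = A | B | C | D deriving (Eq, Ord, Show)
 data BinaryTree a = Empty | Node a (BinaryTree a) (BinaryTree a) deriving (Eq, Show)
 type ElementTree = BinaryTree Element
 type ScoredElementTree = BinaryTree (Element, Double)
 -- Count one more occurrence of e, starting from 1 if it is absent.
 addElement :: Element -> Map.Map Element Int -> Map.Map Element Int
 addElement e = Map.insertWith (+) e 1
 -- Score is 1 / occurrences, defaulting to 1.0 for unknown elements.
 getScore :: Element -> Map.Map Element Int -> Double
 getScore e m = maybe 1.0 (\n -> 1 / fromIntegral n) (Map.lookup e m)
 -- future is the final count map; it is captured in thunks but never inspected here.
 assignScores :: ElementTree -> Map.Map Element Int -> (Map.Map Element Int, ScoredElementTree)
 assignScores Empty _ = (Map.empty, Empty)
 assignScores (Node e l r) future = (counts, Node (e, getScore e future) l' r')
   where
     (countsL, l') = assignScores l future
     (countsR, r') = assignScores r future
     counts = addElement e (Map.unionWith (+) countsL countsR)
 -- Same knot-tying shape as doRepMax: the computed counts are fed back in.
 doAssignScores :: ElementTree -> ScoredElementTree
 doAssignScores t = t'
   where (counts, t') = assignScores t counts
 ```
 With these definitions, `doAssignScores (Node A (Node A Empty Empty) Empty)` should
 produce the scored tree shown earlier.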
 There's quite a bit of repetition here, especially in the handling
 of future values - all of our functions now accept an extra
 future argument, and return a work-in-progress future value.
 This is what the `Tardis` monad, and its corresponding
 `TardisT` monad transformer, aim to address. Just like the
 `State` monad helps us avoid writing plumbing code for
 forward-traveling values, `Tardis` helps us do the same
 for backward-traveling ones.
 #### Cycles in Monadic Bind
 We've seen that we're able to write code like the following:
 ```Haskell
 (a, b) = f a c
 ```
 That is, we were able to write function calls that referenced
 their own return values. What if we try doing this inside
 a `do` block? Say, for example, we want to sprinkle some time
 traveling into our program, but don't want to add a whole new
 transformer into our monad stack. We could write code as follows:
 ```Haskell
 do
    (a, b) <- f a c
    return b
 ```
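 Unfortunately, this doesn't work: in an ordinary `do` block, the variables
 bound by a `<-` statement are not in scope on that statement's right-hand side.
 However, it's entirely possible to enable this using the `RecursiveDo` language
 extension: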
 ```Haskell
 {-# LANGUAGE RecursiveDo #-}
 ```
 Then, we can write the above as follows:
 ```Haskell
 do
    rec (a, b) <- f a c
    return b
 ``` 
 This power, however, comes at a price. It's not as straightforward
 to build graphs from recursive monadic computations; in fact,
 it's not possible in general. The translation of the above
 code uses `MonadFix`. A monad that satisfies `MonadFix` has
 an operation `mfix`, which is the monadic version of the `fix`
 function we saw earlier:
 ```Haskell
 mfix :: MonadFix m => (a -> m a) -> m a
 -- Regular fix, for comparison
 fix :: (a -> a) -> a
 ```
 To really understand how the translation works, check out the
 [paper on recursive do notation](http://leventerkok.github.io/papers/recdo.pdf).
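 To make the `rec` syntax concrete, here is a small sketch of `repMax` written in the
 `Maybe` monad, which has a `MonadFix` instance in `base`. The names `repMaxM` and
 `doRepMaxM` are hypothetical, and using `Nothing` for the empty list is an assumption
 made for the example.
 ```Haskell
 {-# LANGUAGE RecursiveDo #-}
 -- Monadic repMax: fails on the empty list, which has no maximum.
 repMaxM :: [Int] -> Int -> Maybe (Int, [Int])
 repMaxM [] _ = Nothing
 repMaxM [x] rep = Just (x, [rep])
 repMaxM (x:xs) rep = do
   (m, xs') <- repMaxM xs rep
   return (max m x, rep : xs')
 -- The rec block lets largest appear on the right-hand side of its own binding.
 doRepMaxM :: [Int] -> Maybe [Int]
 doRepMaxM xs = do
   rec (largest, xs') <- repMaxM xs largest
   return xs'
 ```
 As in the pure version, this only works because `repMaxM` never inspects `rep`; it
 merely places it in the output list.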
 ### Beware The Strictness
 Though Csongor points out other problems with the
 time traveling approach, I think he doesn't mention
 an important idea: you have to be _very_ careful about introducing
 strictness into your programs when running time-traveling code.
 For example, suppose we wanted to write a function,
 `takeUntilMax`, which would return the input list,
 cut off after the first occurrence of the maximum number.
 Following the same strategy, we come up with:
-{{< todo >}}This whole section, too. {{< /todo >}}
+{{< codelines "Haskell" "time-traveling/TakeMax.hs" 1 12 >}}
-### Leftovers
+In short, if we encounter our maximum number, we just return
 a list of that maximum number, since we do not want to recurse
 further. On the other hand, if we encounter a number that's
 _not_ the maximum, we continue our recursion.
-This is
+Unfortunately, this doesn't work; our program never terminates.
-what allows us to write the code above: the graph of `repMax xs largest`
+You may be thinking:
 effectively refers to itself. While traversing the list, it places references
 to itself in place of each of the elements, and thanks to laziness, these
 references are not evaluated.
-Let's try a more complicated example. How about instead of creating a new list,
+> Well, obviously this doesn't work! We didn't actually
-we return a `Map` containing the number of times each number occured, but only
+compute the maximum number properly, since we stopped
-when those numbers were a factor of the maximum numbers. Our expected output
+recursing too early. We need to traverse the whole list,
-will be:
+and not just the part before the maximum number.
-```
+To address this, we can reformulate our `takeUntilMax`
->>> countMaxFactors [1,3,3,9]
+function as follows:
-fromList [(1, 1), (3, 2), (9, 1)]
+{{< codelines "Haskell" "time-traveling/TakeMax.hs" 14 21 >}}
 ```
 Now we definitely compute the maximum correctly! Alas,
 this doesn't work either. The issue lies on lines 5 and 18,
 more specifically in the comparison `x == m`. Here, we 
 are trying to base the decision of what branch to take
 on a future value. This is simply impossible; to compute
 the value, we need to know the value!
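 The following sketch shows the shape of the problem. It is not the contents of
 `TakeMax.hs`: the clauses, the use of `minBound` for the empty list, and the
 `takeUntilMaxTwoPass` alternative are all assumptions added for illustration.
 ```Haskell
 -- Time-traveling attempt: m is the future maximum of the whole list.
 takeUntilMax :: [Int] -> Int -> (Int, [Int])
 takeUntilMax [] _ = (minBound, [])
 takeUntilMax (x:xs) m
   | x == m = (m', [x]) -- the guard forces the future value m
   | otherwise = (m', x : xs')
   where
     (rest, xs') = takeUntilMax xs m
     m' = max rest x
 -- Type checks, but forcing the result on a non-empty list demands m
 -- while m is still being computed.
 doTakeUntilMax :: [Int] -> [Int]
 doTakeUntilMax l = l'
   where (m, l') = takeUntilMax l m
 -- A conventional alternative: compute the maximum in a first pass, then cut.
 takeUntilMaxTwoPass :: [Int] -> [Int]
 takeUntilMaxTwoPass [] = []
 takeUntilMaxTwoPass xs = go xs
   where
     m = maximum xs
     go [] = []
     go (y:ys)
       | y == m = [y]
       | otherwise = y : go ys
 ```
 Evaluating something like `doTakeUntilMax [1,2]` should either hang or be reported by
 the GHC runtime as `<<loop>>`, whereas the two-pass version gives up on time traveling
 and simply terminates.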
 This is no 'silly mistake', either! In complicated programs
 that use time traveling, strictness lurks behind every corner.
 In my research work, I was at one point inserting a data structure into
 a set; however, deep in the structure was a data type containing
 a 'future' value, and using the default `Eq` instance!
 Adding the data structure to a set ended up invoking `(==)` (or perhaps
 some function from the `Ord` typeclass),
 which, in turn, tried to compare the lazily evaluated values.
 My code therefore didn't terminate, much like `takeUntilMax`.
 Debugging time traveling code is, in general,
 a pain. This is especially true since future values don't look any different
 from regular values. You can see it in the type signatures
 of `repMax` and `takeUntilMax`: the maximum number is just an `Int`!
 And yet, trying to see what its value is will kill the entire program.
 As always, remember Brian W. Kernighan's wise words:
 > Debugging is twice as hard as writing the code in the first place.
 Therefore, if you write the code as cleverly as possible, you are,
 by definition, not smart enough to debug it.
 ### Conclusion
 This is about it! In a way, time traveling can make code performing
 certain operations more expressive. Furthermore, even if it's not groundbreaking,
 thinking about time traveling is a good exercise to get familiar
 with lazy evaluation in general. I hope you found this useful!
--- a/content/blog/haskell_lazy_evaluation/length_2.png
+++ b/content/blog/haskell_lazy_evaluation/length_2.png
--- a/content/blog/haskell_lazy_evaluation/square_2.png
+++ b/content/blog/haskell_lazy_evaluation/square_2.png