Update lazy evaluation post with images and more.
|
@ -7,6 +7,7 @@ draft: true
|
|||
|
||||
<style>
|
||||
img, figure.small img { max-height: 20rem; }
|
||||
figure.tiny img { max-height: 15rem; }
|
||||
figure.medium img { max-height: 30rem; }
|
||||
</style>
|
||||
|
||||
|
@ -15,8 +16,7 @@ I recently got to use a very curious Haskell technique
|
|||
As production as research code gets, anyway!
|
||||
{{< /sidenote >}} time traveling. I say this with
|
||||
the utmost seriousness. This technique worked like
|
||||
magic for the problem I was trying to solve (which isn't
|
||||
interesting enough to be presented here in itself), and so
|
||||
magic for the problem I was trying to solve, and so
|
||||
I thought I'd share what I learned. In addition
|
||||
to the technique and its workings, I will also explain how
|
||||
time traveling can be misused, yielding computations that
|
||||
|
@ -74,7 +74,7 @@ value even come from?
|
|||
|
||||
Thus far, nothing too magical has happened. It's a little
|
||||
strange to expect the result of the computation to be
|
||||
given to us; however, thus far, it looks like wishful
|
||||
given to us; it just looks like wishful
|
||||
thinking. The real magic happens in Csongor's `doRepMax`
|
||||
function:
|
||||
|
||||
|
@ -100,8 +100,9 @@ Why is it called graph reduction, you may be wondering, if the runtime is
|
|||
manipulating syntax trees? To save on work, if a program refers to the
|
||||
same value twice, Haskell has both of those references point to the
|
||||
exact same graph. This violates the tree's property of having only one path
|
||||
from the root to any node, and makes our program a graph. Graphs that
|
||||
refer to themselves also violate the properties of a tree.
|
||||
from the root to any node, and makes our program a DAG (at least). Graph nodes that
|
||||
refer to themselves (which are also possible in the model) also violate the properties of a
|
||||
DAG, and thus, in general, we are working with graphs.
|
||||
{{< /sidenote >}} performing
|
||||
substitutions and simplifications as necessary until it reaches a final answer.
|
||||
What the lazy part means is that parts of the syntax tree that are not yet
|
||||
|
@ -184,7 +185,7 @@ we end up with the following:
|
|||
|
||||
{{< figure src="square_2.png" caption="The graph of `let x = square 5 in x + x` after `square 5` is reduced." >}}
|
||||
|
||||
There are two `25`s in the tree, and no more `square`s! We only
|
||||
There are two `25`s in the graph, and no more `square`s! We only
|
||||
had to evaluate `square 5` exactly once, even though `(+)`
|
||||
will use it twice (once for the left argument, and once for the right).
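
The sharing described above can be observed directly. The following is a minimal sketch, assuming a `square` that simply multiplies its argument by itself (its definition is not shown in this part of the post); the `trace` call and the name `shared` are added purely for illustration:

```Haskell
import Debug.Trace (trace)

-- Assumed definition of square, instrumented so that each actual
-- evaluation of the multiplication reports itself on stderr.
square :: Int -> Int
square n = trace "evaluating square" (n * n)

-- Both uses of x point at the same graph node, so the node is
-- reduced once and both references then see its value.
shared :: Int
shared = let x = square 5 in x + x
```

Evaluating `shared` in GHCi should report the evaluation only once, even though `x` appears twice in the sum.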
|
||||
|
||||
|
@ -207,7 +208,7 @@ fix f = let x = f x in x
|
|||
See how the definition of `x` refers to itself? This is what
|
||||
it looks like in graph form:
|
||||
|
||||
{{< figure src="fixpoint_1.png" caption="The initial graph of `let x = f x in x`." >}}
|
||||
{{< figure src="fixpoint_1.png" caption="The initial graph of `let x = f x in x`." class="tiny" >}}
|
||||
|
||||
I think it's useful to take a look at how this graph is processed. Let's
|
||||
pick `f = (1:)`. That is, `f` is a function that takes a list,
|
||||
|
@ -221,7 +222,8 @@ constant `1`, and then to `f`'s argument (`x`, in this case). As
|
|||
before, once we evaluated `f x`, we replaced the application with
|
||||
an indirection; in the image, this indirection is the top box. But the
|
||||
argument, `x`, is itself an indirection which points to the root of `f x`,
|
||||
thereby creating a cycle in our graph.
|
||||
thereby creating a cycle in our graph. Traversing this graph looks like
|
||||
traversing an infinite list of `1`s.
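
As a small sketch of this, using `fix` from `Data.Function` (which has the same definition as the one given above), we can take a finite prefix of the cyclic list; the names `ones` and `firstThree` are mine:

```Haskell
import Data.Function (fix)

-- The cyclic graph from the figure: a single (:) node whose
-- tail points back at the node itself.
ones :: [Int]
ones = fix (1:)

-- Only as much of the "infinite" list as we ask for is traversed.
firstThree :: [Int]
firstThree = take 3 ones
```

Here `firstThree` is `[1,1,1]`; asking for the whole of `ones` would, of course, never finish.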
|
||||
|
||||
Almost there! A node can refer to itself, and, when evaluated, it
|
||||
is replaced with its own value. Thus, a node can effectively reference
|
||||
|
@ -259,18 +261,16 @@ Now, let's write the initial graph for `doRepMax [1,2]`:
|
|||
{{< figure src="repmax_1.png" caption="The initial graph of `doRepMax [1,2]`." >}}
|
||||
|
||||
Other than our new notation, there's nothing too surprising here.
|
||||
At a high level, all we want is the second element of the tuple
|
||||
The first step of our hypothetical reduction would replace the application of `doRepMax` with its
|
||||
body, and create our graph's first cycle. At a high level, all we want is the second element of the tuple
|
||||
returned by `repMax`, which contains the output list. To get
|
||||
the tuple, we apply `repMax` to the list `[1,2]`, which itself
|
||||
the tuple, we apply `repMax` to the list `[1,2]` and the first element
|
||||
of its result. The list `[1,2]` itself
|
||||
consists of two uses of the `(:)` function.
|
||||
|
||||
The first step
|
||||
of our hypothetical reduction would replace the application of `doRepMax` with its
|
||||
body, and create our graph's first cycle:
|
||||
|
||||
{{< figure src="repmax_2.png" caption="The first step of reducing `doRepMax [1,2]`." >}}
|
||||
|
||||
Next, we would do the same for the body of `repMax`. In
|
||||
Next, we would also expand the body of `repMax`. In
|
||||
the following diagram, to avoid drawing a noisy amount of
|
||||
crossing lines, I marked the application of `fst` with
|
||||
a star, and replaced the two edges to `fst` with
|
||||
|
@ -362,7 +362,7 @@ element of the tuple, and replace `snd` with an indirection to it:
|
|||
|
||||
The second element of the tuple was a call to `(:)`, and that's what the mysterious
|
||||
force is processing now. Just like it did before, it starts by looking at the
|
||||
first argument of this list, which is head. This argument is a reference to
|
||||
first argument of this list, which is the list's head. This argument is a reference to
|
||||
the starred node, which, as we've established, eventually points to `2`.
|
||||
Another `2` pops up on the console.
|
||||
|
||||
|
@ -374,32 +374,197 @@ After removing the unused nodes, we are left with the following graph:
|
|||
|
||||
{{< figure src="repmax_10.png" caption="The result of reducing `doRepMax [1,2]`." >}}
|
||||
|
||||
As we would have expected, two `2`s are printed to the console.
|
||||
As we would have expected, two `2`s were printed to the console, and our
|
||||
final graph represents the list `[2,2]`.
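
To try the reduction out yourself, here is a sketch of `repMax` and `doRepMax` that is consistent with the walkthrough above. It is my reconstruction under that assumption, and may differ in details from Csongor's original code:

```Haskell
-- repMax takes the list and the (future) replacement value, and
-- returns the maximum it found along with the rewritten list.
-- It never inspects rep; it only places it into the output.
repMax :: [Int] -> Int -> (Int, [Int])
repMax [] rep = (rep, [])
repMax [x] rep = (x, [rep])
repMax (x : xs) rep = (max x m, rep : xs')
  where
    (m, xs') = repMax xs rep

-- The first element of the result is fed back in as the argument,
-- which creates the cycle in the graph.
doRepMax :: [Int] -> [Int]
doRepMax xs = xs'
  where
    (largest, xs') = repMax xs largest
```

With these definitions, `doRepMax [1,2]` evaluates to `[2,2]`, matching the final graph.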
|
||||
|
||||
### Using Time Traveling
|
||||
Is time traveling even useful? I would argue yes, especially
|
||||
in cases where Haskell's purity can make certain things
|
||||
difficult.
|
||||
|
||||
{{< todo >}}This whole section {{< /todo >}}
|
||||
As a first example, Csongor provides an assembler that works
|
||||
in a single pass. The challenge in this case is to resolve
|
||||
jumps to code segments occurring _after_ the jump itself;
|
||||
in essence, the address of the target code segment needs to be
|
||||
known before the segment itself is processed. Csongor's
|
||||
code uses the [Tardis monad](https://hackage.haskell.org/package/tardis-0.4.1.0/docs/Control-Monad-Tardis.html),
|
||||
which combines regular state, to which you can write and then
|
||||
later read from, and future state, from which you can
|
||||
read values before you write them. Check out
|
||||
[his complete example](https://kcsongor.github.io/time-travel-in-haskell-for-dummies/#a-single-pass-assembler-an-example) here.
|
||||
|
||||
Alternatively, here's an example from my research. I'll be fairly
|
||||
vague, since all of this is still in progress. The gist is that
|
||||
we have some kind of data structure (say, a list or a tree),
|
||||
and we want to associate with each element in this data
|
||||
structure a 'score' of how useful it is. There are many possible
|
||||
heuristics for picking 'scores'; a very simple one is
|
||||
to make it inversely proportional to the number of times
|
||||
an element occurs. To be more concrete, suppose
|
||||
we have some element type `Element`:
|
||||
|
||||
{{< codelines "Haskell" "time-traveling/ValueScore.hs" 5 6 >}}
|
||||
|
||||
Suppose also that our data structure is a binary tree:
|
||||
|
||||
{{< codelines "Haskell" "time-traveling/ValueScore.hs" 14 16 >}}
|
||||
|
||||
We then want to transform an input `ElementTree`, such as:
|
||||
|
||||
```Haskell
|
||||
Node A (Node A Empty Empty) Empty
|
||||
```
|
||||
|
||||
Into a scored tree, like:
|
||||
|
||||
```Haskell
|
||||
Node (A,0.5) (Node (A,0.5) Empty Empty) Empty
|
||||
```
|
||||
|
||||
Since `A` occurred twice, its score is `1/2 = 0.5`.
|
||||
|
||||
Let's define some utility functions before we get to the
|
||||
meat of the implementation:
|
||||
|
||||
{{< codelines "Haskell" "time-traveling/ValueScore.hs" 8 12 >}}
|
||||
|
||||
The `addElement` function simply increments the counter for a particular
|
||||
element in the map, adding the number `1` if it doesn't exist. The `getScore`
|
||||
function computes the score of a particular element, defaulting to `1.0` if
|
||||
it's not found in the map.
|
||||
|
||||
Just as before -- noticing that passing around the future values is getting awfully
|
||||
bothersome -- we write our scoring function as though we have
|
||||
a 'future value'.
|
||||
|
||||
{{< codelines "Haskell" "time-traveling/ValueScore.hs" 18 24 >}}
|
||||
|
||||
The actual `doAssignScores` function is pretty much identical to
|
||||
`doRepMax`:
|
||||
|
||||
{{< codelines "Haskell" "time-traveling/ValueScore.hs" 26 28 >}}
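
For experimenting outside of the post's source file, here is a self-contained sketch of the whole pipeline. The constructors of `Element`, the type names `BinaryTree` and `ScoredTree`, and the way the counts from the two subtrees are merged are all assumptions of mine; the real `ValueScore.hs` may thread its values differently:

```Haskell
import qualified Data.Map as Map

-- Hypothetical element type; only A is used in the example above.
data Element = A | B | C | D deriving (Eq, Ord, Show)

data BinaryTree a = Empty | Node a (BinaryTree a) (BinaryTree a)
  deriving (Eq, Show)

type ElementTree = BinaryTree Element
type ScoredTree = BinaryTree (Element, Float)

-- Increment the counter for an element, starting at 1.
addElement :: Element -> Map.Map Element Int -> Map.Map Element Int
addElement e = Map.insertWith (+) e 1

-- Score is the reciprocal of the count, defaulting to 1.0.
getScore :: Element -> Map.Map Element Int -> Float
getScore e m = maybe 1.0 (\n -> 1.0 / fromIntegral n) (Map.lookup e m)

-- The second argument is the future map of final counts. It is
-- only ever placed inside lazy score thunks, never inspected here.
assignScores :: ElementTree -> Map.Map Element Int -> (Map.Map Element Int, ScoredTree)
assignScores Empty _ = (Map.empty, Empty)
assignScores (Node e l r) future = (counts, Node (e, getScore e future) l' r')
  where
    (lCounts, l') = assignScores l future
    (rCounts, r') = assignScores r future
    counts = addElement e (Map.unionWith (+) lCounts rCounts)

-- Tie the knot: the counts we compute are the future we pass in.
doAssignScores :: ElementTree -> ScoredTree
doAssignScores t = t'
  where
    (counts, t') = assignScores t counts
```

Applied to `Node A (Node A Empty Empty) Empty`, this sketch produces the scored tree shown earlier, with both `A`s scored at `0.5`.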
|
||||
|
||||
There's quite a bit of repetition here, especially in the handling
|
||||
of future values - all of our functions now accept an extra
|
||||
future argument, and return a work-in-progress future value.
|
||||
This is what the `Tardis` monad, and its corresponding
|
||||
`TardisT` monad transformer, aim to address. Just like the
|
||||
`State` monad helps us avoid writing plumbing code for
|
||||
forward-traveling values, `Tardis` helps us do the same
|
||||
for backward-traveling ones.
|
||||
|
||||
#### Cycles in Monadic Bind
|
||||
|
||||
We've seen that we're able to write code like the following:
|
||||
|
||||
```Haskell
|
||||
(a, b) = f a c
|
||||
```
|
||||
|
||||
That is, we were able to write function calls that referenced
|
||||
their own return values. What if we try doing this inside
|
||||
a `do` block? Say, for example, we want to sprinkle some time
|
||||
traveling into our program, but don't want to add a whole new
|
||||
transformer into our monad stack. We could write code as follows:
|
||||
|
||||
```Haskell
|
||||
do
|
||||
(a, b) <- f a c
|
||||
return b
|
||||
```
|
||||
|
||||
Unfortunately, this doesn't work: in an ordinary `do` block, `a` is not in scope on the right-hand side of its own binding. However, it's entirely
|
||||
possible to enable this using the `RecursiveDo` language
|
||||
extension:
|
||||
|
||||
```Haskell
|
||||
{-# LANGUAGE RecursiveDo #-}
|
||||
```
|
||||
|
||||
Then, we can write the above as follows:
|
||||
|
||||
```Haskell
|
||||
do
|
||||
rec (a, b) <- f a c
|
||||
return b
|
||||
```
|
||||
|
||||
This power, however, comes at a price. It's not as straightforward
|
||||
to build graphs from recursive monadic computations; in fact,
|
||||
it's not possible in general. The translation of the above
|
||||
code uses `MonadFix`. A monad that satisfies `MonadFix` has
|
||||
an operation `mfix`, which is the monadic version of the `fix`
|
||||
function we saw earlier:
|
||||
|
||||
```Haskell
|
||||
mfix :: MonadFix m => (a -> m a) -> m a
|
||||
-- Regular fix, for comparison
|
||||
fix :: (a -> a) -> a
|
||||
```
|
||||
|
||||
To really understand how the translation works, check out the
|
||||
[paper on recursive do notation](http://leventerkok.github.io/papers/recdo.pdf).
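
As a rough illustration, here is the `repMax` trick written with `rec` in the `IO` monad, next to a hand-written version using `mfix`. The function `repMaxM` and both wrappers are hypothetical names of my own, and the second wrapper only approximates what the extension generates:

```Haskell
{-# LANGUAGE RecursiveDo #-}

import Control.Monad.Fix (mfix)

-- A monadic stand-in for repMax: it returns the maximum and a list
-- with every element replaced by rep, without ever inspecting rep.
repMaxM :: [Int] -> Int -> IO (Int, [Int])
repMaxM xs rep = return (maximum xs, map (const rep) xs)

-- With RecursiveDo, largest may be used in its own binding.
doRepMaxRec :: [Int] -> IO [Int]
doRepMaxRec xs = do
  rec (largest, xs') <- repMaxM xs largest
  return xs'

-- Roughly the same thing by hand: the lazy pattern (~) keeps the
-- function from forcing the tuple before the action has produced it.
doRepMaxFix :: [Int] -> IO [Int]
doRepMaxFix xs = fmap snd (mfix (\ ~(largest, _) -> repMaxM xs largest))
```

Both wrappers yield `[2,2]` for the input `[1,2]`. Note the lazy pattern in the second one: matching the tuple strictly would demand the result too early.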
|
||||
|
||||
### Beware The Strictness
|
||||
Though Csongor points out other problems with the
|
||||
time traveling approach, I think he doesn't mention
|
||||
an important idea: you have to be _very_ careful about introducing
|
||||
strictness into your programs when running time-traveling code.
|
||||
For example, suppose we wanted to write a function,
|
||||
`takeUntilMax`, which would return the input list,
|
||||
cut off after the first occurrence of the maximum number.
|
||||
Following the same strategy, we come up with:
|
||||
|
||||
{{< todo >}}This whole section, too. {{< /todo >}}
|
||||
{{< codelines "Haskell" "time-traveling/TakeMax.hs" 1 12 >}}
|
||||
|
||||
### Leftovers
|
||||
In short, if we encounter our maximum number, we just return
|
||||
a list of that maximum number, since we do not want to recurse
|
||||
further. On the other hand, if we encounter a number that's
|
||||
_not_ the maximum, we continue our recursion.
|
||||
|
||||
This is
|
||||
what allows us to write the code above: the graph of `repMax xs largest`
|
||||
effectively refers to itself. While traversing the list, it places references
|
||||
to itself in place of each of the elements, and thanks to laziness, these
|
||||
references are not evaluated.
|
||||
Unfortunately, this doesn't work; our program never terminates.
|
||||
You may be thinking:
|
||||
|
||||
Let's try a more complicated example. How about instead of creating a new list,
|
||||
we return a `Map` containing the number of times each number occurred, but only
|
||||
when those numbers were a factor of the maximum number. Our expected output
|
||||
will be:
|
||||
> Well, obviously this doesn't work! We didn't actually
|
||||
compute the maximum number properly, since we stopped
|
||||
recursing too early. We need to traverse the whole list,
|
||||
and not just the part before the maximum number.
|
||||
|
||||
```
|
||||
>>> countMaxFactors [1,3,3,9]
|
||||
To address this, we can reformulate our `takeUntilMax`
|
||||
function as follows:
|
||||
|
||||
fromList [(1, 1), (3, 2), (9, 1)]
|
||||
```
|
||||
{{< codelines "Haskell" "time-traveling/TakeMax.hs" 14 21 >}}
|
||||
|
||||
Now we definitely compute the maximum correctly! Alas,
|
||||
this doesn't work either. The issue lies on lines 5 and 18,
|
||||
more specifically in the comparison `x == m`. Here, we
|
||||
are trying to base the decision of what branch to take
|
||||
on a future value. This is simply impossible; to compute
|
||||
the value, we need to know the value!
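
To make the failure concrete, here is a sketch of the kind of definition that goes wrong, followed by a plain two-pass version for comparison. Both are my own reconstructions rather than the contents of `TakeMax.hs`, and `takeUntilMaxTwoPass` is a hypothetical name:

```Haskell
-- Time-traveling attempt. The guard x == m inspects the future
-- value m in order to decide what tuple to return, but m is
-- itself taken from that very tuple.
takeUntilMax :: [Int] -> Int -> (Int, [Int])
takeUntilMax [] m = (m, [])
takeUntilMax [x] _ = (x, [x])
takeUntilMax (x : xs) m
  | x == m = (newMax, [x])
  | otherwise = (newMax, x : xs')
  where
    (m', xs') = takeUntilMax xs m
    newMax = max x m'

-- Diverges for any list with two or more elements.
doTakeUntilMax :: [Int] -> [Int]
doTakeUntilMax xs = xs'
  where
    (largest, xs') = takeUntilMax xs largest

-- No time travel: find the maximum first, then cut the list.
takeUntilMaxTwoPass :: [Int] -> [Int]
takeUntilMaxTwoPass [] = []
takeUntilMaxTwoPass xs = before ++ take 1 rest
  where
    m = maximum xs
    (before, rest) = break (== m) xs
```

Only the single-element case of `doTakeUntilMax` terminates, because that clause never touches `m`. The two-pass version gives `[1,3]` for `[1,3,2]`, at the cost of traversing the list twice.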
|
||||
|
||||
This is no 'silly mistake', either! In complicated programs
|
||||
that use time traveling, strictness lurks behind every corner.
|
||||
In my research work, I was at one point inserting a data structure into
|
||||
a set; however, deep in the structure was a data type containing
|
||||
a 'future' value, and using the default `Eq` instance!
|
||||
Adding the data structure to a set ended up invoking `(==)` (or perhaps
|
||||
some function from the `Ord` typeclass),
|
||||
which, in turn, tried to compare the lazily evaluated values.
|
||||
My code therefore didn't terminate, much like `takeUntilMax`.
|
||||
|
||||
Debugging time traveling code is, in general,
|
||||
a pain. This is especially true since future values don't look any different
|
||||
from regular values. You can see it in the type signatures
|
||||
of `repMax` and `takeUntilMax`: the maximum number is just an `Int`!
|
||||
And yet, trying to see what its value is will kill the entire program.
|
||||
As always, remember Brian W. Kernighan's wise words:
|
||||
|
||||
> Debugging is twice as hard as writing the code in the first place.
|
||||
Therefore, if you write the code as cleverly as possible, you are,
|
||||
by definition, not smart enough to debug it.
|
||||
|
||||
### Conclusion
|
||||
This is about it! In a way, time traveling can make code performing
|
||||
certain operations more expressive. Furthermore, even if it's not groundbreaking,
|
||||
thinking about time traveling is a good exercise to get familiar
|
||||
with lazy evaluation in general. I hope you found this useful!
|
||||
|
|