96 lines
3.9 KiB
Markdown
96 lines
3.9 KiB
Markdown
|
---
|
||
|
title: "Clairvoyance for Good: Using Lazy Evaluation in Haskell"
|
||
|
date: 2020-05-03T20:05:29-07:00
|
||
|
tags: ["Haskell"]
|
||
|
draft: true
|
||
|
---
|
||
|
|
||
|
While tackling a project for work, I ran across a rather unpleasant problem.
|
||
|
I don't think it's valuable to go into the specifics here (it's rather
|
||
|
large and convoluted); however, the outcome of this experience led me to
|
||
|
discover a very interesting technique for lazy functional languages,
|
||
|
and I want to share what I learned.
|
||
|
|
||
|
### Time Traveling
|
||
|
Some time ago, I read [this post](https://kcsongor.github.io/time-travel-in-haskell-for-dummies/) by Csongor Kiss about time traveling in Haskell. It's
|
||
|
really cool, and makes a lot of sense if you have wrapped your head around
|
||
|
lazy evaluation. I'm going to present my take on it here, but please check out
|
||
|
Csongor's original post if you are interested.
|
||
|
|
||
|
Say that you have a list of integers, like `[3,2,6]`. Next, suppose that
|
||
|
you want to find the maximum value in the list. You can implement such
|
||
|
behavior quite simply with pattern matching:
|
||
|
|
||
|
```Haskell
|
||
|
myMax :: [Int] -> Int
|
||
|
myMax [] = error "Being total sucks"
|
||
|
myMax (x:xs) = max x $ myMax xs
|
||
|
```
|
||
|
|
||
|
You could even get fancy with a `fold`:
|
||
|
|
||
|
```Haskell
|
||
|
myMax :: [Int] -> Int
|
||
|
myMax = foldr1 max
|
||
|
```
|
||
|
|
||
|
All is well, and this is rather elementary Haskell. But now let's look at
|
||
|
something that Csongor calls the `repMax` problem:
|
||
|
|
||
|
> Imagine you had a list, and you wanted to replace all the elements of the
|
||
|
> list with the largest element, by only passing the list once.
|
||
|
|
||
|
How can we possibly do this in one pass? First, we need to find the maximum
|
||
|
element, and only then can we have something to replace the other numbers
|
||
|
with! It turns out, though, that we can just expect to have the future
|
||
|
value, and all will be well. Csongor provides the following example:
|
||
|
|
||
|
```Haskell {linenos=table}
|
||
|
repMax :: [Int] -> Int -> (Int, [Int])
|
||
|
repMax [] rep = (rep, [])
|
||
|
repMax [x] rep = (x, [rep])
|
||
|
repMax (l : ls) rep = (m', rep : ls')
|
||
|
where (m, ls') = repMax ls rep
|
||
|
m' = max m l
|
||
|
|
||
|
doRepMax :: [Int] -> [Int]
|
||
|
doRepMax xs = xs'
|
||
|
where (largest, xs') = repMax xs largest
|
||
|
```
|
||
|
|
||
|
In the above snippet, `repMax` expects to receive the maximum value of
|
||
|
its input list. At the same time, it also computes that maximum value,
|
||
|
returning it and the newly created list. `doRepMax` is where the magic happens:
|
||
|
the `where` clauses receives the maximum number from `repMax`, while at the
|
||
|
same time using that maximum number to call `repMax`!
|
||
|
|
||
|
This works because Haskell's evaluation model is, effectively,
|
||
|
[lazy graph reduction](https://en.wikipedia.org/wiki/Graph_reduction). That is,
|
||
|
you can think of Haskell as manipulating your code as
|
||
|
{{< sidenote "right" "tree-note" "a syntax tree," >}}
|
||
|
Why is it called graph reduction, you may be wondering, if the runtime is
|
||
|
manipulating syntax trees? To save on work, if a program refers to the
|
||
|
same value twice, Haskell has both of those references point to the
|
||
|
exact same graph. This violates the tree's property of having only one path
|
||
|
from the root to any node, and makes our program a graph. Graphs that
|
||
|
refer to themselves also violate the properties of a tree.
|
||
|
{{< /sidenote >}} performing
|
||
|
substitutions and simplifications as necessary until it reaches a final answer.
|
||
|
What the lazy part means is that parts of the syntax tree that are not yet
|
||
|
needed to compute the final answer can exist, unsimplied, in the tree. This is
|
||
|
what allows us to write the code above: the graph of `repMax xs largest`
|
||
|
effectively refers to itself. While traversing the list, it places references
|
||
|
to itself in place of each of the elements, and thanks to laziness, these
|
||
|
references are not evaluated.
|
||
|
|
||
|
Let's try a more complicated example. How about instead of creating a new list,
|
||
|
we return a `Map` containing the number of times each number occured, but only
|
||
|
when those numbers were a factor of the maximum numbers. Our expected output
|
||
|
will be:
|
||
|
|
||
|
```
|
||
|
>>> countMaxFactors [1,3,3,9]
|
||
|
|
||
|
fromList [(1, 1), (3, 2), (9, 1)]
|
||
|
```
|