Finish third post in CS325 series.

Danila Fedorin 2020-01-03 23:47:36 -08:00
parent a026e67a3b
commit 67181fb033
1 changed files with 155 additions and 21 deletions


@ -2,7 +2,6 @@
title: A Language for an Assignment - Homework 3
date: 2020-01-02T22:17:43-08:00
tags: ["Haskell", "Python", "Algorithms"]
draft: true
---
It rained in Sunriver on New Year's Eve, and it continued to rain
@ -137,7 +136,7 @@ Like we discussed, it finds the `k`th closest element (calling it `min`),
and counts how many elements that are __equal__ need to be included,
by setting the number to `k` at first, and subtracting 1 for every number
it encounters that's closer than `min`. Notice that we use the `valid!` and
`step!` macros, which implement the opertions we described above. Notice
`step!` macros, which implement the operations we described above. Notice
that the user doesn't deal with adding and subtracting numbers, and doing
comparisons. All they have to do is ask "am I still good to iterate?"
@ -172,6 +171,10 @@ of the traverser declaration. Rather, every time that a comparison for a travers
operation is performed, this expression is re-evaluated. This allows us to put
dynamic bounds on traversers `y` and `z`, one of which must not exceed the other.
Note also a new keyword that was just used: `sorted`. This is a harmless little
language feature that automatically calls `.sort()` on the first argument of
the function.
This is more than enough to work with. Let's move on to the implementation.
#### Implementation
@ -189,7 +192,7 @@ We need, once again, to generate temporary variables. We also need to keep track
which variables are traversers, and the properties of these traversers, throughout
each function of the language. We thus fall back to using `Control.Monad.State`:
{{< todo >}}Code for Translator Monad{{< /todo >}}
{{< codelines "Haskell" "cs325-langs/src/LanguageThree.hs" 198 198 >}}
There's one part of the state tuple that we haven't yet explained: the list of
statements.
@ -209,34 +212,47 @@ concatenating them. When the program is ready to use the generated statements
(say, when an `if`-statement needs to use the statements emitted by the condition
expression), we retrieve them from the monad:
{{< todo >}}Code for getting statements{{< /todo >}}
{{< codelines "Haskell" "cs325-langs/src/LanguageThree.hs" 228 234 >}}
I should note, for transparency, that there's a bug in my use of this function.
When I compile `if`-statements, I accidentally place statements generated by
the condition into the body of the `if`. This bug doesn't manifest
in the solutions to the homework problems, and so I decided not to spend any more
time on fixing it.
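To make the statement buffer concrete, here is a minimal Python sketch of the idea. The post's implementation is in Haskell and uses `Control.Monad.State`; the `Translator` class, its method names, and `compile_if` below are hypothetical and are not taken from `LanguageThree.hs`. The sketch also shows where statements emitted by an `if` condition are meant to end up: before the `if`, not inside its body.

```python
class Translator:
    """Hypothetical stand-in for the translator state:
    a temporary-variable counter plus a buffer of emitted statements."""

    def __init__(self):
        self.temp_count = 0
        self.statements = []

    def fresh_temp(self):
        # Generate a new, unused temporary variable name.
        self.temp_count += 1
        return f"temp{self.temp_count}"

    def emit(self, stmt):
        # Record a statement that must run before the current expression.
        self.statements.append(stmt)

    def collect(self):
        # Hand back everything emitted so far and reset the buffer.
        out, self.statements = self.statements, []
        return out


def compile_if(tr, translate_cond, body_lines):
    cond_expr = translate_cond(tr)  # may emit helper statements
    setup = tr.collect()            # retrieve them from the state
    # The helper statements go *before* the `if`, not into its body.
    return setup + [f"if {cond_expr}:"] + ["    " + line for line in body_lines]


def cond(tr):
    # A made-up condition that needs one helper statement.
    tmp = tr.fresh_temp()
    tr.emit(f"{tmp} = len(xs)")
    return f"{tmp} > 0"


lines = compile_if(Translator(), cond, ["return xs[0]"])
print("\n".join(lines))
```

The bug described above corresponds to placing `setup` after the `if` line instead of before it.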
##### Validating Traverser Declarations
We declare two separate types that hold traverser data. The first is a kind of "draft"
type, `TraverserData`. This record holds all possible configurations of a traverser
type, `TraverserData`:
{{< codelines "Haskell" "cs325-langs/src/LanguageThree.hs" 184 190 >}}
This record holds all possible configurations of a traverser
that occur as the program is iterating through the various `key: value` pairs in
the declaration. For instance, at the very beginning of processing a traverser declaration,
our program will use a "default" `TraverserData`, with all fields set to `Nothing` or
their default value. This value will then be modified by the first key/value pair,
changing, for instance, the list that the traverser operates on. This new modified
`TraverserData` will then be modified by the next key/value pair, and so on. This
is, effectively, a fold operation.
`TraverserData` will then be modified by the next key/value pair, and so on. Doing
this with every key/value pair (called an option in the snippet below)
is effectively a `foldl` operation.
{{< todo >}}Code for TraverserData{{< /todo >}}
{{< todo >}}Maybe sidenote about fold?{{< /todo >}}
{{< codelines "Haskell" "cs325-langs/src/LanguageThree.hs" 378 387 >}}
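For readers more comfortable with Python, the same left fold can be sketched with `functools.reduce`. This is only an illustration under assumptions: the field names and the option keys (`list`, `span`, `reverse`) are hypothetical stand-ins, not the actual fields of the Haskell `TraverserData` record.

```python
from dataclasses import dataclass, replace
from functools import reduce
from typing import Optional, Tuple


@dataclass(frozen=True)
class TraverserData:
    # Hypothetical "draft" record: everything starts out unset or at a default.
    list_name: Optional[str] = None
    bounds: Optional[Tuple[str, str]] = None
    reverse: bool = False


def apply_option(data, option):
    """Return a copy of `data` updated by a single key/value pair."""
    key, value = option
    if key == "list":
        return replace(data, list_name=value)
    if key == "span":
        return replace(data, bounds=value)
    if key == "reverse":
        return replace(data, reverse=value)
    raise ValueError(f"unknown traverser option: {key}")


# Start from the default draft and thread it through every option, left to right.
options = [("list", "xs"), ("reverse", True)]
draft = reduce(apply_option, options, TraverserData())
print(draft)
```

Each step receives the draft produced by the previous step, which is exactly the accumulator-passing shape of a left fold.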
The data may not have all the required fields until the very end, and its type
reflects that: `Maybe String` here, `Maybe TraverserBounds` there. We don't
want to deal with unwrapping the `Maybe a` values every time we use the traverser,
especially if we've done so before. So, we define a `ValidTraverserData` record,
especially if we've done so before. So, we define a `ValidTraverserData` record
that does not have `Maybe` arguments, and thus, has all the required data. At the
end of a traverser declaration, we attempt to translate a `TraverserData` into
a `ValidTraverserData`, invoking `fail` if we can't, and storing the `ValidTraverserData`
into the state otherwise. Then, every time we retrieve a traverser from the state,
it's guaranteed to be valid, and we have to spend no extra work unpacking it. We
into the state otherwise:
{{< codelines "Haskell" "cs325-langs/src/LanguageThree.hs" 408 420 >}}
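The draft-to-valid conversion can be sketched in Python as well. Again, this is a hedged illustration rather than the post's code: the field names are hypothetical, and raising `ValueError` stands in for the Haskell `fail`.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class TraverserData:
    # Draft record: fields may still be missing (None).
    list_name: Optional[str] = None
    bounds: Optional[Tuple[str, str]] = None
    reverse: bool = False


@dataclass
class ValidTraverserData:
    # Validated record: no optional fields, so nothing to unwrap later.
    list_name: str
    bounds: Tuple[str, str]
    reverse: bool


def validate(draft):
    """Convert a draft into a validated record, or raise if data is missing."""
    if draft.list_name is None:
        raise ValueError("traverser declaration is missing a list")
    if draft.bounds is None:
        raise ValueError("traverser declaration is missing bounds")
    return ValidTraverserData(draft.list_name, draft.bounds, draft.reverse)


valid = validate(TraverserData(list_name="xs", bounds=("0", "len(xs)")))
print(valid)
```

The check happens once, at the end of the declaration; every later use of the traverser works with a record whose fields are known to be present.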
Then, every time we retrieve a traverser from the state,
it's guaranteed to be valid, and we have to spend no extra work unpacking it. We
define a lookup monadic operation like this:
{{< todo >}}Code for getting ValidTraverserData{{< /todo >}}
{{< codelines "Haskell" "cs325-langs/src/LanguageThree.hs" 240 244 >}}
##### Compiling Macros
I didn't call them macros for no reason. Clearly, we don't want to generate
@ -259,17 +275,17 @@ named like the traverser. We use the `requireTraverser` monadic operation
to get the traverser associated with the given variable name, and then perform
the operation as intended. The `at!(t)` operation is straightforward:
{{< todo >}}Code for at!{{< /todo >}}
{{< codelines "Haskell" "cs325-langs/src/LanguageThree.hs" 317 319 >}}
The `at!(t,i)` is less so, since it deals with the intricacies of accessing
the list at either a positive or negative offset, depending on the direction
of the traverser. We implement a function to properly generate an expression for the offset:
{{< todo >}}Code for traverserIncrement{{< /todo >}}
{{< codelines "Haskell" "cs325-langs/src/LanguageThree.hs" 246 249 >}}
We then implement `at!(t,i)` as follows:
{{< todo >}}Code for at!{{< /todo >}}
{{< codelines "Haskell" "cs325-langs/src/LanguageThree.hs" 320 323 >}}
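As a rough Python sketch of the offset logic (the function names and the `reverse` flag are assumptions made for illustration, and the exact text the real compiler emits may differ), the direction of the traverser decides the sign of the offset:

```python
def offset_expr(index_name, offset, reverse):
    """Build the source text of an index `offset` steps ahead of the traverser.
    A reversed traverser moves toward smaller indices, so its offset is subtracted."""
    sign = "-" if reverse else "+"
    return f"{index_name}{sign}{offset}"


def at_expr(list_name, index_name, offset, reverse):
    """Build the source text of a list access at the given offset."""
    return f"{list_name}[{offset_expr(index_name, offset, reverse)}]"


print(at_expr("xs", "right", 1, reverse=False))  # xs[right+1]
print(at_expr("xs", "left", 1, reverse=True))    # xs[left-1]
```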
The most complicated macro is `bisect!`. It must be able to step the traverser,
and also return a tuple of two lists that the bisection yields. We also
@ -283,13 +299,131 @@ we must do one of two things:
1. Translate 1-to-1, and create a lambda, passing it to a fixed `bisect` function declared
elsewhere.
2. Translate to a nested function declaration, inlining the lambda.
{{< todo >}}Maybe sidenote about inline?{{< /todo >}}
2. Translate to a nested function declaration,
{{< sidenote "right" "inline-note" "inlining the lambda." >}}
Inlining, in this case, means replacing a call to a function with the function's body.
We do this to avoid the overhead of calling a function, which typically involves pushing
a frame onto the stack and other extra work. If the function is simple, like a single
comparison, it doesn't make sense to spend the effort calling it.
{{< /sidenote >}}
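For contrast, the first option could look roughly like the Python sketch below: a fixed helper that receives the condition as a lambda and calls it once per element. The name `bisect_with` and its signature are hypothetical; this is not code the compiler generates.

```python
def bisect_with(xs, start, pred):
    """Hypothetical fixed helper: split xs[start:] by `pred`.
    Returns (matching, rest, new_index), where new_index is the traverser's
    position after stepping through the rest of the list."""
    left, right = [], []
    i = start
    while i < len(xs):
        # `pred` is called once per element; this is the overhead inlining avoids.
        if pred(xs[i]):
            left.append(xs[i])
        else:
            right.append(xs[i])
        i += 1
    return left, right, i


left, right, bisector = bisect_with([5, 1, 4, 2], 0, lambda x: x < 3)
print(left, right, bisector)
```

With this approach the traverser index has to be returned and reassigned by the caller, whereas a nested function can update it in place.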
Since I quite like the idea of inlining a lambda, let's settle for that. To do this,
we pull a fresh temporary variable and declare a function, into which we place
the traverser iteration code, as well as the body of the lambda, with the variable
substituted for the list access expression. Here's the code:
substituted for the list access expression.
{{< sidenote "left" "nonlocal-note" "Here's the code:" >}}
Reading the lexical scope is one thing, but modifying it is another. To prevent
accidental changes to the variables outside a nested function, Python assumes
that variables assigned inside the function body are local to the function. Thus, to make
sure changing our variable (the traverser index) has an effect outside the function
(as it should) we must include the <code>nonlocal</code> keyword, telling
Python that we're not declaring a new, local variable, but mutating the old one.
{{< /sidenote >}}
{{< todo >}}Code for bisect!{{< /todo >}}
{{< codelines "Haskell" "cs325-langs/src/LanguageThree.hs" 342 363 >}}
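The effect of `nonlocal` described in the sidenote is easy to check in isolation. The following small Python example (with made-up function names) advances an index from inside a nested function, once with the keyword and once without it:

```python
def run_with_nonlocal():
    index = 0

    def advance():
        nonlocal index  # refer to the enclosing function's variable
        index = index + 1

    advance()
    advance()
    return index


def run_without_nonlocal():
    index = 0

    def advance():
        # Without `nonlocal`, the assignment makes `index` local to `advance`,
        # so reading it on the right-hand side fails.
        index = index + 1

    try:
        advance()
    except UnboundLocalError:
        return "error"
    return index


print(run_with_nonlocal())     # 2
print(run_without_nonlocal())  # error
```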
### The Output
Let's see what the compiler spits out:
```Python
from bisect import bisect
import random
def qselect(xs,k,c):
    if xs==[]:
        return 0
    bisector = 0
    pivot = random.randrange(len(xs))
    pivotE = xs.pop(pivot)
    def temp1():
        nonlocal bisector
        l = []
        r = []
        while bisector<len(xs):
            if c(xs[bisector])<c(pivotE):
                l.append(xs[bisector])
            else:
                r.append(xs[bisector])
            bisector = bisector+1
        return (l, r)
    (leftList,rightList) = temp1()
    if k>len(leftList)+1:
        return qselect(rightList, k-len(leftList)-1, c)
    elif k==len(leftList)+1:
        return pivotE
    else:
        return qselect(leftList, k, c)
def closestUnsorted(xs,k,n):
    min = qselect(list(xs), k, (lambda x: abs(x-n)))
    out = []
    countEqual = k
    iter = 0
    while iter<len(xs):
        if abs(xs[iter]-n)<abs(min-n):
            countEqual = countEqual-1
        iter = iter+1
        0
    iter = 0
    while iter<len(xs):
        if abs(xs[iter]-n)==abs(min-n) and countEqual>0:
            countEqual = countEqual-1
            out = out+[xs[iter]]
        elif abs(xs[iter]-n)<abs(min-n):
            out = out+[xs[iter]]
        iter = iter+1
        0
    return out
def closestSorted(xs,k,n):
    start = bisect(xs, n)
    counter = 0
    left = start
    right = start
    while counter!=k and left-1*1>=0 and right<len(xs):
        if abs(xs[left-1*1]-n)<abs(xs[right]-n):
            left = left-1
            0
        else:
            right = right+1
            0
        counter = counter+1
    while counter!=k and (left-1*1>=0 or right<len(xs)):
        if left-1*1>=0:
            left = left-1
            0
        else:
            right = right+1
            0
        counter = counter+1
    return xs[(left):(right)]
def xyz(xs,k):
    xs.sort()
    x = 0
    dest = []
    while x<len(xs):
        z = x+2
        y = x+1
        while y<z and z<len(xs):
            if xs[x]+xs[y]==xs[z]:
                dest = dest+[(xs[x], xs[y], xs[z])]
                z = z+1
                0
            elif xs[x]+xs[y]>xs[z]:
                z = z+1
                0
            else:
                y = y+1
                0
        x = x+1
        0
    return dest
```
Observe that the generated code just uses indices, `+`, `-`, and various comparison operators.
Our traverser is an example of a __zero cost abstraction__: a feature that lets us
work at a higher level, without worrying about adding, subtracting, and
comparing numbers, while costing nothing in the performance or safety
of the final output. Also observe the various standalone `0` statements. This is an issue
with the translator: traverser macros may not always yield an expression, but
the type of `translateExpr` and `translateStmt` effectively requires one. Thus,
when a macro doesn't generate anything useful, we give it the placeholder expression `0`.
That concludes this third post in the series. I hope to see you in the next one!