From 765d497724ef78341465e6ef267b71aaab19276c Mon Sep 17 00:00:00 2001 From: Danila Fedorin Date: Wed, 1 Jan 2020 11:02:13 -0800 Subject: [PATCH] Address missing problem and make some other improvements in CS325HW2 --- content/blog/01_cs325_languages_hw2.md | 83 +++++++++++++++++++++++--- 1 file changed, 74 insertions(+), 9 deletions(-) diff --git a/content/blog/01_cs325_languages_hw2.md b/content/blog/01_cs325_languages_hw2.md index c5f7d22..a55feba 100644 --- a/content/blog/01_cs325_languages_hw2.md +++ b/content/blog/01_cs325_languages_hw2.md @@ -25,7 +25,7 @@ using a slightly-modified `mergesort`__. The trick is to maintain a counter of inversions in every recursive call to `mergesort`, updating it every time we take an element from the {{< sidenote "right" "right-note" "right list" >}} -If this nomeclature is not clear to you, recall that +If this nomenclature is not clear to you, recall that mergesort divides a list into two smaller lists. The "right list" refers to the second of the two, because if you visualize the original list as a rectangle, and cut @@ -72,13 +72,19 @@ Again, let's start by visualizing what the solution will look like. How about th We divide the code into the same three steps that we described above. The first section is the initial state. Since it doesn't depend on anything, we expect it to be some kind of literal, like an integer. Next, we have the effect section, -which has access to variables such as "STATE" (to access the current state) -and "LEFT" (to access the left list), or "L" to access the "name" of the left list. -We use an `if`-statement to check if the origin of the element that was popped -(held in the "SOURCE" variable) is the right list (denoted by "R"). If it is, -we increment the counter (state) by the proper amount. In the combine step, we simply increment -the state by the counters from the left and right solutions, stored in "LSTATE" and "RSTATE". -That's it! 
+which has access to the variables below: + +* `STATE`, to manipulate or check the current state. +* `LEFT` and `RIGHT`, to access the two lists being merged. +* `L` and `R`, constants that are used to compare against the `SOURCE` variable. +* `SOURCE`, to denote which list a number came from. +* `LSTATE` and `RSTATE`, to denote the final states from the two subproblems. + +We use an `if`-statement to check if the element that was popped came +from the right list (by checking `SOURCE == R`). If it did, we increment the counter +(state) by the proper amount. In the combine step, which has access to the +same variables, we simply increment the state by the counters from the left +and right solutions, stored in `LSTATE` and `RSTATE`. That's it! #### Implementation The implementation is not tricky at all. We don't need to use monads like we did last @@ -89,7 +95,7 @@ uppercase "global" variables to lowercase. We'll do it like so: {{< codelines "Haskell" "cs325-langs/src/LanguageTwo.hs" 167 176 >}} -Note that we translated "L" and "R" to integer literals. We'll indicate the source of +Note that we translated `L` and `R` to integer literals. We'll indicate the source of each element with an integer, since there's no real point to representing it with a string or a variable. We'll need to be aware of this when we implement the actual, generic mergesort code. Let's do that now: @@ -151,3 +157,62 @@ we have to do is not specify any additional behavior. Cool, huh? That's the end of this post. If you liked this one (and the previous one!), keep an eye out for more! + +### Appendix (Missing Homework Question) +I should not view homework assignments on a small-screen device. 
There __was__ a third problem +on homework 2: + +{{< codelines "text" "cs325-langs/hws/hw2.txt" 46 65 >}} + +This is not a mergesort variant, and adding support for it into our second language +will prevent us from making it the neat specialized +{{< sidenote "right" "dsl-note" "DSL" >}} +DSL is a shortened form of "domain specific language", which was briefly +described in another sidenote while solving homework 1. +{{< /sidenote >}} that we just saw. We'll do something else, instead: +we'll use the language we defined in homework 1 to solve this +problem: + +``` +empty() = [0, 0]; +longest(xs) = + if |xs| != 0 + then _longest(longest(xs[0]), longest(xs[2])) + else empty(); +_longest(l, r) = [max(l[0], r[0]) + 1, max(l[0]+r[0], max(l[1], r[1]))]; +``` + +{{< sidenote "right" "terrible-note" "This is quite terrible." >}} +This is probably true with any program written in our first +language. +{{< /sidenote >}} In these 6 lines of code, there are two hacks +to work around the peculiarities of the language. + +At each recursive call, we want to keep track of both the depth +of the tree and the existing longest path. This is because +the longest path could be found either somewhere down +a subtree, or from combining the largest depths of +two subtrees. To return two values from a function in Python, +we'd use a tuple. Here, we use a list. + +Alarm bells should be going off here. There's no reason why we should +ever return an empty list from the recursive call: at the very least, we +want to return `[0,0]`. But placing such a list literal in a function +will trigger the special case insertion. So, we have to hide this literal +from the compiler. Fortunately, that's not too hard to do - the compiler +is pretty halfhearted in its inference of types. Simply putting +the literal behind a constant function (`empty`) does the trick. + +The program uses the subproblem depths multiple times in the +final computation. 
We thus probably want to assign these values +to names so we don't have to perform any repeated work. Since +the only two mechanisms for +{{< sidenote "right" "binding-note" "binding variables" >}} +To bind a variable means to assign a value to it. +{{< /sidenote >}} in this language are function calls +and list selectors, we use a helper function `_longest`, +which takes two subproblem solutions and combines them +into a new solution. It's pretty obvious that `_longest` +returns a list, so the compiler will try to insert a base +case. Fortunately, subproblem solutions are always +lists of two numbers, so this doesn't affect us too much.
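
For comparison, here is a rough sketch of the same recursion in Python, where tuples take the place of the two-element lists. It is not part of the homework code. It assumes that a tree is encoded as a nested list `[left, value, right]` with `[]` for the empty tree (a guess based on the `xs[0]` and `xs[2]` selectors above), and that path length is counted in edges. The name `longest` is borrowed from the program above; everything else is illustrative.

```python
def longest(tree):
    """Return (depth, longest_path) for a tree.

    Assumed encoding: [left, value, right], with [] as the empty tree.
    """
    if len(tree) == 0:
        # Plays the role of empty() in the program above.
        return (0, 0)
    # Binding the subproblem results to names avoids repeated work,
    # which is what the _longest helper is for in the original.
    l_depth, l_path = longest(tree[0])
    r_depth, r_path = longest(tree[2])
    depth = max(l_depth, r_depth) + 1
    # The longest path either passes through this node, joining the two
    # deepest branches, or lies entirely within one of the subtrees.
    path = max(l_depth + r_depth, l_path, r_path)
    return (depth, path)


leaf = [[], 1, []]
print(longest([leaf, 2, leaf]))
```

Here the tuple unpacking does the job that the extra function call does in the first language, so neither of the two hacks is needed.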