diff --git a/content/blog/stack_recursion.md b/content/blog/stack_recursion.md
new file mode 100644
index 0000000..27f7335
--- /dev/null
+++ b/content/blog/stack_recursion.md
@@ -0,0 +1,342 @@
+---
+title: Creating Recursive Functions in a Stack Based Language
+date: 2020-03-06T17:56:55-08:00
+tags: ["Programming Languages", "Haskell"]
+draft: true
+---
+
+In CS 381, Programming Language Fundamentals, many students chose to
+implement a stack-based language. Such languages are very neat,
+but two of the requirements for such languages may, at first,
+seem somewhat hard to satisfy:
+
+> Recursion/loops, . . . [and] . . . Procedures/functions with arguments (or some other abstraction mechanism)
+
+A while-loop makes enough sense. The most straightforward way to implement such a loop
+is to keep reading a boolean from the stack, and, if that boolean is true, running
+some sequence of instructions. But while loops do not give you procedures; they
+are not a sufficiently powerful abstraction mechanism for this assignment. So, we
+turn to functions.
+
+The first instinct in implementing functions is to fall back to the tried-and-true
+method of introducing more global state: we have a stack, but why don't we also
+add a mapping from function names to their definitions (an environment)? This
+works, but I feel like it goes somewhat against the whole idea of a stack-based
+language. We can do everything we need to do, entirely on the stack!
+
+### A Toy Language
+To make this post more concrete, let's define a small language. Small enough
+that it's easy to reason about, but complex enough to support functions. I won't
+be giving a Haskell-encoded abstract syntax definition; rather, let's work from
+concrete syntax.
+How about something like:
+
+{{< latex >}}
+\begin{aligned}
+\textit{cmd} ::= \; & \text{Pop} \; n \\
+    | \; & \text{Slide} \; n \\
+    | \; & \text{Offset} \; n \\
+    | \; & \text{Eq} \\
+    | \; & \text{PushI} \; i \\
+    | \; & \text{Add} \\
+    | \; & \text{Mul} \\
+    | \; & \textbf{if} \; \{ \textit{cmd}* \} \; \textbf{else} \; \{ \textit{cmd}* \} \\
+    | \; & \textbf{func} \; \{ \textit{cmd}* \} \\
+    | \; & \text{Call}
+\end{aligned}
+{{< /latex >}}
+
+Let's informally define the meanings of each of the described commands:
+
+1. \\(\\text{Pop} \\; n\\): Removes the top \\(n\\) elements from the stack.
+2. \\(\\text{Slide} \\; n \\): Removes the top \\(n\\) elements __after the first element on the stack__.
+The first element is not removed.
+3. \\(\\text{Offset} \\; n \\): Pushes an element from the stack onto the stack, again. When \\(n=0\\),
+the top element is pushed, when \\(n=1\\), the second element is pushed, and so on.
+4. \\(\\text{Eq}\\): Compares two numbers on top of the stack for equality. The numbers are removed,
+and replaced with a boolean indicating whether or not they are equal.
+5. \\(\\text{PushI} \\; i \\): Pushes an integer \\(i\\) onto the stack.
+6. \\(\\text{Add}\\): Adds two numbers on top of the stack. The two numbers are removed,
+and replaced with their sum.
+7. \\(\\text{Mul}\\): Multiplies two numbers on top of the stack. The two numbers are removed,
+and replaced with their product.
+8. \\(\\textbf{if}\\)/\\(\\textbf{else}\\): Runs the first list of commands if the boolean "true" is
+on top of the stack, and the second list of commands if the boolean is "false".
+9. \\(\\textbf{func}\\): Pushes a function with the given commands onto the stack.
+10. \\(\\text{Call}\\): Calls the function at the top of the stack. The function is removed,
+and its body is then executed.
+
+Great! Let's now write some dummy programs in our language (and switch to code blocks
+from LaTeX). How about a program that multiplies 4 and 5?
+
+```
+PushI 5
+PushI 4
+Mul
+```
+
+Next, let's try something more complicated.
+{{< sidenote "right" "contrived-note" "How about a program that checks if 3 is equal to 4, and returns 999 if they are equal, and 1 if they are not?" >}}
+I'm aware that this example is contrived. To minimize the cognitive load of working with our language, I've stripped it of many useful features, including
+inequalities. This is why the example may seem strange: I had to pose a question I could answer!
+{{< /sidenote >}}
+
+```
+PushI 4
+PushI 3
+Eq
+if { PushI 999 } else { PushI 1 }
+```
+
+Now, it's time for the actual meat: can our language do recursion?
+I claim that it can, but before we start hacking away, there's one more thing we need to do:
+establish a calling convention.
+
+### Be Conventional!
+Our language does not enforce any etiquette. You can easily create a function
+that pops every value off the stack, continuing until the stack is empty.
+You can equally easily make a function that fills your stack with random junk.
+With such potential for disorder, a programmer (maybe yourself) may experience some
+{{< sidenote "right" "anomie-note" "anomie." >}}
+Anomie is defined as "lack of the usual social or ethical standards in an individual or group" according
+to the Oxford dictionary.
+{{< /sidenote >}} To deal with this, we try to maintain a little bit of order in the midst
+of all the computational chaos. We will adopt calling conventions.
+
+When I say calling convention, I mean that every time we call a function, we do it in a
+methodical way. There are many possible such methods, but I propose the following:
+
+1. Since \\(\\text{Call}\\) requires that the function you're calling is at the top
+of the stack, we stick with that.
+2. If the function expects arguments, we push them on the stack right before the function.
+The first argument of the function should be second from the top of the stack (i.e.,
+{{< sidenote "right" "offset-note" "accessible from the function via \(\text{Offset} \; 0\))." >}}
+Note that \(\text{Call}\) removes the function from the stack, which is why the first argument
+ends up at the very top.
+{{< /sidenote >}} The second argument should follow, then the third, and so on.
+3. When a function returns, it should not leave its arguments on the stack. In their place,
+the function should leave its return value.
+4. A function does not modify the stack below the arguments it receives.
+
+Let's try this out with a basic function definition and call. How about a function that
+always returns 0, no matter what argument you give it? The function itself
+would look something like this:
+
+```
+PushI 0
+Slide 1
+```
+
+Here's how things will play out. When the function is called (and we assume
+that it is called correctly, of course), it will receive an integer
+on top of the stack. That may not, and likely will not, be the only thing on the stack.
+However, to stick by convention 4, we pretend that the rest of the stack is empty, and that
+trying to manipulate anything below the argument will result in an error. So, we can start by imagining
+an empty stack, with an integer \\(x\\) on top:
+
+{{< todo >}}Stack with x on top{{< /todo >}}
+
+Then, \\(\\text{PushI} \\; 0\\) will push 0 onto the stack:
+
+{{< todo >}}Stack with x then 0{{< /todo >}}
+
+\\(\\text{Slide} \\; 1\\) will then remove the one element below the top element: \\(x\\).
+We end up with the following stack:
+
+{{< todo >}}Stack with 0{{< /todo >}}
+
+The function has finished running, and we maintain convention 3: the function's
+return value is in place of its argument on the stack.
+
+All that's left is to call this function. Let's try calling the function
+with the number 15.
+We do this like so:
+
+```
+PushI 15
+func { PushI 0; Slide 1 }
+Call
+```
+
+The function must be on top of the stack, as per the semantics of our language
+(and, I suppose, convention 1). Because of this, we have to push it last.
+It only takes one argument, which we push on the stack first (so that it ends up
+below the function, as per convention 2). When both are pushed, we use
+\\(\\text{Call}\\) to execute the function, which will proceed as we've seen above.
+
+### Get Ahold of Yourself!
+How should a function call itself? Functions reside on the stack,
+and can therefore be manipulated in the same way as any other stack element. This
+opens up an opportunity for us: we can pass the function as an argument
+to itself! Then, when it needs to make a recursive call, all it must do
+is \\(\\text{Offset}\\) itself onto the top of the stack, then \\(\\text{Call}\\),
+and voila!
+
+Talk is great, of course, but talking doesn't give us any examples. Let's
+walk through an example of writing a recursive function this way. Let's
+try [factorial](https://en.wikipedia.org/wiki/Factorial)!
+
+The "easy" implementation of factorial is split into two cases:
+the base case, when \\(0! = 1\\) is computed, and the recursive case,
+in which we multiply the input number \\(n\\) by the result
+of computing factorial for \\(n-1\\). Accordingly, we will use
+the \\(\\textbf{if}\\)/\\(\\textbf{else}\\) command. We will
+make our function take two arguments, with the number input
+as the first ("top") argument, and the function itself as
+the second argument. Importantly, we do not want to destroy the input
+number by running \\(\\text{Eq}\\) directly on it. Instead,
+we first copy it using \\(\\text{Offset} \\; 0\\), then
+compare it to 0:
+
+```
+Offset 0
+PushI 0
+Eq
+```
+
+Let's walk through this.
+We start with only the arguments
+on the stack:
+
+{{< todo >}}image of stack of factorial call{{< /todo >}}
+
+Then, \\(\\text{Offset} \\; 0\\) duplicates the first argument
+(the number):
+
+{{< todo >}}image of stack of factorial call with number duped{{< /todo >}}
+
+Next, 0 is pushed onto the stack:
+
+{{< todo >}}image of stack of factorial call with number duped, and zero{{< /todo >}}
+
+Finally, \\(\\text{Eq}\\) performs the equality check:
+
+{{< todo >}}image of stack of factorial call with boolean{{< /todo >}}
+
+Great! Now, it's time to branch. What happens if "true" is on top of
+the stack? In that case, we don't need any more information.
+We always return 1 in this case. So, just like the function I described
+earlier, we can do the following:
+
+```
+PushI 1
+Slide 2
+```
+
+As before, we push the desired answer onto the stack:
+
+{{< todo >}}image of stack of factorial call with 1 on the stack{{< /todo >}}
+
+Then, to follow convention 3, we must get rid of the arguments. We do this by using \\(\\text{Slide}\\):
+
+{{< todo >}}image of stack of factorial call with only 1 on the stack{{< /todo >}}
+
+Great! The \\(\\textbf{if}\\) branch is now done, and we're left with the correct answer on the stack.
+Excellent!
+
+It's the recursive case that's more interesting. To make the recursive call, we must carefully
+set up our stack. Just like before, the function must be an argument to itself. As the second
+argument, it must end up lower on the stack than the number, so we push it first:
+
+```
+Offset 1
+```
+
+The result is as follows:
+
+{{< todo >}}image of stack of factorial call with extra function on top{{< /todo >}}
+
+Next, we must compute \\(n-1\\). This is pretty standard stuff:
+
+```
+Offset 1
+PushI -1
+Add
+```
+
+Why these three instructions?
+Well, with the function now on the top of the stack, the number argument is somewhat
+buried, and thus, we need to use \\(\\text{Offset} \\; 1\\) to get to it:
+
+{{< todo >}}image of stack of factorial call with extra function and number on top{{< /todo >}}
+
+Then, we push a negative number, and add it to the number on top. We end up with:
+
+{{< todo >}}image of stack of factorial call with extra function and number-1 on top{{< /todo >}}
+
+Finally, we have our arguments in order as per convention 2. To follow convention 1, we must
+now push the function onto the top of the stack:
+
+```
+Offset 1
+```
+
+The stack is now as follows:
+
+{{< todo >}}image of stack of factorial call with extra function and number-1
+and extra function on top{{< /todo >}}
+
+Good! With the preparations for the function call now complete, we take
+the leap:
+
+```
+Call
+```
+
+If the function behaves as promised, this will remove the top 3 elements
+from the stack. The top element, which is the function itself, will
+be removed by the \\(\\text{Call}\\) operator. The next two elements
+will be removed from the stack and replaced with the result of the function
+as per convention 3. The rest of the stack will remain untouched as
+per convention 4. We thus expect the stack to look as follows:
+
+{{< todo >}}image of stack of factorial call with (n-1)! on top{{< /todo >}}
+
+We're almost there! What's left is to perform the multiplication (we're
+safe to destroy the argument now, since we will not be needing it after
+this), and clean up the stack:
+
+```
+Mul
+Slide 1
+```
+
+The multiplication leaves us with \\(n(n-1)! = n!\\) on top of the stack,
+and the function argument below it:
+
+{{< todo >}}image of stack of factorial call with n! on top{{< /todo >}}
+
+We then use \\(\\text{Slide}\\) so that only the factorial is on the
+stack, satisfying convention 3:
+
+{{< todo >}}image of stack of factorial call with only n! on the stack{{< /todo >}}
+
+That's it!
+We have successfully executed the recursive case. The whole
+function is now as follows:
+
+```
+Offset 0
+PushI 0
+Eq
+if {
+    PushI 1
+    Slide 2
+} else {
+    Offset 1
+    Offset 1
+    PushI -1
+    Add
+    Offset 1
+    Call
+    Mul
+    Slide 1
+}
+```
+
+We can now invoke this function to compute \\(5!\\) as follows:
+
+```
+func { ... }
+PushI 5
+Offset 1
+Call
+```
+
+Here, the function is pushed first so that it sits below the number as the second
+argument, and \\(\\text{Offset} \\; 1\\) then copies it to the top of the stack for \\(\\text{Call}\\).
+
+Awesome! That's about it. We have made a stack-based language with full
+support for recursion and procedures. I hope this was helpful.
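
To check the walkthrough mechanically, here is a minimal sketch of an interpreter for the toy language in Haskell. The post gives no Haskell encoding, so everything in it is an assumption made for illustration: the names `Value`, `Cmd`, `IfElse`, `Func`, `step`, and `exec`, the use of `Either String` to report failures, and the choice that `if` pops the boolean it inspects (which the base case of the factorial walkthrough seems to require).

```haskell
import Control.Monad (foldM)

-- Hypothetical encoding: values that can live on the stack.
data Value = I Integer | B Bool | F [Cmd]
  deriving (Eq, Show)

-- Hypothetical abstract syntax mirroring the grammar in the post.
data Cmd
  = Pop Int
  | Slide Int
  | Offset Int
  | Eq
  | PushI Integer
  | Add
  | Mul
  | IfElse [Cmd] [Cmd]
  | Func [Cmd]
  | Call
  deriving (Eq, Show)

-- The head of the list is the top of the stack.
type Stack = [Value]

-- Run a single command; any stack the command cannot handle is an error.
step :: Cmd -> Stack -> Either String Stack
step (Pop n) s
  | n >= 0 && length s >= n = Right (drop n s)
step (Slide n) (x : s)
  | n >= 0 && length s >= n = Right (x : drop n s)
step (Offset n) s
  | n >= 0 && n < length s = Right (s !! n : s)
step Eq (I a : I b : s) = Right (B (a == b) : s)
step (PushI i) s = Right (I i : s)
step Add (I a : I b : s) = Right (I (a + b) : s)
step Mul (I a : I b : s) = Right (I (a * b) : s)
-- Assumption: the boolean is removed before the chosen branch runs.
step (IfElse t e) (B c : s) = exec (if c then t else e) s
step (Func body) s = Right (F body : s)
-- Call removes the function, then runs its body on the remaining stack.
step Call (F body : s) = exec body s
step cmd _ = Left ("cannot run " ++ show cmd)

-- Run a list of commands from left to right.
exec :: [Cmd] -> Stack -> Either String Stack
exec cmds s = foldM (flip step) s cmds

-- The factorial function body from the post.
factorial :: [Cmd]
factorial =
  [ Offset 0, PushI 0, Eq
  , IfElse
      [PushI 1, Slide 2]
      [Offset 1, Offset 1, PushI (-1), Add, Offset 1, Call, Mul, Slide 1]
  ]

-- The invocation from the post, with the elided body filled in.
fiveFactorial :: [Cmd]
fiveFactorial = [Func factorial, PushI 5, Offset 1, Call]

main :: IO ()
main = print (exec fiveFactorial [])  -- prints: Right [I 120]
```

Under these assumptions, the stack after the outermost call holds only the result, which is what conventions 3 and 4 promise. The same `exec` can run the earlier examples, such as `exec [PushI 15, Func [PushI 0, Slide 1], Call] []`.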