2 changed files with 3 additions and 470 deletions
--- a/assets/scss/gmachine.scss
+++ b/assets/scss/gmachine.scss
@@ -1,45 +0,0 @@
-$basic-border: 1px solid #bfbfbf;
-
-.gmachine-instruction {
-    display: flex;
-    border: $basic-border;
-    border-radius: 2px;
-}
-
-.gmachine-instruction-name {
-    padding: 10px;
-    border-right: $basic-border;
-    flex-grow: 1;
-    flex-basis: 20%;
-    text-align: center;
-}
-
-.gmachine-instruction-sem {
-    width: 100%;
-    flex-grow: 4;
-}
-
-.gmachine-inner {
-    border-bottom: $basic-border;
-    width: 100%;
-
-    &:last-child {
-        border-bottom: none;
-    }
-}
-
-.gmachine-inner-label {
-    padding: 10px;
-    font-weight: bold;
-}
-
-.gmachine-inner-text {
-    padding: 10px;
-    text-align: right;
-    flex-grow: 1;
-}
-
-.gmachine-instruction-name, .gmachine-inner-label, .gmachine-inner {
-    display: flex;
-    align-items: center;
-}
--- a/content/blog/05_compiler_execution.md
+++ b/content/blog/05_compiler_execution.md
@@ -4,7 +4,6 @@ date: 2019-08-06T14:26:38-07:00
 draft: true
 tags: ["C and C++", "Functional Languages", "Compilers"]
 ---
-{{< gmachine_css >}}
 We now have trees representing valid programs in our language,
 and it's time to think about how to compile them into machine code,
 to be executed on hardware. But __how should we execute programs__?
@@ -135,433 +134,12 @@ to apply a function, we'll follow the corresponding recipe for
 that function, and end up with a new tree that we continue evaluating.

 ### G-machine
-"Instructions" is a very generic term. Specifically, we will be creating instructions
+"Instructions" is a very generic term. We will be creating instructions
 for a [G-machine](https://link.springer.com/chapter/10.1007/3-540-15975-4_50),
 an abstract architecture which we will use to reduce our graphs. The G-machine
 is stack-based - all operations push and pop items from a stack. The machine
 will also have a "dump", which is a stack of stacks; this will help with
 separating function calls.

-We will follow the same notation as Simon Peyton Jones in
-[his book](https://www.microsoft.com/en-us/research/wp-content/uploads/1992/01/student.pdf)
-, which was my source of truth when implementing my compiler. The machine
-will be executing instructions that we give it, and as such, it must have
-an instruction queue, which we will reference as \\(i\\). We will write
-\\(x:i\\) to mean "an instruction queue that starts with
-an instruction x and ends with instructions \\(i\\)". A stack machine
-obviously needs to have a stack - we will call it \\(s\\), and will
-adopt a similar notation to the instruction queue: \\(a\_1, a\_2, a\_3 : s\\)
-will mean "a stack with the top values \\(a\_1\\), \\(a\_2\\), and \\(a\_3\\),
-and remaining instructions \\(s\\)".
-
-There's one more thing the G-machine will have that we've not yet discussed at all,
-and it's needed because of the following quip earlier in the post:
-
-> When we evaluate a tree, we can substitute it in-place with what it evaluates to. 
-
-How can we substitute a value in place? Surely we won't iterate over the entire
-tree and look for an occurence of the tree we evaluted. Rather, wouldn't it be
-nice if we could update all references to a tree to be something else? Indeed,
-we can achieve this effect by using __pointers__. I don't mean specifically
-C/C++ pointers - I mean the more general concept of "an address in memory".
-The G-machine has a __heap__, much like the heap of a C/C++ process. We
-can create a tree node on the heap, and then get an __address__ of the node.
-We then have trees use these addresses to link their child nodes.
-If we want to replace a tree node with its reduced form, we keep
-its address the same, but change the value on the heap.
-This way, all trees that reference the node we change become updated,
-without us having to change them - their child address remains the same,
-but the child has now been updated. We represent the heap
-using \\(h\\). We write \\(h[a : v]\\) to say "the address \\(a\\) points
-to value \\(v\\) in the heap \\(h\\)". Now you also know why we used
-the letter \\(a\\) when describing values on the stack - the stack contains
-addresses of (or pointers to) tree nodes.
-
-_Compiling Functional Languages: a tutorial_ also keeps another component
-of the G-machine, the __global map__, which maps function names to addresses of nodes
-that represent them. We'll stick with this, and call this global map \\(m\\).
-
-Finally, let's talk about what kind of nodes our trees will be made of.
-We don't have to include every node that we've defined as a subclass of
-`ast` - some nodes we can compile to instructions, without having to build
-them. We will also include nodes that we didn't need for to represent expressions.
-Here's the list of nodes types we'll have:
-
-* `NInt` - represents an integer.
-* `NApp` - represents an application (has two children).
-* `NGlobal` - represents a global function (like the `f` in `f x`).
-* `NInd` - an "indrection" node that points to another node. This will help with "replacing" a node.
-* `NData` - a "packed" node that will represent a constructor with all the arguments.
-
-With these nodes in mind, let's try defining some instructions for the G-machine.
-We start with instructions we'll use to assemble new version of function body trees as we discussed above.
-First up is __PushInt__:
-
-{{< gmachine "PushInt" >}}
-    {{< gmachine_inner "Before">}}
-    \( \text{PushInt} \; n : i \quad s \quad h \quad m \)
-    {{< /gmachine_inner >}}
-    {{< gmachine_inner "After" >}}
-    \( i \quad a : s \quad h[a : \text{NInt} \; n] \quad m \)
-    {{< /gmachine_inner >}}
-    {{< gmachine_inner "Description" >}}
-    Push an integer \(n\) onto the stack.
-    {{< /gmachine_inner >}}
-{{< /gmachine >}}
-
-Let's go through this. We start with an instruction queue
-with `PushInt n` on top. We allocate a new `NInt` with the
-number `n` on the heap at address \\(a\\). We then push
-the address of the `NInt` node on top of the stack. Next,
-__PushGlobal__:
-
-{{< gmachine "PushGlobal" >}}
-    {{< gmachine_inner "Before">}}
-    \( \text{PushGlobal} \; f : i \quad s \quad h \quad m[f : a] \)
-    {{< /gmachine_inner >}}
-    {{< gmachine_inner "After" >}}
-    \( i \quad a : s \quad h \quad m[f : a] \)
-    {{< /gmachine_inner >}}
-    {{< gmachine_inner "Description" >}}
-    Push a global function \(f\) onto the stack.
-    {{< /gmachine_inner >}}
-{{< /gmachine >}}
-
-We don't allocate anything new on the heap for this one - 
-we already have a node for the global function. Next up,
-__Push__:
-
-{{< gmachine "Push" >}}
-    {{< gmachine_inner "Before">}}
-    \( \text{Push} \; n : i \quad a_0, a_1, ..., a_n : s \quad h \quad m \)
-    {{< /gmachine_inner >}}
-    {{< gmachine_inner "After" >}}
-    \( i \quad a_n, a_0, a_1, ..., a_n : s \quad h \quad m \)
-    {{< /gmachine_inner >}}
-    {{< gmachine_inner "Description" >}}
-    Push a value at offset \(n\) from the top of the stack onto the stack.
-    {{< /gmachine_inner >}}
-{{< /gmachine >}}
-
-We define this instruction to work if and only if there exists an address
-on the stack at offset \\(n\\). We take the value at that offset, and
-push it onto the stack again. This can be helpful for something like
-`f x x`, where we use the same tree twice. Speaking of that - let's
-define an instruction to combine two nodes into an application:
-
-{{< gmachine "MkApp" >}}
-    {{< gmachine_inner "Before">}}
-    \( \text{MkApp} : i \quad a_0, a_1 : s \quad h \quad m \)
-    {{< /gmachine_inner >}}
-    {{< gmachine_inner "After" >}}
-    \( i \quad a : s \quad h[ a : \text{NApp} \; a_0 \; a_1] \quad m \)
-    {{< /gmachine_inner >}}
-    {{< gmachine_inner "Description" >}}
-    Apply a function at the top of the stack to a value after it.
-    {{< /gmachine_inner >}}
-{{< /gmachine >}}
-
-We pop two things off the stack: first, the thing we're applying, then
-the thing we apply it to. We then create a new node on the heap
-that is an `NApp` node, with its two children being the nodes we popped off.
-Finally, we push it onto the stack.
-
-Let's try use these instructions to get a feel for it.
-{{< todo >}}Add an example, probably without notation.{{< /todo >}}
-
-Having defined instructions to __build__ graphs, it's now time
-to move on to instructions to __reduce__ graphs - after all,
-we're performing graph reduction. A crucial instruction for the
-G-machine is __Unwind__. What Unwind does depends on what
-nodes are on the stack. Its name comes from how it behaves
-when the top of the stack is an `NApp` node that is at
-the top of a potentially long chain of applications: given
-an application node, it pushes its left hand side onto the stack.
-It then __continues to run Unwind__. This is effectively a while loop:
-applications nodes continue to be expanded this way until the left
-hand side of an application is finally something
-that __isn't__ an application. Let's write this rule as follows:
-
-{{< gmachine "Unwind-App" >}}
-    {{< gmachine_inner "Before">}}
-    \( \text{Unwind} : i \quad a : s \quad h[a : \text{NApp} \; a_0 \; a_1] \quad m \)
-    {{< /gmachine_inner >}}
-    {{< gmachine_inner "After" >}}
-    \( \text{Unwind} : i \quad a_0, a : s \quad h[ a : \text{NApp} \; a_0 \; a_1] \quad m \)
-    {{< /gmachine_inner >}}
-    {{< gmachine_inner "Description" >}}
-    Unwind an application by pushing its left node.
-    {{< /gmachine_inner >}}
-{{< /gmachine >}}
-
-Let's talk about what happens when Unwind hits a node that isn't an application. Of all nodes
-we have described, `NGlobal` seems to be the most likely to be on top of the stack after
-an application chain has finished unwinding. In this case we want to run the instructions
-for building the referenced global function. Naturally, these instructions
-may reference the arguments of the application. We can find the first argument
-by looking at offset 1 on the stack, which will be an `NApp` node, and then going
-to its right child. The same can be done for the second and third arguments, if
-they exist. But this doesn't feel right - we don't want to constantly be looking
-at the right child of a node on the stack. Instead, we replace each application
-node on the stack with its right child. Once that's done, we run the actual
-code for the global function:
-
-{{< gmachine "Unwind-Global" >}}
-    {{< gmachine_inner "Before">}}
-    \( \text{Unwind} : i \quad a, a_0, a_1, ..., a_n : s \quad h[\substack{a : \text{NGlobal} \; n \; c \\ a_k : \text{NApp} \; a_{k-1} \; a_k'}] \quad m \)
-    {{< /gmachine_inner >}}
-    {{< gmachine_inner "After" >}}
-    \( c \quad a_0', a_1', ..., a_n', a_n : s \quad h[\substack{a : \text{NGlobal} \; n \; c \\ a_k : \text{NApp} \; a_{k-1} \; a_k'}] \quad m \)
-    {{< /gmachine_inner >}}
-    {{< gmachine_inner "Description" >}}
-    Call a global function.
-    {{< /gmachine_inner >}}
-{{< /gmachine >}}
-
-In this rule, we used a general rule for \\(a\_k\\), in which \\(k\\) is any number
-between 0 and \\(n\\). We also expect the `NGlobal` node to contain two parameters,
-\\(n\\) and \\(c\\). \\(n\\) is the arity of the function (the number of arguments
-it expects), and \\(c\\) are the instructions to construct the function's tree.
-
-The attentive reader will have noticed a catch: we kept \\(a\_n\\) on the stack!
-This once again goes back to replacing a node in-place. \\(a\_n\\) is the address of the "root" of the
-whole expression we're simplifying. Thus, to replace the value at this address, we need to keep
-the address until we have something to replace it with.
-
-There's one more thing that can be found at the leftmost end of a tree of applications: `NInd`.
-We simply replace `NInd` with the node it points to, and resume Unwind:
-
-{{< gmachine "Unwind-Ind" >}}
-    {{< gmachine_inner "Before">}}
-    \( \text{Unwind} : i \quad a : s \quad h[a : \text{NInd} \; a' ] \quad m \)
-    {{< /gmachine_inner >}}
-    {{< gmachine_inner "After" >}}
-    \( \text{Unwind} : i \quad a' : s \quad h[a : \text{NInd} \; a' ] \quad m \)
-    {{< /gmachine_inner >}}
-    {{< gmachine_inner "Description" >}}
-    Replace indirection node with its target.
-    {{< /gmachine_inner >}}
-{{< /gmachine >}}
-
-We've talked about replacing a node, and we've talked about indirection, but we
-haven't yet an instruction to perform these actions. Let's do so now:
-
-{{< gmachine "Update" >}}
-    {{< gmachine_inner "Before">}}
-    \( \text{Update} \; n : i \quad a,a_0,a_1,...a_n : s \quad h \quad m \)
-    {{< /gmachine_inner >}}
-    {{< gmachine_inner "After" >}}
-    \( i \quad a_0,a_1,...,a_n : s \quad h[a_n : \text{NInd} \; a ] \quad m \)
-    {{< /gmachine_inner >}}
-    {{< gmachine_inner "Description" >}}
-    Transform node at offset into an indirection.
-    {{< /gmachine_inner >}}
-{{< /gmachine >}}
-
-This instruction pops an address from the top of the stack, and replaces
-a node at the given offset with an indirection to the popped node. After
-we evaluate a function call, we will use `update` to make sure it's
-not evaluated again.
-
-Now, let's talk about data structures. We have mentioned an `NData` node,
-but we've given no explanation of how it will work. Obviously, we need
-to distinguish values of a type created by different constructors:
-If we have a value of type `List`, it could have been created either
-using `Nil` or `Cons`. Depending on which constructor was used to
-create a value of a type, we might treat it differently. Furthermore,
-it's not always possible to know what constructor was used to
-create what value at compile time. So, we need a way to know,
-at runtime, how the value was constructed. We do this using
-a __tag__. A tag is an integer value that will be contained in
-the `NData` node. We assign a tag number to each constructor,
-and when we create a node with that constructor, we set
-the node's tag accordingly. This way, we can easily
-tell if a `List` value is a `Nil` or a `Cons`, or
-if a `Tree` value is a `Node` or a `Leaf`.
-
-To operate on `NData` nodes, we will need two primitive operations: __Pack__ and __Split__.
-Pack will create an `NData` node with a tag from some number of nodes
-on the stack. These nodes will be placed into a dynamically
-allocated array:
-
-{{< gmachine "Pack" >}}
-    {{< gmachine_inner "Before">}}
-    \( \text{Pack} \; t \; n : i \quad a_1,a_2,...a_n : s \quad h \quad m \)
-    {{< /gmachine_inner >}}
-    {{< gmachine_inner "After" >}}
-    \( i \quad a : s \quad h[a : \text{NData} \; t \; [a_1, a_2, ..., a_n] ] \quad m \)
-    {{< /gmachine_inner >}}
-    {{< gmachine_inner "Description" >}}
-    Pack \(n\) nodes from the stack into a node with tag \(t\).
-    {{< /gmachine_inner >}}
-{{< /gmachine >}}
-
-Split will do the opposite, by popping
-of an `NData` node and moving the contents of its
-array onto the stack:
-
-{{< gmachine "Split" >}}
-    {{< gmachine_inner "Before">}}
-    \( \text{Split} : i \quad a : s \quad h[a : \text{NData} \; t \; [a_1, a_2, ..., a_n] ] \quad m \)
-    {{< /gmachine_inner >}}
-    {{< gmachine_inner "After" >}}
-    \( i \quad a_1, a_2, ...,a_n : s \quad h[a : \text{NData} \; t \; [a_1, a_2, ..., a_n] ] \quad m \)
-    {{< /gmachine_inner >}}
-    {{< gmachine_inner "Description" >}}
-    Unpack a data node on top of the stack.
-    {{< /gmachine_inner >}}
-{{< /gmachine >}}
-
-These two instructions are a good start, but we're missing something
-fairly big: case analysis. After we've constructed a data type,
-to perform operations on it, we want to figure out what
-constructor and values which were used to create it. In order
-to implement patterns and case expressions, we'll need another
-instruction that's capable of making a decision based on
-the tag of an `NData` node. We'll call this instruction __Jump__,
-and define it to contain a mapping from tags to instructions
-to be executed for a value of that tag. For instance,
-if the constructor `Nil` has tag 0, and `Cons` has tag 1,
-the mapping for the case expression of a length function
-could be written as \\([0 \\rightarrow [\\text{PushInt} \; 0], 1 \\rightarrow [\\text{PushGlobal} \; \\text{length}, ...] ]\\).
-Let's define the rule for it:
-
-{{< gmachine "Jump" >}}
-    {{< gmachine_inner "Before">}}
-    \( \text{Jump} [..., t \rightarrow i_t, ...] : i \quad a : s \quad h[a : \text{NData} \; t \; as ] \quad m \)
-    {{< /gmachine_inner >}}
-    {{< gmachine_inner "After" >}}
-    \( i_t, i \quad a : s \quad h[a : \text{NData} \; t \; as ] \quad m \)
-    {{< /gmachine_inner >}}
-    {{< gmachine_inner "Description" >}}
-    Execute instructions corresponding to a tag.
-    {{< /gmachine_inner >}}
-{{< /gmachine >}}
-
-Alright, we've made it through the interesting instructions,
-but there's still a few that are needed, but less shiny and cool.
-For instance: imagine we've made a function call. As per the
-rules for Unwind, we've placed the right hand sides of all applications
-on the stack, and ran the instructions provided by the function,
-creating a final graph. We then continue to reduce this final
-graph. But we've left the function parameters on the stack!
-This is untidy. We define a __Slide__ instruction,
-which keeps the address at the top of the stack, but gets
-rid of the next \\(n\\) addresses:
-
-{{< gmachine "Slide" >}}
-    {{< gmachine_inner "Before">}}
-    \( \text{Slide} \; n : i \quad a_0, a_1, ..., a_n : s \quad h \quad m \)
-    {{< /gmachine_inner >}}
-    {{< gmachine_inner "After" >}}
-    \( i \quad a_0 : s \quad h \quad m \)
-    {{< /gmachine_inner >}}
-    {{< gmachine_inner "Description" >}}
-    Remove \(n\) addresses after the top from the stack.
-    {{< /gmachine_inner >}}
-{{< /gmachine >}}
-
-Just a few more. Next up, we observe that we have not
-defined any way for our G-machine to perform arithmetic,
-or indeed, any primitive operations. Since we've
-not defined any built-in type for booleans,
-let's avoid talking about operations like `<`, `==`,
-and so on (in fact, we've omitted them from the grammar so far).
-So instead, let's talk about the [closed](https://en.wikipedia.org/wiki/Closure_(mathematics)) operations,
-namely `+`, `-`, `*`, and `/`. We'll define a special instruction for
-them, called __BinOp__:
-
-{{< gmachine "BinOp" >}}
-    {{< gmachine_inner "Before">}}
-    \( \text{BinOp} \; \text{op} : i \quad a_0, a_1 : s \quad h[\substack{a_0 : \text{NInt} \; n \\ a_1 : \text{NInt} \; m}] \quad m \)
-    {{< /gmachine_inner >}}
-    {{< gmachine_inner "After" >}}
-    \( i \quad a : s \quad h[\substack{a_0 : \text{NInt} \; n \\ a_1 : \text{NInt} \; m \\ a : \text{NInt} \; (\text{op} \; n \; m)}] \quad m \)
-    {{< /gmachine_inner >}}
-    {{< gmachine_inner "Description" >}}
-    Apply a binary operator on integers.
-    {{< /gmachine_inner >}}
-{{< /gmachine >}}
-
-Nothing should be particularly surprising here:
-the instruction pops two integers off the stack, applies the given
-binary operation to them, and places the result on the stack.
-
-We're not yet done with primitive operations, though.
-We have a lazy graph reduction machine, which means
-something like the expression `3*(2+6)` might not
-be a binary operator applied to two `NInt` nodes.
-We keep around graphs until they __really__ need to
-be reduced. So now we need an instruction to trigger
-reducing a graph, to say, "we need this value now".
-We call this instruction __Eval__. This is where
-the dump finally comes in!
-
-{{< todo >}}Actually show the dump in the previous evaluasion rules.{{< /todo >}}
-
-When we execute Eval, another graph becomes our "focus", and we switch
-to a new stack. We obviously want to return from this once we've finished
-evaluating what we "focused" on, so we must store the program state somewhere -
-on the dump. Here's the rule:
-
-{{< gmachine "Eval" >}}
-    {{< gmachine_inner "Before">}}
-    \( \text{Eval} : i \quad a : s \quad d \quad h \quad m \)
-    {{< /gmachine_inner >}}
-    {{< gmachine_inner "After" >}}
-    \( [\text{Unwind}] \quad [a] \quad \langle i, s\rangle : d \quad h \quad m \)
-    {{< /gmachine_inner >}}
-    {{< gmachine_inner "Description" >}}
-    Evaluate graph to its normal form.
-    {{< /gmachine_inner >}}
-{{< /gmachine >}}
-
-We store the current set of instructions and the current stack on the dump,
-and start with only Unwind and the value we want to evaluate.
-That does the job, but we're missing one thing - a way to return to
-the state we placed onto the dump. To do this, we add __another__
-rule to Unwind:
-
-{{< gmachine "Unwind-Return" >}}
-    {{< gmachine_inner "Before">}}
-    \( \text{Unwind} : i \quad a : s \quad \langle i', s'\rangle : d \quad h[a : \text{NInt} \; n] \quad m \)
-    {{< /gmachine_inner >}}
-    {{< gmachine_inner "After" >}}
-    \( i' \quad a : s' \quad d \quad h[a : \text{NInt} \; n] \quad m \)
-    {{< /gmachine_inner >}}
-    {{< gmachine_inner "Description" >}}
-    Return from Eval instruction.
-    {{< /gmachine_inner >}}
-{{< /gmachine >}}
-
-Just one more! Sometimes, it's possible for a tree node to reference itself.
-For instance, Haskell defines the
-[fixpoint combinator](https://en.wikipedia.org/wiki/Fixed-point_combinator)
-as follows:
-```Haskell
-fix f = let x = f x in x
-```
-
-In order to do this, an address that references a node must be present
-while the node is being constructed. We define an instruction,
-__Alloc__, which helps with that:
-
-{{< gmachine "Alloc" >}}
-    {{< gmachine_inner "Before">}}
-    \( \text{Alloc} \; n : i \quad s \quad d \quad h \quad m \)
-    {{< /gmachine_inner >}}
-    {{< gmachine_inner "After" >}}
-    \( i \quad s \quad d \quad h[a_k : \text{NInd} \; \text{null}] \quad m \)
-    {{< /gmachine_inner >}}
-    {{< gmachine_inner "Description" >}}
-    Allocate indirection nodes.
-    {{< /gmachine_inner >}}
-{{< /gmachine >}}
-
-We can allocate an indirection on the stack, and call Update on it when
-we've constructed a node. While we're constructing the tree, we can
-refer to the indirection when a self-reference is required.
-
-That's it for the instructions. Next up, we have to convert our expression
-trees into such instructions. However, this has already gotten pretty long,
-so we'll do it in the next post.
+Besides constructing graphs, the machine will also have operations that will aid
+in evaluating graphs.