Compare commits
	
		
			No commits in common. "216e9e89b45979a66cfe65702f5b6eb84e4fc07f" and "4d8d80670695be4f9c13f8ca5a8bae5553add5f1" have entirely different histories.
		
	
	
		
			216e9e89b4...4d8d806706
		
	
		
| @ -1,45 +0,0 @@ | |||||||
| $basic-border: 1px solid #bfbfbf; |  | ||||||
| 
 |  | ||||||
| .gmachine-instruction { |  | ||||||
|     display: flex; |  | ||||||
|     border: $basic-border; |  | ||||||
|     border-radius: 2px; |  | ||||||
| } |  | ||||||
| 
 |  | ||||||
| .gmachine-instruction-name { |  | ||||||
|     padding: 10px; |  | ||||||
|     border-right: $basic-border; |  | ||||||
|     flex-grow: 1; |  | ||||||
|     flex-basis: 20%; |  | ||||||
|     text-align: center; |  | ||||||
| } |  | ||||||
| 
 |  | ||||||
| .gmachine-instruction-sem { |  | ||||||
|     width: 100%; |  | ||||||
|     flex-grow: 4; |  | ||||||
| } |  | ||||||
| 
 |  | ||||||
| .gmachine-inner { |  | ||||||
|     border-bottom: $basic-border; |  | ||||||
|     width: 100%; |  | ||||||
| 
 |  | ||||||
|     &:last-child { |  | ||||||
|         border-bottom: none; |  | ||||||
|     } |  | ||||||
| } |  | ||||||
| 
 |  | ||||||
| .gmachine-inner-label { |  | ||||||
|     padding: 10px; |  | ||||||
|     font-weight: bold; |  | ||||||
| } |  | ||||||
| 
 |  | ||||||
| .gmachine-inner-text { |  | ||||||
|     padding: 10px; |  | ||||||
|     text-align: right; |  | ||||||
|     flex-grow: 1; |  | ||||||
| } |  | ||||||
| 
 |  | ||||||
| .gmachine-instruction-name, .gmachine-inner-label, .gmachine-inner { |  | ||||||
|     display: flex; |  | ||||||
|     align-items: center; |  | ||||||
| } |  | ||||||
| @ -4,7 +4,6 @@ date: 2019-08-06T14:26:38-07:00 | |||||||
| draft: true | draft: true | ||||||
| tags: ["C and C++", "Functional Languages", "Compilers"] | tags: ["C and C++", "Functional Languages", "Compilers"] | ||||||
| --- | --- | ||||||
| {{< gmachine_css >}} |  | ||||||
| We now have trees representing valid programs in our language, | We now have trees representing valid programs in our language, | ||||||
| and it's time to think about how to compile them into machine code, | and it's time to think about how to compile them into machine code, | ||||||
| to be executed on hardware. But __how should we execute programs__? | to be executed on hardware. But __how should we execute programs__? | ||||||
| @ -135,433 +134,12 @@ to apply a function, we'll follow the corresponding recipe for | |||||||
| that function, and end up with a new tree that we continue evaluating. | that function, and end up with a new tree that we continue evaluating. | ||||||
| 
 | 
 | ||||||
| ### G-machine | ### G-machine | ||||||
| "Instructions" is a very generic term. Specifically, we will be creating instructions | "Instructions" is a very generic term. We will be creating instructions | ||||||
| for a [G-machine](https://link.springer.com/chapter/10.1007/3-540-15975-4_50), | for a [G-machine](https://link.springer.com/chapter/10.1007/3-540-15975-4_50), | ||||||
| an abstract architecture which we will use to reduce our graphs. The G-machine | an abstract architecture which we will use to reduce our graphs. The G-machine | ||||||
| is stack-based - all operations push and pop items from a stack. The machine | is stack-based - all operations push and pop items from a stack. The machine | ||||||
| will also have a "dump", which is a stack of stacks; this will help with | will also have a "dump", which is a stack of stacks; this will help with | ||||||
| separating function calls. | separating function calls. | ||||||
| 
 | 
 | ||||||
| We will follow the same notation as Simon Peyton Jones in | Besides constructing graphs, the machine will also have operations that will aid | ||||||
| [his book](https://www.microsoft.com/en-us/research/wp-content/uploads/1992/01/student.pdf) | in evaluating graphs. | ||||||
| , which was my source of truth when implementing my compiler. The machine |  | ||||||
| will be executing instructions that we give it, and as such, it must have |  | ||||||
| an instruction queue, which we will reference as \\(i\\). We will write |  | ||||||
| \\(x:i\\) to mean "an instruction queue that starts with |  | ||||||
| an instruction x and ends with instructions \\(i\\)". A stack machine |  | ||||||
| obviously needs to have a stack - we will call it \\(s\\), and will |  | ||||||
| adopt a similar notation to the instruction queue: \\(a\_1, a\_2, a\_3 : s\\) |  | ||||||
| will mean "a stack with the top values \\(a\_1\\), \\(a\_2\\), and \\(a\_3\\), |  | ||||||
| and remaining stack \\(s\\)". |  | ||||||
| 
 |  | ||||||
| There's one more thing the G-machine will have that we've not yet discussed at all, |  | ||||||
| and it's needed because of the following quip earlier in the post: |  | ||||||
| 
 |  | ||||||
| > When we evaluate a tree, we can substitute it in-place with what it evaluates to.  |  | ||||||
| 
 |  | ||||||
| How can we substitute a value in place? Surely we won't iterate over the entire |  | ||||||
| tree and look for an occurrence of the tree we evaluated. Rather, wouldn't it be |  | ||||||
| nice if we could update all references to a tree to be something else? Indeed, |  | ||||||
| we can achieve this effect by using __pointers__. I don't mean specifically |  | ||||||
| C/C++ pointers - I mean the more general concept of "an address in memory". |  | ||||||
| The G-machine has a __heap__, much like the heap of a C/C++ process. We |  | ||||||
| can create a tree node on the heap, and then get an __address__ of the node. |  | ||||||
| We then have trees use these addresses to link their child nodes. |  | ||||||
| If we want to replace a tree node with its reduced form, we keep |  | ||||||
| its address the same, but change the value on the heap. |  | ||||||
| This way, all trees that reference the node we change become updated, |  | ||||||
| without us having to change them - their child address remains the same, |  | ||||||
| but the child has now been updated. We represent the heap |  | ||||||
| using \\(h\\). We write \\(h[a : v]\\) to say "the address \\(a\\) points |  | ||||||
| to value \\(v\\) in the heap \\(h\\)". Now you also know why we used |  | ||||||
| the letter \\(a\\) when describing values on the stack - the stack contains |  | ||||||
| addresses of (or pointers to) tree nodes. |  | ||||||
| 
 |  | ||||||
| _Compiling Functional Languages: a tutorial_ also keeps another component |  | ||||||
| of the G-machine, the __global map__, which maps function names to addresses of nodes |  | ||||||
| that represent them. We'll stick with this, and call this global map \\(m\\). |  | ||||||
| 
 |  | ||||||
| Finally, let's talk about what kind of nodes our trees will be made of. |  | ||||||
| We don't have to include every node that we've defined as a subclass of |  | ||||||
| `ast` - some nodes we can compile to instructions, without having to build |  | ||||||
| them. We will also include nodes that we didn't need to represent expressions. |  | ||||||
| Here's the list of node types we'll have: |  | ||||||
| 
 |  | ||||||
| * `NInt` - represents an integer. |  | ||||||
| * `NApp` - represents an application (has two children). |  | ||||||
| * `NGlobal` - represents a global function (like the `f` in `f x`). |  | ||||||
| * `NInd` - an "indirection" node that points to another node. This will help with "replacing" a node. |  | ||||||
| * `NData` - a "packed" node that will represent a constructor with all the arguments. |  | ||||||
| 
 |  | ||||||
| With these nodes in mind, let's try defining some instructions for the G-machine. |  | ||||||
| We start with instructions we'll use to assemble new versions of function body trees as we discussed above. |  | ||||||
| First up is __PushInt__: |  | ||||||
| 
 |  | ||||||
| {{< gmachine "PushInt" >}} |  | ||||||
|     {{< gmachine_inner "Before">}} |  | ||||||
|     \( \text{PushInt} \; n : i \quad s \quad h \quad m \) |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
|     {{< gmachine_inner "After" >}} |  | ||||||
|     \( i \quad a : s \quad h[a : \text{NInt} \; n] \quad m \) |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
|     {{< gmachine_inner "Description" >}} |  | ||||||
|     Push an integer \(n\) onto the stack. |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
| {{< /gmachine >}} |  | ||||||
| 
 |  | ||||||
| Let's go through this. We start with an instruction queue |  | ||||||
| with `PushInt n` on top. We allocate a new `NInt` with the |  | ||||||
| number `n` on the heap at address \\(a\\). We then push |  | ||||||
| the address of the `NInt` node on top of the stack. Next, |  | ||||||
| __PushGlobal__: |  | ||||||
| 
 |  | ||||||
| {{< gmachine "PushGlobal" >}} |  | ||||||
|     {{< gmachine_inner "Before">}} |  | ||||||
|     \( \text{PushGlobal} \; f : i \quad s \quad h \quad m[f : a] \) |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
|     {{< gmachine_inner "After" >}} |  | ||||||
|     \( i \quad a : s \quad h \quad m[f : a] \) |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
|     {{< gmachine_inner "Description" >}} |  | ||||||
|     Push a global function \(f\) onto the stack. |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
| {{< /gmachine >}} |  | ||||||
| 
 |  | ||||||
| We don't allocate anything new on the heap for this one -  |  | ||||||
| we already have a node for the global function. Next up, |  | ||||||
| __Push__: |  | ||||||
| 
 |  | ||||||
| {{< gmachine "Push" >}} |  | ||||||
|     {{< gmachine_inner "Before">}} |  | ||||||
|     \( \text{Push} \; n : i \quad a_0, a_1, ..., a_n : s \quad h \quad m \) |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
|     {{< gmachine_inner "After" >}} |  | ||||||
|     \( i \quad a_n, a_0, a_1, ..., a_n : s \quad h \quad m \) |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
|     {{< gmachine_inner "Description" >}} |  | ||||||
|     Push a value at offset \(n\) from the top of the stack onto the stack. |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
| {{< /gmachine >}} |  | ||||||
| 
 |  | ||||||
| We define this instruction to work if and only if there exists an address |  | ||||||
| on the stack at offset \\(n\\). We take the value at that offset, and |  | ||||||
| push it onto the stack again. This can be helpful for something like |  | ||||||
| `f x x`, where we use the same tree twice. Speaking of that - let's |  | ||||||
| define an instruction to combine two nodes into an application: |  | ||||||
| 
 |  | ||||||
| {{< gmachine "MkApp" >}} |  | ||||||
|     {{< gmachine_inner "Before">}} |  | ||||||
|     \( \text{MkApp} : i \quad a_0, a_1 : s \quad h \quad m \) |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
|     {{< gmachine_inner "After" >}} |  | ||||||
|     \( i \quad a : s \quad h[ a : \text{NApp} \; a_0 \; a_1] \quad m \) |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
|     {{< gmachine_inner "Description" >}} |  | ||||||
|     Apply a function at the top of the stack to a value after it. |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
| {{< /gmachine >}} |  | ||||||
| 
 |  | ||||||
| We pop two things off the stack: first, the thing we're applying, then |  | ||||||
| the thing we apply it to. We then create a new node on the heap |  | ||||||
| that is an `NApp` node, with its two children being the nodes we popped off. |  | ||||||
| Finally, we push it onto the stack. |  | ||||||
| 
 |  | ||||||
| Let's try using these instructions to get a feel for them. |  | ||||||
| {{< todo >}}Add an example, probably without notation.{{< /todo >}} |  | ||||||
| 
 |  | ||||||
| Having defined instructions to __build__ graphs, it's now time |  | ||||||
| to move on to instructions to __reduce__ graphs - after all, |  | ||||||
| we're performing graph reduction. A crucial instruction for the |  | ||||||
| G-machine is __Unwind__. What Unwind does depends on what |  | ||||||
| nodes are on the stack. Its name comes from how it behaves |  | ||||||
| when the top of the stack is an `NApp` node that is at |  | ||||||
| the top of a potentially long chain of applications: given |  | ||||||
| an application node, it pushes its left hand side onto the stack. |  | ||||||
| It then __continues to run Unwind__. This is effectively a while loop: |  | ||||||
| application nodes continue to be expanded this way until the left |  | ||||||
| hand side of an application is finally something |  | ||||||
| that __isn't__ an application. Let's write this rule as follows: |  | ||||||
| 
 |  | ||||||
| {{< gmachine "Unwind-App" >}} |  | ||||||
|     {{< gmachine_inner "Before">}} |  | ||||||
|     \( \text{Unwind} : i \quad a : s \quad h[a : \text{NApp} \; a_0 \; a_1] \quad m \) |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
|     {{< gmachine_inner "After" >}} |  | ||||||
|     \( \text{Unwind} : i \quad a_0, a : s \quad h[ a : \text{NApp} \; a_0 \; a_1] \quad m \) |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
|     {{< gmachine_inner "Description" >}} |  | ||||||
|     Unwind an application by pushing its left node. |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
| {{< /gmachine >}} |  | ||||||
| 
 |  | ||||||
| Let's talk about what happens when Unwind hits a node that isn't an application. Of all nodes |  | ||||||
| we have described, `NGlobal` seems to be the most likely to be on top of the stack after |  | ||||||
| an application chain has finished unwinding. In this case we want to run the instructions |  | ||||||
| for building the referenced global function. Naturally, these instructions |  | ||||||
| may reference the arguments of the application. We can find the first argument |  | ||||||
| by looking at offset 1 on the stack, which will be an `NApp` node, and then going |  | ||||||
| to its right child. The same can be done for the second and third arguments, if |  | ||||||
| they exist. But this doesn't feel right - we don't want to constantly be looking |  | ||||||
| at the right child of a node on the stack. Instead, we replace each application |  | ||||||
| node on the stack with its right child. Once that's done, we run the actual |  | ||||||
| code for the global function: |  | ||||||
| 
 |  | ||||||
| {{< gmachine "Unwind-Global" >}} |  | ||||||
|     {{< gmachine_inner "Before">}} |  | ||||||
|     \( \text{Unwind} : i \quad a, a_0, a_1, ..., a_n : s \quad h[\substack{a : \text{NGlobal} \; n \; c \\ a_k : \text{NApp} \; a_{k-1} \; a_k'}] \quad m \) |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
|     {{< gmachine_inner "After" >}} |  | ||||||
|     \( c \quad a_0', a_1', ..., a_n', a_n : s \quad h[\substack{a : \text{NGlobal} \; n \; c \\ a_k : \text{NApp} \; a_{k-1} \; a_k'}] \quad m \) |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
|     {{< gmachine_inner "Description" >}} |  | ||||||
|     Call a global function. |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
| {{< /gmachine >}} |  | ||||||
| 
 |  | ||||||
| In this rule, we used a general rule for \\(a\_k\\), in which \\(k\\) is any number |  | ||||||
| between 0 and \\(n\\). We also expect the `NGlobal` node to contain two parameters, |  | ||||||
| \\(n\\) and \\(c\\). \\(n\\) is the arity of the function (the number of arguments |  | ||||||
| it expects), and \\(c\\) are the instructions to construct the function's tree. |  | ||||||
| 
 |  | ||||||
| The attentive reader will have noticed a catch: we kept \\(a\_n\\) on the stack! |  | ||||||
| This once again goes back to replacing a node in-place. \\(a\_n\\) is the address of the "root" of the |  | ||||||
| whole expression we're simplifying. Thus, to replace the value at this address, we need to keep |  | ||||||
| the address until we have something to replace it with. |  | ||||||
| 
 |  | ||||||
| There's one more thing that can be found at the leftmost end of a tree of applications: `NInd`. |  | ||||||
| We simply replace `NInd` with the node it points to, and resume Unwind: |  | ||||||
| 
 |  | ||||||
| {{< gmachine "Unwind-Ind" >}} |  | ||||||
|     {{< gmachine_inner "Before">}} |  | ||||||
|     \( \text{Unwind} : i \quad a : s \quad h[a : \text{NInd} \; a' ] \quad m \) |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
|     {{< gmachine_inner "After" >}} |  | ||||||
|     \( \text{Unwind} : i \quad a' : s \quad h[a : \text{NInd} \; a' ] \quad m \) |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
|     {{< gmachine_inner "Description" >}} |  | ||||||
|     Replace indirection node with its target. |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
| {{< /gmachine >}} |  | ||||||
| 
 |  | ||||||
| We've talked about replacing a node, and we've talked about indirection, but we |  | ||||||
| haven't yet defined an instruction to perform these actions. Let's do so now: |  | ||||||
| 
 |  | ||||||
| {{< gmachine "Update" >}} |  | ||||||
|     {{< gmachine_inner "Before">}} |  | ||||||
|     \( \text{Update} \; n : i \quad a,a_0,a_1,...a_n : s \quad h \quad m \) |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
|     {{< gmachine_inner "After" >}} |  | ||||||
|     \( i \quad a_0,a_1,...,a_n : s \quad h[a_n : \text{NInd} \; a ] \quad m \) |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
|     {{< gmachine_inner "Description" >}} |  | ||||||
|     Transform node at offset into an indirection. |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
| {{< /gmachine >}} |  | ||||||
| 
 |  | ||||||
| This instruction pops an address from the top of the stack, and replaces |  | ||||||
| a node at the given offset with an indirection to the popped node. After |  | ||||||
| we evaluate a function call, we will use `Update` to make sure it's |  | ||||||
| not evaluated again. |  | ||||||
| 
 |  | ||||||
| Now, let's talk about data structures. We have mentioned an `NData` node, |  | ||||||
| but we've given no explanation of how it will work. Obviously, we need |  | ||||||
| to distinguish values of a type created by different constructors: |  | ||||||
| If we have a value of type `List`, it could have been created either |  | ||||||
| using `Nil` or `Cons`. Depending on which constructor was used to |  | ||||||
| create a value of a type, we might treat it differently. Furthermore, |  | ||||||
| it's not always possible to know what constructor was used to |  | ||||||
| create what value at compile time. So, we need a way to know, |  | ||||||
| at runtime, how the value was constructed. We do this using |  | ||||||
| a __tag__. A tag is an integer value that will be contained in |  | ||||||
| the `NData` node. We assign a tag number to each constructor, |  | ||||||
| and when we create a node with that constructor, we set |  | ||||||
| the node's tag accordingly. This way, we can easily |  | ||||||
| tell if a `List` value is a `Nil` or a `Cons`, or |  | ||||||
| if a `Tree` value is a `Node` or a `Leaf`. |  | ||||||
| 
 |  | ||||||
| To operate on `NData` nodes, we will need two primitive operations: __Pack__ and __Split__. |  | ||||||
| Pack will create an `NData` node with a tag from some number of nodes |  | ||||||
| on the stack. These nodes will be placed into a dynamically |  | ||||||
| allocated array: |  | ||||||
| 
 |  | ||||||
| {{< gmachine "Pack" >}} |  | ||||||
|     {{< gmachine_inner "Before">}} |  | ||||||
|     \( \text{Pack} \; t \; n : i \quad a_1,a_2,...a_n : s \quad h \quad m \) |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
|     {{< gmachine_inner "After" >}} |  | ||||||
|     \( i \quad a : s \quad h[a : \text{NData} \; t \; [a_1, a_2, ..., a_n] ] \quad m \) |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
|     {{< gmachine_inner "Description" >}} |  | ||||||
|     Pack \(n\) nodes from the stack into a node with tag \(t\). |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
| {{< /gmachine >}} |  | ||||||
| 
 |  | ||||||
| Split will do the opposite, by popping |  | ||||||
| off an `NData` node and moving the contents of its |  | ||||||
| array onto the stack: |  | ||||||
| 
 |  | ||||||
| {{< gmachine "Split" >}} |  | ||||||
|     {{< gmachine_inner "Before">}} |  | ||||||
|     \( \text{Split} : i \quad a : s \quad h[a : \text{NData} \; t \; [a_1, a_2, ..., a_n] ] \quad m \) |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
|     {{< gmachine_inner "After" >}} |  | ||||||
|     \( i \quad a_1, a_2, ...,a_n : s \quad h[a : \text{NData} \; t \; [a_1, a_2, ..., a_n] ] \quad m \) |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
|     {{< gmachine_inner "Description" >}} |  | ||||||
|     Unpack a data node on top of the stack. |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
| {{< /gmachine >}} |  | ||||||
| 
 |  | ||||||
| These two instructions are a good start, but we're missing something |  | ||||||
| fairly big: case analysis. After we've constructed a data type, |  | ||||||
| to perform operations on it, we want to figure out what |  | ||||||
| constructor and values were used to create it. In order |  | ||||||
| to implement patterns and case expressions, we'll need another |  | ||||||
| instruction that's capable of making a decision based on |  | ||||||
| the tag of an `NData` node. We'll call this instruction __Jump__, |  | ||||||
| and define it to contain a mapping from tags to instructions |  | ||||||
| to be executed for a value of that tag. For instance, |  | ||||||
| if the constructor `Nil` has tag 0, and `Cons` has tag 1, |  | ||||||
| the mapping for the case expression of a length function |  | ||||||
| could be written as \\([0 \\rightarrow [\\text{PushInt} \; 0], 1 \\rightarrow [\\text{PushGlobal} \; \\text{length}, ...] ]\\). |  | ||||||
| Let's define the rule for it: |  | ||||||
| 
 |  | ||||||
| {{< gmachine "Jump" >}} |  | ||||||
|     {{< gmachine_inner "Before">}} |  | ||||||
|     \( \text{Jump} [..., t \rightarrow i_t, ...] : i \quad a : s \quad h[a : \text{NData} \; t \; as ] \quad m \) |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
|     {{< gmachine_inner "After" >}} |  | ||||||
|     \( i_t, i \quad a : s \quad h[a : \text{NData} \; t \; as ] \quad m \) |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
|     {{< gmachine_inner "Description" >}} |  | ||||||
|     Execute instructions corresponding to a tag. |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
| {{< /gmachine >}} |  | ||||||
| 
 |  | ||||||
| Alright, we've made it through the interesting instructions, |  | ||||||
| but there are still a few that are needed, if less shiny and cool. |  | ||||||
| For instance: imagine we've made a function call. As per the |  | ||||||
| rules for Unwind, we've placed the right hand sides of all applications |  | ||||||
| on the stack, and run the instructions provided by the function, |  | ||||||
| creating a final graph. We then continue to reduce this final |  | ||||||
| graph. But we've left the function parameters on the stack! |  | ||||||
| This is untidy. We define a __Slide__ instruction, |  | ||||||
| which keeps the address at the top of the stack, but gets |  | ||||||
| rid of the next \\(n\\) addresses: |  | ||||||
| 
 |  | ||||||
| {{< gmachine "Slide" >}} |  | ||||||
|     {{< gmachine_inner "Before">}} |  | ||||||
|     \( \text{Slide} \; n : i \quad a_0, a_1, ..., a_n : s \quad h \quad m \) |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
|     {{< gmachine_inner "After" >}} |  | ||||||
|     \( i \quad a_0 : s \quad h \quad m \) |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
|     {{< gmachine_inner "Description" >}} |  | ||||||
|     Remove \(n\) addresses after the top from the stack. |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
| {{< /gmachine >}} |  | ||||||
| 
 |  | ||||||
| Just a few more. Next up, we observe that we have not |  | ||||||
| defined any way for our G-machine to perform arithmetic, |  | ||||||
| or indeed, any primitive operations. Since we've |  | ||||||
| not defined any built-in type for booleans, |  | ||||||
| let's avoid talking about operations like `<`, `==`, |  | ||||||
| and so on (in fact, we've omitted them from the grammar so far). |  | ||||||
| So instead, let's talk about the [closed](https://en.wikipedia.org/wiki/Closure_(mathematics)) operations, |  | ||||||
| namely `+`, `-`, `*`, and `/`. We'll define a special instruction for |  | ||||||
| them, called __BinOp__: |  | ||||||
| 
 |  | ||||||
| {{< gmachine "BinOp" >}} |  | ||||||
|     {{< gmachine_inner "Before">}} |  | ||||||
|     \( \text{BinOp} \; \text{op} : i \quad a_0, a_1 : s \quad h[\substack{a_0 : \text{NInt} \; n \\ a_1 : \text{NInt} \; m}] \quad m \) |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
|     {{< gmachine_inner "After" >}} |  | ||||||
|     \( i \quad a : s \quad h[\substack{a_0 : \text{NInt} \; n \\ a_1 : \text{NInt} \; m \\ a : \text{NInt} \; (\text{op} \; n \; m)}] \quad m \) |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
|     {{< gmachine_inner "Description" >}} |  | ||||||
|     Apply a binary operator on integers. |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
| {{< /gmachine >}} |  | ||||||
| 
 |  | ||||||
| Nothing should be particularly surprising here: |  | ||||||
| the instruction pops two integers off the stack, applies the given |  | ||||||
| binary operation to them, and places the result on the stack. |  | ||||||
| 
 |  | ||||||
| We're not yet done with primitive operations, though. |  | ||||||
| We have a lazy graph reduction machine, which means |  | ||||||
| something like the expression `3*(2+6)` might not |  | ||||||
| be a binary operator applied to two `NInt` nodes. |  | ||||||
| We keep around graphs until they __really__ need to |  | ||||||
| be reduced. So now we need an instruction to trigger |  | ||||||
| reducing a graph, to say, "we need this value now". |  | ||||||
| We call this instruction __Eval__. This is where |  | ||||||
| the dump finally comes in! |  | ||||||
| 
 |  | ||||||
| {{< todo >}}Actually show the dump in the previous evaluation rules.{{< /todo >}} |  | ||||||
| 
 |  | ||||||
| When we execute Eval, another graph becomes our "focus", and we switch |  | ||||||
| to a new stack. We obviously want to return from this once we've finished |  | ||||||
| evaluating what we "focused" on, so we must store the program state somewhere - |  | ||||||
| on the dump. Here's the rule: |  | ||||||
| 
 |  | ||||||
| {{< gmachine "Eval" >}} |  | ||||||
|     {{< gmachine_inner "Before">}} |  | ||||||
|     \( \text{Eval} : i \quad a : s \quad d \quad h \quad m \) |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
|     {{< gmachine_inner "After" >}} |  | ||||||
|     \( [\text{Unwind}] \quad [a] \quad \langle i, s\rangle : d \quad h \quad m \) |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
|     {{< gmachine_inner "Description" >}} |  | ||||||
|     Evaluate graph to its normal form. |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
| {{< /gmachine >}} |  | ||||||
| 
 |  | ||||||
| We store the current set of instructions and the current stack on the dump, |  | ||||||
| and start with only Unwind and the value we want to evaluate. |  | ||||||
| That does the job, but we're missing one thing - a way to return to |  | ||||||
| the state we placed onto the dump. To do this, we add __another__ |  | ||||||
| rule to Unwind: |  | ||||||
| 
 |  | ||||||
| {{< gmachine "Unwind-Return" >}} |  | ||||||
|     {{< gmachine_inner "Before">}} |  | ||||||
|     \( \text{Unwind} : i \quad a : s \quad \langle i', s'\rangle : d \quad h[a : \text{NInt} \; n] \quad m \) |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
|     {{< gmachine_inner "After" >}} |  | ||||||
|     \( i' \quad a : s' \quad d \quad h[a : \text{NInt} \; n] \quad m \) |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
|     {{< gmachine_inner "Description" >}} |  | ||||||
|     Return from Eval instruction. |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
| {{< /gmachine >}} |  | ||||||
| 
 |  | ||||||
| Just one more! Sometimes, it's possible for a tree node to reference itself. |  | ||||||
| For instance, Haskell defines the |  | ||||||
| [fixpoint combinator](https://en.wikipedia.org/wiki/Fixed-point_combinator) |  | ||||||
| as follows: |  | ||||||
| ```Haskell |  | ||||||
| fix f = let x = f x in x |  | ||||||
| ``` |  | ||||||
| 
 |  | ||||||
| In order to do this, an address that references a node must be present |  | ||||||
| while the node is being constructed. We define an instruction, |  | ||||||
| __Alloc__, which helps with that: |  | ||||||
| 
 |  | ||||||
| {{< gmachine "Alloc" >}} |  | ||||||
|     {{< gmachine_inner "Before">}} |  | ||||||
|     \( \text{Alloc} \; n : i \quad s \quad d \quad h \quad m \) |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
|     {{< gmachine_inner "After" >}} |  | ||||||
|     \( i \quad a_1, a_2, ..., a_n : s \quad d \quad h[a_k : \text{NInd} \; \text{null}] \quad m \) |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
|     {{< gmachine_inner "Description" >}} |  | ||||||
|     Allocate indirection nodes. |  | ||||||
|     {{< /gmachine_inner >}} |  | ||||||
| {{< /gmachine >}} |  | ||||||
| 
 |  | ||||||
| We can allocate an indirection on the stack, and call Update on it when |  | ||||||
| we've constructed a node. While we're constructing the tree, we can |  | ||||||
| refer to the indirection when a self-reference is required. |  | ||||||
| 
 |  | ||||||
| That's it for the instructions. Next up, we have to convert our expression |  | ||||||
| trees into such instructions. However, this has already gotten pretty long, |  | ||||||
| so we'll do it in the next post. |  | ||||||
|  | |||||||
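The transition rules removed in this diff can be read as a small interpreter. The following is a minimal Python sketch of that reading, not the post's C++ implementation. Several things in it are assumptions made for illustration: nodes are encoded as plain tuples, the names `run` and `program` are hypothetical, a global of arity n consumes exactly n application nodes (the Unwind-Global rule indexes them from 0 to n), and function bodies end with `Slide`, `Update 0`, `Unwind`, which is one ordering that works with the instructions as defined. `NData`, Pack, Split, Jump and Alloc are left out to keep it short.

```python
import operator

def run(program, entry="main"):
    """Run a toy G-machine. `program` maps name -> (arity, instructions)."""
    heap = []  # the heap h; an address is an index into this list

    def alloc(node):
        heap.append(node)
        return len(heap) - 1

    # The global map m: function name -> address of its NGlobal node.
    m = {name: alloc(("NGlobal", arity, body))
         for name, (arity, body) in program.items()}
    # The top of the stack is the END of the list; the dump holds (code, stack).
    code, stack, dump = [("PushGlobal", entry), ("Unwind",)], [], []

    while True:
        instr, code = code[0], code[1:]
        op = instr[0]
        if op == "PushInt":
            stack.append(alloc(("NInt", instr[1])))
        elif op == "PushGlobal":
            stack.append(m[instr[1]])
        elif op == "Push":                      # copy the address at offset n
            stack.append(stack[-1 - instr[1]])
        elif op == "MkApp":                     # top is the function, next its argument
            a0, a1 = stack.pop(), stack.pop()
            stack.append(alloc(("NApp", a0, a1)))
        elif op == "Update":                    # overwrite node at offset n with NInd
            a = stack.pop()
            heap[stack[-1 - instr[1]]] = ("NInd", a)
        elif op == "Slide":                     # keep the top, drop the next n
            top = stack.pop()
            del stack[len(stack) - instr[1]:]
            stack.append(top)
        elif op == "BinOp":
            a0, a1 = stack.pop(), stack.pop()
            stack.append(alloc(("NInt", instr[1](heap[a0][1], heap[a1][1]))))
        elif op == "Eval":                      # save state on the dump, focus on a
            a = stack.pop()
            dump.append((code, stack))
            code, stack = [("Unwind",)], [a]
        elif op == "Unwind":
            node = heap[stack[-1]]
            if node[0] == "NApp":               # Unwind-App: push the left child
                stack.append(node[1])
                code = [instr] + code
            elif node[0] == "NInd":             # Unwind-Ind: follow the indirection
                stack[-1] = node[1]
                code = [instr] + code
            elif node[0] == "NGlobal":          # Unwind-Global: expose the arguments
                _, arity, body = node
                start = len(stack) - 1 - arity
                assert start >= 0, "too few arguments on the stack"
                apps = stack[start:len(stack) - 1]   # [app_n, ..., app_1]
                root = stack[start]                  # node to update later
                stack[start:] = [root] + [heap[a][2] for a in apps]
                code = list(body)
            elif dump:                          # Unwind-Return: NInt, restore state
                code, saved = dump.pop()
                stack = saved + [stack[-1]]
            else:                               # NInt with an empty dump: finished
                return heap[stack[-1]][1], heap

# double x = x + x;  main = double (double 5)
program = {
    "main": (0, [("PushInt", 5), ("PushGlobal", "double"), ("MkApp",),
                 ("PushGlobal", "double"), ("MkApp",),
                 ("Update", 0), ("Unwind",)]),
    "double": (1, [("Push", 0), ("Eval",), ("Push", 1), ("Eval",),
                   ("BinOp", operator.add),
                   ("Slide", 1), ("Update", 0), ("Unwind",)]),
}
result, heap = run(program)
print(result)  # 20
```

In this run the inner application `double 5` is reduced once: after the first `Eval` its heap cell becomes an `NInd` pointing at the `NInt 10` node, so the second `Eval` of the same argument only follows the indirection. That is the in-place replacement the removed text describes.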