---
title: Compiling a Functional Language Using C++, Part 7 - Runtime
date: 2019-08-06T14:26:38-07:00
draft: true
tags: ["C and C++", "Functional Languages", "Compilers"]
---
Wikipedia has the following definition for a __runtime__:

> A [runtime] primarily implements portions of an execution model.

We know what our execution model is! We talked about it in Part 5 - it's the
lazy graph reduction we've been discussing. Creating and manipulating
graph nodes is slightly above hardware level, and all programs in our
functional language will rely on such manipulation (it's how they run!). Furthermore,
most G-machine instructions are also above hardware level (especially Unwind!).

Push, Slide, and the other instructions are fairly complex, and
most computers aren't stack machines. We'll have to implement
our own stack, and whenever a graph-building function wants to modify
the stack, it will have to call library routines from our stack implementation:

```C
void stack_push(struct stack* s, struct node_s* n);
struct node_s* stack_slide(struct stack* s, size_t c);
/* other stack operations */
```

We also observe that Unwind does a lot of the heavy lifting in our
G-machine definition. After we build the graph,
Unwind is what picks it apart and performs function calls. Moreover,
Unwind pushes Unwind back on the stack: once you've hit it,
you keep unwinding until you reach a function call. This
effectively means we can implement Unwind as a loop:

```C
while(1) {
    // Check for Unwind's first rule
    // Check for Unwind's second rule
    // ...
}
```

In this implementation, Unwind is in charge. We won't need to insert
the Unwind operations at the end of our generated functions, and you
may have noticed we've already been following this strategy in our
implementation of the G-machine compilation.

We can start working on an implementation of the runtime right now,
beginning with the nodes:

{{< codelines "C++" "compiler/07/runtime.c" 5 51 >}}

We have a variety of different nodes that can be on the stack, but without
the magic of C++'s `vtable` and RTTI, we have to take care of the bookkeeping
ourselves. We add an enum, `node_tag`, which we will use to indicate what
type of node we're looking at. We also add a "base class" `node_base`, which
contains the fields that all nodes must contain (only `tag` at the moment).
We then add to the beginning of each node struct a member of type
`node_base`. With this, a pointer to a node struct can be interpreted as a pointer
to `node_base`, which is our lowest common denominator. To go back, we
check the `tag` of `node_base`, and cast the pointer appropriately. This way,
we mimic inheritance, in a very basic manner.

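To make the "check the tag, then cast" step concrete, here is a minimal sketch of the pattern. Only `node_tag`, `node_base`, `tag` and `node_app` come from the text above; the tag values, the `node_num` struct, the field names, and the `num_value` and `demo_downcast` helpers are assumptions made up for illustration. The trick relies on C's guarantee that a pointer to a struct can be converted to a pointer to its first member, which is why `node_base` has to come first.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tag values; the real enum may list different node kinds. */
enum node_tag { NODE_APP, NODE_NUM };

/* Fields shared by every node (only the tag, for now). */
struct node_base {
    enum node_tag tag;
};

/* Assumed layout: an application of one node to another. */
struct node_app {
    struct node_base base; /* must be the first member */
    struct node_base* left;
    struct node_base* right;
};

/* Assumed layout: a number node. */
struct node_num {
    struct node_base base; /* must be the first member */
    int32_t value;
};

/* "Downcast": check the tag, then reinterpret the pointer. */
int32_t num_value(struct node_base* n) {
    assert(n->tag == NODE_NUM);
    return ((struct node_num*) n)->value;
}

/* Hypothetical driver: "upcast" a number node, then read it back. */
int32_t demo_downcast(void) {
    struct node_num num = { { NODE_NUM }, 42 };
    struct node_base* base = (struct node_base*) &num;
    return num_value(base);
}
```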
We also add an `alloc_node`, which allocates a region of memory big enough
to be any node. We do this because we sometimes mutate nodes (replacing
expressions with the results of their evaluation), changing their type.
We then want to be able to change a node without reallocating memory.
Since the biggest node we have is `node_app`, that's the one we choose.

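Here is a rough sketch of why the uniform allocation size matters. It assumes an indirection node called `node_ind` with a `next` field, plus hypothetical helpers `make_ind` and `demo_mutate`; none of these names are confirmed by the text above, and the body of `alloc_node` is a guess based on its description. Because every node gets `sizeof(struct node_app)` bytes, an application node can be overwritten in place with a smaller node, and every existing pointer to it stays valid.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical tags and layouts, for illustration only. */
enum node_tag { NODE_APP, NODE_NUM, NODE_IND };

struct node_base { enum node_tag tag; };
struct node_app { struct node_base base; struct node_base* left; struct node_base* right; };
struct node_num { struct node_base base; int32_t value; };
struct node_ind { struct node_base base; struct node_base* next; };

/* Allocate enough room for the largest node, so any node fits. */
struct node_base* alloc_node(void) {
    struct node_base* n = malloc(sizeof(struct node_app));
    assert(n != NULL);
    return n;
}

/* Hypothetical helper: overwrite a node in place with an indirection. */
void make_ind(struct node_base* n, struct node_base* target) {
    struct node_ind* ind = (struct node_ind*) n;
    ind->base.tag = NODE_IND;
    ind->next = target;
}

/* Hypothetical driver: turn an application into an indirection to a number. */
int32_t demo_mutate(void) {
    struct node_num* result = (struct node_num*) alloc_node();
    result->base.tag = NODE_NUM;
    result->value = 7;

    struct node_app* app = (struct node_app*) alloc_node();
    app->base.tag = NODE_APP;
    app->left = NULL;
    app->right = NULL;

    /* Some other part of the graph still points at the application. */
    struct node_base* shared = (struct node_base*) app;
    make_ind(shared, (struct node_base*) result);

    /* The old pointer now leads to the evaluated result. */
    assert(shared->tag == NODE_IND);
    struct node_base* next = ((struct node_ind*) shared)->next;
    int32_t value = ((struct node_num*) next)->value;

    free(result);
    free(app);
    return value;
}
```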
We now move on to implement some stack operations. Let's list them off:

* `stack_init` and `stack_free` - one allocates memory for the stack,
the other releases it.
* `stack_push`, `stack_pop` and `stack_peek` - the classic stack operations.
`stack_peek` takes an offset, so we can peek relative to the top of the stack.
* `stack_popn` - pops off some number of nodes instead of one.
* `stack_slide` - the slide we specified in the semantics. Keeps the top, deletes the
next several nodes.
* `stack_update` - turns the node at the offset into an indirection to the result,
which we will use for lazy evaluation (replacing expressions with their reduced forms).
* `stack_alloc` - allocates indirection nodes on the stack. We will use this later.

Here's the implementation:
{{< codelines "C++" "compiler/07/runtime.c" 53 113 >}}

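Of these, slide is probably the least familiar, so here is a small standalone sketch of its behavior. It is not the code included above: the fixed-capacity array, the stub `node_base`, the `void` return type of `stack_slide`, and the `demo_slide` driver are all assumptions chosen to keep the example short. The idea is to copy the top pointer down over the nodes being discarded, then shrink the stack.

```c
#include <assert.h>
#include <stddef.h>

/* Stub node type; only pointers to it are stored on the stack. */
struct node_base { int tag; };

/* Assumed layout: a fixed-capacity array. A real stack may grow dynamically. */
#define STACK_CAPACITY 16
struct stack {
    size_t count;
    struct node_base* data[STACK_CAPACITY];
};

void stack_push(struct stack* s, struct node_base* n) {
    assert(s->count < STACK_CAPACITY);
    s->data[s->count++] = n;
}

/* Offset 0 is the top of the stack, 1 is the node beneath it, and so on. */
struct node_base* stack_peek(struct stack* s, size_t o) {
    assert(o < s->count);
    return s->data[s->count - o - 1];
}

/* Keep the top node, discard the n nodes directly beneath it. */
void stack_slide(struct stack* s, size_t n) {
    assert(s->count > n);
    s->data[s->count - n - 1] = s->data[s->count - 1];
    s->count -= n;
}

/* Hypothetical driver: push four nodes, then slide two out from under the top. */
size_t demo_slide(void) {
    struct node_base nodes[4] = { {0}, {1}, {2}, {3} };
    struct stack s = { 0, { NULL } };
    for (size_t i = 0; i < 4; i++) stack_push(&s, &nodes[i]);

    stack_slide(&s, 2); /* drops nodes[2] and nodes[1] */

    assert(stack_peek(&s, 0) == &nodes[3]); /* the old top is still on top */
    assert(stack_peek(&s, 1) == &nodes[0]); /* the bottom node is untouched */
    return s.count;
}
```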
Let's now talk about how this will connect to the code we generate. For
a quick example, consider the `node_global` struct that we have declared above.
It has a member `function`, which is a __function pointer__ to a function
that takes a stack and returns void.

When we finally generate machine code for each of the functions
we have in our program, it will be made up of sequences of G-machine
operations expressed using assembly instructions. These instructions will still
have to manipulate the G-machine stack (they still represent G-machine operations!),
and thus, the resulting assembly subroutine will take a stack as a parameter. It will
then construct the function's graph on that stack, as we've already seen. Thus,
we express a compiled top-level function as a subroutine that takes a stack,
and returns void. A global node holds the pointer to the function that it will call.

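As a rough illustration of this calling convention, the sketch below writes a stand-in for a compiled function by hand and calls it through a global node. Only `node_global` and its `function` member come from the text; the `arity` field, the stack layout, `f_main`, and `demo_call_global` are hypothetical. In the real compiler the body of `f_main` would be generated code, not handwritten C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tags and layouts, for illustration only. */
enum node_tag { NODE_NUM, NODE_GLOBAL };
struct node_base { enum node_tag tag; };
struct node_num { struct node_base base; int32_t value; };

/* Assumed stack layout: a small fixed-capacity array. */
struct stack { size_t count; struct node_base* data[8]; };

struct node_global {
    struct node_base base;
    int32_t arity;                   /* assumed field: number of parameters */
    void (*function)(struct stack*); /* takes a stack, returns void */
};

static struct node_num forty_two = { { NODE_NUM }, 42 };

/* Stand-in for a compiled top-level function: it builds its (tiny)
 * graph on the stack it is given. */
void f_main(struct stack* s) {
    assert(s->count < 8);
    s->data[s->count++] = (struct node_base*) &forty_two;
}

/* Hypothetical driver: call the function through the global node. */
int32_t demo_call_global(void) {
    struct node_global global = { { NODE_GLOBAL }, 0, f_main };
    struct stack s = { 0, { NULL } };

    global.function(&s); /* the callee leaves its graph on the stack */

    struct node_base* top = s.data[s.count - 1];
    assert(top->tag == NODE_NUM);
    return ((struct node_num*) top)->value;
}
```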
When our program starts, it will assume that there exists a top-level
function `main` that takes 0 parameters. It will take that function, call it
to produce the initial graph, and then let the unwind loop take care of the evaluation.

Thus, our program will initially look like this:
{{< codelines "C++" "compiler/07/runtime.c" 117 125 >}}