---
title: Compiling a Functional Language Using C++, Part 7 - Runtime
date: 2019-08-06T14:26:38-07:00
draft: true
tags: ["C and C++", "Functional Languages", "Compilers"]
---

Wikipedia has the following definition for a runtime:

> A [runtime] primarily implements portions of an execution model.

We know what our execution model is! We talked about it in Part 5: it's lazy graph reduction. Creating and manipulating graph nodes is slightly above hardware level, and every program in our functional language relies on such manipulation (it's how they run!). Most G-machine instructions are also above hardware level (especially Unwind!), so the code that implements them belongs in the runtime, too.

Push, Slide, and the other stack instructions are fairly complex, and most computers aren't stack machines. We'll have to implement our own stack, and whenever a graph-building function wants to modify the stack, it will have to call library routines from our stack implementation:

```C
void stack_push(struct stack* s, struct node_s* n);
struct node_s* stack_slide(struct stack* s, size_t c);
/* other stack operations */
```
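To make these routines a little more concrete, here is a minimal sketch of one way they could be implemented. It keeps the two signatures from above, but everything else is an assumption made for illustration: the layout of `struct stack` (a growable array), the helpers `stack_init` and `stack_free`, and the choice to have `stack_slide` return the node that stays on top.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct node_s; /* opaque here; the runtime defines the real node types */

/* Hypothetical layout: a growable array of node pointers. */
struct stack {
    size_t capacity;
    size_t count;
    struct node_s** data;
};

/* Hypothetical helper: start with a small, empty array. */
void stack_init(struct stack* s) {
    s->capacity = 4;
    s->count = 0;
    s->data = malloc(sizeof(*s->data) * s->capacity);
    assert(s->data != NULL);
}

/* Hypothetical helper: release the array (the nodes are not owned by the stack). */
void stack_free(struct stack* s) {
    free(s->data);
    s->data = NULL;
    s->count = 0;
}

void stack_push(struct stack* s, struct node_s* n) {
    if (s->count == s->capacity) {
        /* Out of room: double the array. */
        s->capacity *= 2;
        s->data = realloc(s->data, sizeof(*s->data) * s->capacity);
        assert(s->data != NULL);
    }
    s->data[s->count++] = n;
}

/* Slide c: keep the top node and drop the c nodes directly beneath it.
   Returning the surviving top node is an assumption of this sketch. */
struct node_s* stack_slide(struct stack* s, size_t c) {
    assert(s->count > c);
    struct node_s* top = s->data[s->count - 1];
    s->data[s->count - c - 1] = top;
    s->count -= c;
    return top;
}
```

A growable array keeps both operations cheap: a push is a single store in the common case, and a slide is one store plus a change to `count`.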

We also observe that Unwind does a lot of the heavy lifting in our G-machine definition. After we build the graph, Unwind is what picks it apart and performs function calls. What's more, Unwind pushes Unwind back onto the stack: once you've hit it, you keep unwinding until you reach a function call. This effectively means we can implement Unwind as a loop:

```C
while(1) {
    // Check for Unwind's first rule
    // Check for Unwind's second rule
    // ...
}
```

In this implementation, Unwind is in charge: the loop lives in the runtime and calls our generated functions. We won't need to insert Unwind instructions at the end of those functions, and you may have noticed we've already been following this strategy in our implementation of the G-machine compilation.
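Here is a rough sketch of what such a loop might look like once filled in. It assumes the usual G-machine rules for Unwind (follow applications to the left, call a global once its arguments are on the stack, look through indirections, stop at a number), so treat it as an illustration rather than this series' final code. The node layouts, the fixed-size `struct stack`, the `push` helper, and the `demo_id` function (a hand-written stand-in for a compiled `id x = x`) are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum node_tag { NODE_APP, NODE_NUM, NODE_GLOBAL, NODE_IND };
struct node_base { enum node_tag tag; };

/* Fixed-size stand-in for the runtime stack (an assumption of this sketch). */
struct stack {
    size_t count;
    struct node_base* data[64];
};

struct node_app    { struct node_base base; struct node_base* left; struct node_base* right; };
struct node_num    { struct node_base base; int value; };
struct node_global { struct node_base base; size_t arity; void (*function)(struct stack*); };
struct node_ind    { struct node_base base; struct node_base* next; };

static void push(struct stack* s, struct node_base* n) {
    assert(s->count < 64);
    s->data[s->count++] = n;
}

void unwind(struct stack* s) {
    while (1) {
        struct node_base* top = s->data[s->count - 1];
        if (top->tag == NODE_APP) {
            /* Application: descend into the function position. */
            push(s, ((struct node_app*) top)->left);
        } else if (top->tag == NODE_GLOBAL) {
            /* Global: replace the function and the applications below it
               with the arguments (the root application stays underneath),
               then call the compiled code. */
            struct node_global* g = (struct node_global*) top;
            assert(s->count > g->arity);
            for (size_t i = 1; i <= g->arity; i++) {
                struct node_app* app = (struct node_app*) s->data[s->count - i - 1];
                s->data[s->count - i] = app->right;
            }
            g->function(s);
        } else if (top->tag == NODE_IND) {
            /* Indirection: look through it. */
            s->data[s->count - 1] = ((struct node_ind*) top)->next;
        } else {
            break; /* a number: nothing left to reduce */
        }
    }
}

/* Hypothetical stand-in for compiled code: id x = x.
   It pops its argument and leaves the result where the root was;
   a real compiled function would also update the root node. */
static void demo_id(struct stack* s) {
    struct node_base* arg = s->data[s->count - 1];
    s->count -= 1;
    s->data[s->count - 1] = arg;
}
```

Note that the called function returns into the loop, so control comes back to Unwind without the function having to ask for it.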

We can start working on an implementation of the runtime right now, beginning with the nodes:

{{< codelines "C++" "compiler/07/runtime.c" 5 46 >}}

We have a variety of different nodes that can be on the stack, but without the magic of C++'s vtables and RTTI, we have to take care of the bookkeeping ourselves. We add an enum, `node_tag`, which we will use to indicate what type of node we're looking at. We also add a "base class", `node_base`, which contains the fields that all nodes must contain (only `tag` at the moment). We then make a `node_base` the first member of each node struct. With this, a pointer to any node struct can be interpreted as a pointer to `node_base`, which is our lowest common denominator. To go back, we check the `tag` of the `node_base`, and cast the pointer appropriately. This way, we mimic inheritance in a very basic manner.
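The round trip between a concrete node and `node_base` can be sketched as follows. The two node types are cut down to the bare minimum, and the helper names `as_base` and `as_num` are made up for this example; they are not part of the runtime.

```c
#include <assert.h>
#include <stddef.h>

enum node_tag { NODE_APP, NODE_NUM };

struct node_base { enum node_tag tag; };

struct node_app {
    struct node_base base;      /* must be the first member */
    struct node_base* left;
    struct node_base* right;
};

struct node_num {
    struct node_base base;      /* must be the first member */
    int value;
};

/* "Upcast": a pointer to a struct also points at its first member. */
struct node_base* as_base(struct node_num* n) {
    return &n->base;
}

/* "Downcast": check the tag first, and return NULL on a mismatch. */
struct node_num* as_num(struct node_base* n) {
    assert(n != NULL);
    return n->tag == NODE_NUM ? (struct node_num*) n : NULL;
}
```

The cast in `as_num` is only valid because `base` comes first in the struct; C guarantees that a pointer to a struct, suitably converted, points to its first member, and vice versa.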

We also add a function, `alloc_node`, which allocates a region of memory big enough to hold any node. We do this because we sometimes mutate nodes (replacing expressions with the results of their evaluation), which changes their type, and we want to be able to do that without reallocating memory. Since the biggest node we have is `node_app`, that's the size we choose.
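As a small illustration of why the uniform size matters, the sketch below allocates an application node and then overwrites it in place with a number. The node layouts are simplified, and `alloc_app`, `alloc_num`, and `overwrite_with_num` are hypothetical helpers invented for this example; only `alloc_node` and its use of the size of `node_app` come from the description above.

```c
#include <assert.h>
#include <stdlib.h>

enum node_tag { NODE_APP, NODE_NUM };
struct node_base { enum node_tag tag; };
struct node_app { struct node_base base; struct node_base* left; struct node_base* right; };
struct node_num { struct node_base base; int value; };

/* Every node gets room for the largest variant (node_app). */
struct node_base* alloc_node(void) {
    struct node_base* n = malloc(sizeof(struct node_app));
    assert(n != NULL);
    return n;
}

/* Hypothetical helper: allocate and fill in an application node. */
struct node_app* alloc_app(struct node_base* l, struct node_base* r) {
    struct node_app* n = (struct node_app*) alloc_node();
    n->base.tag = NODE_APP;
    n->left = l;
    n->right = r;
    return n;
}

/* Hypothetical helper: allocate and fill in a number node. */
struct node_num* alloc_num(int value) {
    struct node_num* n = (struct node_num*) alloc_node();
    n->base.tag = NODE_NUM;
    n->value = value;
    return n;
}

/* Hypothetical helper: turn any node into a number, reusing its memory.
   This is safe because every node was allocated at the largest size. */
void overwrite_with_num(struct node_base* n, int value) {
    struct node_num* num = (struct node_num*) n;
    num->base.tag = NODE_NUM;
    num->value = value;
}
```

Every existing pointer to the overwritten node stays valid and now sees the number, which is what lets an evaluated result be shared across the graph.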