Update "compiler: execution" to new math delimiters

Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
Danila Fedorin 2024-05-13 18:38:05 -07:00
parent 96545a899f
commit d3fa7336a2

@ -147,19 +147,19 @@ We will follow the same notation as Simon Peyton Jones in
[his book](https://www.microsoft.com/en-us/research/wp-content/uploads/1992/01/student.pdf)
, which was my source of truth when implementing my compiler. The machine
will be executing instructions that we give it, and as such, it must have
an instruction queue, which we will reference as \\(i\\). We will write
\\(x:i\\) to mean "an instruction queue that starts with
an instruction x and ends with instructions \\(i\\)". A stack machine
obviously needs to have a stack - we will call it \\(s\\), and will
adopt a similar notation to the instruction queue: \\(a\_1, a\_2, a\_3 : s\\)
will mean "a stack with the top values \\(a\_1\\), \\(a\_2\\), and \\(a\_3\\),
and remaining stack \\(s\\)". Finally, as we said, our stack
machine has a dump, which we will write as \\(d\\). On this dump,
an instruction queue, which we will reference as \(i\). We will write
\(x:i\) to mean "an instruction queue that starts with
an instruction x and ends with instructions \(i\)". A stack machine
obviously needs to have a stack - we will call it \(s\), and will
adopt a similar notation to the instruction queue: \(a_1, a_2, a_3 : s\)
will mean "a stack with the top values \(a_1\), \(a_2\), and \(a_3\),
and remaining stack \(s\)". Finally, as we said, our stack
machine has a dump, which we will write as \(d\). On this dump,
we will push not only the current stack, but also the current
instructions that we are executing, so we may resume execution
later. We will write \\(\\langle i, s \\rangle : d\\) to mean
"a dump with instructions \\(i\\) and stack \\(s\\) on top,
followed by instructions and stacks in \\(d\\)".
later. We will write \(\langle i, s \rangle : d\) to mean
"a dump with instructions \(i\) and stack \(s\) on top,
followed by instructions and stacks in \(d\)".
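
To make this notation concrete, here is a minimal Python sketch of the three components introduced so far. It is an illustration, not the compiler's actual code: the `Machine` class and the `save_to_dump` and `restore_from_dump` helpers are hypothetical names, and we assume plain Python lists with the head (or top) at index 0.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    instructions: list = field(default_factory=list)  # i, next instruction at index 0
    stack: list = field(default_factory=list)         # s, top of the stack at index 0
    dump: list = field(default_factory=list)          # d, most recent entry at index 0


def save_to_dump(machine, new_instructions, new_stack):
    """Push <i, s> onto the dump, then continue with a new queue and stack.

    What the machine runs next is passed in by the caller; that part is an
    assumption of this sketch.
    """
    machine.dump.insert(0, (machine.instructions, machine.stack))
    machine.instructions = new_instructions
    machine.stack = new_stack


def restore_from_dump(machine):
    """Pop <i, s> off the dump and resume the saved execution."""
    machine.instructions, machine.stack = machine.dump.pop(0)
```

Saving and then restoring should leave the machine exactly where it started, which is what lets us "resume execution later".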
There's one more thing the G-machine will have that we've not yet discussed at all,
and it's needed because of the following quip earlier in the post:
@ -179,14 +179,14 @@ its address the same, but change the value on the heap.
This way, all trees that reference the node we change become updated,
without us having to change them - their child address remains the same,
but the child has now been updated. We represent the heap
using \\(h\\). We write \\(h[a : v]\\) to say "the address \\(a\\) points
to value \\(v\\) in the heap \\(h\\)". Now you also know why we used
the letter \\(a\\) when describing values on the stack - the stack contains
using \(h\). We write \(h[a : v]\) to say "the address \(a\) points
to value \(v\) in the heap \(h\)". Now you also know why we used
the letter \(a\) when describing values on the stack - the stack contains
addresses of (or pointers to) tree nodes.
_Compiling Functional Languages: a tutorial_ also keeps another component
of the G-machine, the __global map__, which maps function names to addresses of nodes
that represent them. We'll stick with this, and call this global map \\(m\\).
that represent them. We'll stick with this, and call this global map \(m\).
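
The in-place update is easier to see in code. Below is a small Python sketch, under the assumption that addresses are plain integers and that heap values can be any Python object; the `Heap` class and its `alloc` and `update` methods are hypothetical names chosen for this illustration, and the strings and tuples merely stand in for real tree nodes.

```python
class Heap:
    """h: maps addresses to values. Addresses here are plain integers."""

    def __init__(self):
        self.cells = {}
        self.next_address = 0

    def alloc(self, value):
        """Store a value at a fresh address and return that address."""
        address = self.next_address
        self.next_address += 1
        self.cells[address] = value
        return address

    def update(self, address, value):
        """Make address a point to value v; the address itself is unchanged."""
        self.cells[address] = value


heap = Heap()
shared = heap.alloc("unevaluated (1 + 2)")

# Two stand-in "trees" that both reference the same child by address.
tree_one = heap.alloc(("f", shared))
tree_two = heap.alloc(("g", shared))

# Overwrite the child in place: neither parent has to change.
heap.update(shared, 3)

# m: maps function names to the addresses of the nodes that represent them.
global_map = {"f": heap.alloc("node for f")}
```

After the update, following the child address from either `tree_one` or `tree_two` yields `3`, even though we never touched the parents.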
Finally, let's talk about what kind of nodes our trees will be made of.
We don't have to include every node that we've defined as a subclass of
@ -218,7 +218,7 @@ First up is __PushInt__:
Let's go through this. We start with an instruction queue
with `PushInt n` on top. We allocate a new `NInt` with the
number `n` on the heap at address \\(a\\). We then push
number `n` on the heap at address \(a\). We then push
the address of the `NInt` node on top of the stack. Next,
__PushGlobal__:
@ -251,7 +251,7 @@ __Push__:
{{< /gmachine >}}
We define this instruction to work if and only if there exists an address
on the stack at offset \\(n\\). We take the value at that offset, and
on the stack at offset \(n\). We take the value at that offset, and
push it onto the stack again. This can be helpful for something like
`f x x`, where we use the same tree twice. Speaking of that - let's
define an instruction to combine two nodes into an application:
@ -274,11 +274,11 @@ that is an `NApp` node, with its two children being the nodes we popped off.
Finally, we push it onto the stack.
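
The instructions so far can be sketched in a few lines of Python. This is a rough model under several assumptions, not the compiler's implementation: the stack is a list with its top at index 0, offsets for Push are counted from 0 at the top, PushGlobal simply pushes the address found in the global map, and the application instruction treats the top of the stack as the left child. The `State` class and the function names are hypothetical, and the `NGlobal` for `double` carries a placeholder body.

```python
from dataclasses import dataclass, field


@dataclass
class NInt:
    value: int


@dataclass
class NApp:
    left: int   # address of the function being applied
    right: int  # address of the argument


@dataclass
class NGlobal:
    arity: int
    code: list


@dataclass
class State:
    stack: list = field(default_factory=list)       # top at index 0
    heap: dict = field(default_factory=dict)        # address -> node
    global_map: dict = field(default_factory=dict)  # name -> address


def alloc(state, node):
    """Place a node on the heap at a fresh address and return the address."""
    address = len(state.heap)
    state.heap[address] = node
    return address


def push_int(state, n):
    state.stack.insert(0, alloc(state, NInt(n)))


def push_global(state, name):
    state.stack.insert(0, state.global_map[name])


def push(state, n):
    # Only defined when an address exists at offset n.
    assert 0 <= n < len(state.stack)
    state.stack.insert(0, state.stack[n])


def push_app(state):
    left = state.stack.pop(0)
    right = state.stack.pop(0)
    state.stack.insert(0, alloc(state, NApp(left, right)))


# Build the graph for `double 326`.
state = State()
state.global_map["double"] = alloc(state, NGlobal(arity=1, code=[]))  # placeholder body
push_int(state, 326)
push_global(state, "double")
push_app(state)
root = state.heap[state.stack[0]]
```

At the end, the stack holds a single address, and `root` is an `NApp` whose left child is the node for `double` and whose right child is the `NInt` holding 326.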
Let's try using these instructions to get a feel for it. In
order to conserve space, let's use \\(\\text{G}\\) for PushGlobal,
\\(\\text{I}\\) for PushInt, and \\(\\text{A}\\) for PushApp.
order to conserve space, let's use \(\text{G}\) for PushGlobal,
\(\text{I}\) for PushInt, and \(\text{A}\) for PushApp.
Let's say we want to construct a graph for `double 326`. We'll
use the instructions \\(\\text{I} \; 326\\), \\(\\text{G} \; \\text{double}\\),
and \\(\\text{A}\\). Let's watch these instructions play out:
use the instructions \(\text{I} \; 326\), \(\text{G} \; \text{double}\),
and \(\text{A}\). Let's watch these instructions play out:
{{< latex >}}
\begin{aligned}
[\text{I} \; 326, \text{G} \; \text{double}, \text{A}] & \quad s \quad & d \quad & h \quad & m[\text{double} : a_d] \\
@ -346,13 +346,13 @@ code for the global function:
{{< /gmachine_inner >}}
{{< /gmachine >}}
In this rule, we used a general rule for \\(a\_k\\), in which \\(k\\) is any number
between 1 and \\(n-1\\). We also expect the `NGlobal` node to contain two parameters,
\\(n\\) and \\(c\\). \\(n\\) is the arity of the function (the number of arguments
it expects), and \\(c\\) are the instructions to construct the function's tree.
In this rule, we used a general rule for \(a_k\), in which \(k\) is any number
between 1 and \(n-1\). We also expect the `NGlobal` node to contain two parameters,
\(n\) and \(c\). \(n\) is the arity of the function (the number of arguments
it expects), and \(c\) are the instructions to construct the function's tree.
The attentive reader will have noticed a catch: we kept \\(a\_{n-1}\\) on the stack!
This once again goes back to replacing a node in-place. \\(a\_{n-1}\\) is the address of the "root" of the
The attentive reader will have noticed a catch: we kept \(a_{n-1}\) on the stack!
This once again goes back to replacing a node in-place. \(a_{n-1}\) is the address of the "root" of the
whole expression we're simplifying. Thus, to replace the value at this address, we need to keep
the address until we have something to replace it with.
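
Here is a hedged Python sketch of that stack rearrangement. The rule itself is not reproduced here, so the sketch assumes the shape used in the tutorial: the stack holds the global's address \(a\) followed by application nodes \(a_0, \ldots, a_{n-1}\), where each application's right child is one argument. The function name `unwind_global`, the field names, and the restriction to arity of at least 1 are assumptions of this illustration.

```python
from dataclasses import dataclass


@dataclass
class NInt:
    value: int


@dataclass
class NApp:
    left: int
    right: int


@dataclass
class NGlobal:
    arity: int  # n
    code: list  # c


def unwind_global(stack, heap):
    """Return (instructions, stack) after unwinding reaches a global.

    The stack is a list of addresses with the top at index 0.
    """
    top = heap[stack[0]]
    assert isinstance(top, NGlobal)
    n = top.arity
    assert n >= 1 and len(stack) >= n + 1

    app_addresses = stack[1:n + 1]                      # a_0 .. a_{n-1}
    arguments = [heap[a].right for a in app_addresses]  # the n arguments
    root = app_addresses[-1]                            # a_{n-1}, kept for the later update

    return top.code, arguments + [root] + stack[n + 1:]


# (f 1) 2, where f is a global of arity 2 with a placeholder body.
heap = {
    0: NGlobal(arity=2, code=["body"]),
    1: NInt(1),
    2: NInt(2),
    3: NApp(left=0, right=1),
    4: NApp(left=3, right=2),
}
instructions, stack = unwind_global([0, 3, 4], heap)
```

The arguments end up on top in order, and the root address (4 in this example) sits right beneath them, ready to be overwritten once the function body has built its result.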
@ -451,7 +451,7 @@ and define it to contain a mapping from tags to instructions
to be executed for a value of that tag. For instance,
if the constructor `Nil` has tag 0, and `Cons` has tag 1,
the mapping for the case expression of a length function
could be written as \\([0 \\rightarrow [\\text{PushInt} \; 0], 1 \\rightarrow [\\text{PushGlobal} \; \\text{length}, ...] ]\\).
could be written as \([0 \rightarrow [\text{PushInt} \; 0], 1 \rightarrow [\text{PushGlobal} \; \text{length}, ...] ]\).
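
As a data structure, such a mapping could be as simple as a dictionary. The Python sketch below is only an illustration: instructions are modeled as hypothetical `(name, argument)` tuples, `select_branch` is an invented helper, and the `...` stands for the instructions that the example above leaves out.

```python
# Tags: Nil is 0, Cons is 1.
length_branches = {
    0: [("PushInt", 0)],
    1: [("PushGlobal", "length"), ...],  # remaining instructions elided
}


def select_branch(branches, tag):
    """Look up the instructions to run for a value with the given tag."""
    return branches[tag]
```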
Let's define the rule for it:
{{< gmachine "Jump" >}}
@ -475,7 +475,7 @@ creating a final graph. We then continue to reduce this final
graph. But we've left the function parameters on the stack!
This is untidy. We define a __Slide__ instruction,
which keeps the address at the top of the stack, but gets
rid of the next \\(n\\) addresses:
rid of the next \(n\) addresses:
{{< gmachine "Slide" >}}
{{< gmachine_inner "Before">}}