Drafts of code and markdown.

This commit is contained in:
2019-08-26 00:13:10 -07:00
parent d60d4e61bd
commit 918dfbe980
10 changed files with 335 additions and 14 deletions

@@ -1,5 +1,5 @@
---
title: Compiling a Functional Language Using C++, Part 3 - Operations On Trees
title: Compiling a Functional Language Using C++, Part 3 - Type Checking
date: 2019-08-06T14:26:38-07:00
draft: true
tags: ["C and C++", "Functional Languages", "Compilers"]
@@ -24,7 +24,7 @@ programs we get from the parser valid? See for yourself:
```
data Bool = { True, False }
defn main { 3 + True }
defn main = { 3 + True }
```
Obviously, that's not right. The parser accepts it - this matches our grammar.
@@ -32,7 +32,7 @@ But giving meaning to this program is not easy, since we have no clear
way of adding 3 and some data type. Similarly:
```
defn main { 1 2 3 4 5 }
defn main = { 1 2 3 4 5 }
```
What is this? It's a sequence of applications, starting with `1 2`. Numbers
@@ -412,4 +412,125 @@ When we look up a variable name, we first look in this node we created.
If we don't find the variable we're looking for, we move on to the next
node. The benefit of this is that we won't be re-creating a map
for each branch, and just creating a node with the changes.
Let's implement exactly that:
Let's implement exactly that. The header:
{{< codeblock "C++" "compiler/03/env.hpp" >}}
And the source file:
{{< codeblock "C++" "compiler/03/env.cpp" >}}
Nothing should seem too surprising. Of note is the fact
that we're not using smart pointers for `scope`,
and that the child we create during the call
would be left dangling if the parent goes out of scope
or is released. We're gearing this towards
creating new environments on the stack, and we'll
take care not to let a parent go out of scope
before the child.
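To make the lookup order concrete, here is a minimal sketch of the linked-environment idea. It is not the contents of `env.hpp`: the names `toy_env`, `toy_type`, `bind`, `lookup` and `toy_env_example` are made up for illustration, and an `int` stands in for a real type.
```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a real type; an int keeps the sketch small.
using toy_type = int;

struct toy_env {
    std::map<std::string, toy_type> names; // only the bindings added in this node
    const toy_env* parent = nullptr;       // raw pointer: the parent must outlive us

    // Look in this node first, then move on to the parent nodes.
    const toy_type* lookup(const std::string& name) const {
        auto it = names.find(name);
        if (it != names.end()) return &it->second;
        return parent ? parent->lookup(name) : nullptr;
    }

    void bind(const std::string& name, toy_type t) { names[name] = t; }

    // Create a child node that sees our bindings without copying the map.
    toy_env scope() const {
        toy_env child;
        child.parent = this;
        return child;
    }
};

// Both environments live on the stack, and the child is destroyed first.
inline void toy_env_example() {
    toy_env global;
    global.bind("x", 1);
    toy_env branch = global.scope();
    branch.bind("y", 2);
    assert(branch.lookup("x") != nullptr); // found through the parent
    assert(global.lookup("y") == nullptr); // the parent never sees the child's names
}
```
Because `branch` is declared after `global` in the same block, it is destroyed before `global`, which is exactly the ordering the raw pointer relies on.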
At last, it's time to declare a new type checking method.
We start with a signature inside `ast`:
```
virtual type_ptr typecheck(type_mgr& mgr, const type_env& env) const;
```
We also implement the \\(\\text{matchp}\\) function
as a method `match` on `pattern` with the following signature:
```
virtual void match(type_ptr t, type_mgr& mgr, type_env& env) const;
```
We declare this in every subclass of `ast`. Let's take a look
at the implementation now:
{{< codeblock "C++" "compiler/03/ast.cpp" >}}
This looks good, but we're not done yet. We can type
check expressions, but our program isn't
made up of expressions. Rather, it's made up of
declarations. Further, we can't look at the declarations
in isolation. Consider these two functions:
```
defn double x = { x + x }
defn quadruple x = { double (double x) }
```
Assuming we have an environment containing `x` when we typecheck the body
of `double`, our algorithm will work out fine. But what about
`quadruple`? It needs to know what `double` is, or at least that it exists.
We could also envision two mutually recursive functions. Let's
assume we have the functions `eq` and `if` in global scope. We can write
two functions, `even` and `odd`:
```
defn even x = { if (eq x 0) True (odd (x-1)) }
defn odd x = { if (eq x 0) False (even (x-1)) }
```
`odd` needs to know about `even`, and `even` needs
to know about `odd`. Thus, before we do any checking,
we need to populate a global environment with __some__
type for each function we declare. We will
use what we know about the function for our
initial declaration: if the function
takes two parameters, its type will be `a -> b -> c`.
If it takes one parameter, its type will be `a -> b`.
What's more, we need to make sure
that the function's parameters are passed in the environment
when checking its body, and that these parameters' types
are the same as the placeholder types in the function's
"declaration".
We'll typecheck the program in two passes. First,
we'll go through each definition, and add any
functions it declares to the global scope. Then,
we will go through each definition again, and,
if it's a function, typecheck its body using
the previously fleshed out global scope.
We'll add two functions, `typecheck_first`
and `typecheck_second` corresponding to
these two stages. Their signatures:
```
virtual void typecheck_first(type_mgr& mgr, type_env& env);
virtual void typecheck_second(type_mgr& mgr, const type_env& env) const;
```
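The shape of the two passes can be sketched without any of the real type machinery. The code below is a simplification under loud assumptions: `toy_defn`, `toy_first_pass` and `toy_second_pass` are hypothetical names, and the second pass only checks that every name used in a body is in scope, rather than inferring types.
```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified definition: a name plus the global names its body uses.
struct toy_defn {
    std::string name;
    std::vector<std::string> uses;
};

// First pass: record every function name before looking at any body.
inline std::set<std::string> toy_first_pass(const std::vector<toy_defn>& prog) {
    std::set<std::string> globals;
    for (const auto& d : prog) globals.insert(d.name);
    return globals;
}

// Second pass: check each body against the fully populated global scope.
inline bool toy_second_pass(const std::vector<toy_defn>& prog,
                            const std::set<std::string>& globals) {
    for (const auto& d : prog)
        for (const auto& u : d.uses)
            if (globals.count(u) == 0) return false;
    return true;
}

// Mutual recursion is fine, because both names exist before either body is checked.
inline void toy_two_pass_example() {
    std::vector<toy_defn> prog = {{"even", {"odd"}}, {"odd", {"even"}}};
    assert(toy_second_pass(prog, toy_first_pass(prog)));
}
```
Checking bodies in a single pass would reject `even`, since `odd` would not have been seen yet.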
Furthermore, in `definition_defn`, we will keep an
`std::vector` of `type_ptr`, in which the first element is the
type of the __last__ parameter, and so on. We switch around
the order of arguments because we build up the `a -> b -> ... -> x`
type signature from the right (`->` is right associative), and
thus we'll be creating the types right-to-left, too. We also
add a `type_ptr` field which holds the type for the function's
return value. We keep these two things in `definition_defn` so
that they persist between the two typechecking stages: we want to use
the types from the first stage to aid in checking the body in the second stage.
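To see why storing the parameter types in reverse is convenient, consider this small sketch. The names `sketch_type`, `sketch_var`, `sketch_arrow`, `sketch_function_type` and `sketch_show` are hypothetical and far simpler than the real `type_ptr` machinery; they only illustrate the right-to-left construction.
```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

struct sketch_type;
using sketch_type_ptr = std::shared_ptr<sketch_type>;

// Hypothetical type representation: either a named placeholder or an arrow.
struct sketch_type {
    std::string name;            // set for placeholders such as "a"
    sketch_type_ptr left, right; // both set for arrow types
};

inline sketch_type_ptr sketch_var(const std::string& n) {
    auto t = std::make_shared<sketch_type>();
    t->name = n;
    return t;
}

inline sketch_type_ptr sketch_arrow(sketch_type_ptr l, sketch_type_ptr r) {
    auto t = std::make_shared<sketch_type>();
    t->left = l;
    t->right = r;
    return t;
}

// param_types[0] is the LAST parameter, so a forward loop wraps the
// return type from the right: c, then b -> c, then a -> (b -> c).
inline sketch_type_ptr sketch_function_type(
        const std::vector<sketch_type_ptr>& param_types,
        sketch_type_ptr return_type) {
    sketch_type_ptr full = return_type;
    for (const auto& p : param_types) full = sketch_arrow(p, full);
    return full;
}

// Print a type, parenthesizing only arrows that appear on the left of an arrow.
inline std::string sketch_show(const sketch_type_ptr& t) {
    if (!t->left) return t->name;
    std::string l = sketch_show(t->left);
    if (t->left->left) l = "(" + l + ")";
    return l + " -> " + sketch_show(t->right);
}

inline void sketch_function_type_example() {
    auto a = sketch_var("a");
    auto b = sketch_var("b");
    auto c = sketch_var("c");
    // Two parameters, stored last-first.
    assert(sketch_show(sketch_function_type({b, a}, c)) == "a -> b -> c");
}
```
The same placeholder objects can later be bound to the parameter names when checking the body, which is the reason for keeping them around between the two stages.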
Here's the code for the implementation:
{{< codeblock "C++" "compiler/03/definition.cpp" >}}
And finally, our updated main:
{{< codeblock "C++" "compiler/03/main.cpp" >}}
Notice that we manually add the types for our binary operators to the environment.
Let's run our project on a few examples. On our two "bad" examples, we get
the very eloquent error:
```
terminate called after throwing an instance of 'int'
[2] 9776 abort (core dumped) ./a.out < bad2.txt
```
That's what we get for throwing 0.
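The message appears because nothing catches the `int`, so the runtime calls `std::terminate`. As a small sketch of how a friendlier report could look (the functions `toy_typecheck` and `toy_run` are hypothetical and not part of our compiler), we could catch the value at the top level:
```cpp
#include <cassert>
#include <string>

// Hypothetical checker that, like ours, signals failure by throwing 0.
inline void toy_typecheck(bool well_typed) {
    if (!well_typed) throw 0;
}

// Catching the int turns an abort into an ordinary result we can report.
inline std::string toy_run(bool well_typed) {
    try {
        toy_typecheck(well_typed);
        return "ok";
    } catch (int) {
        return "type error";
    }
}

inline void toy_run_example() {
    assert(toy_run(true) == "ok");
    assert(toy_run(false) == "type error");
}
```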
So far, our program has thrown in 100% of cases. Let's verify it actually
accepts valid programs! We'll try our very first example from today,
as well as these two:
{{< rawblock "compiler/03/works2.txt" >}}
{{< rawblock "compiler/03/works3.txt" >}}
All of our examples print the number of declarations in the program,
which means they don't throw 0. And so, we have typechecking!