Write explanations of AST refactor in compiler series

2019-10-08 21:42:25 -07:00
parent d3d73e0e9c
commit 7e9bd95846
7 changed files with 180 additions and 16 deletions
--- a/content/blog/06_compiler_semantics.md
+++ b/content/blog/06_compiler_semantics.md
@@ -253,7 +253,7 @@ We do not have to do this for `ast_uid`:
 {{< codelines "C++" "compiler/06/ast.cpp" 47 49 >}}

 On to `ast_binop`! This is the first time we have to change our environment.
-Once we build the right operand on the stack, every offset that we counted
+As we said earlier, once we build the right operand on the stack, every offset that we counted
 from the top of the stack will have been shifted by 1 (we see this
 in our compilation scheme for function application). So,
 we create a new environment with `env_offset`, and use that
@@ -274,3 +274,81 @@ for the exact same reason as before.
 Case expressions are the only thing left on the agenda. This
 is the time during which we have to perform desugaring. Here,
 though, we run into an issue: we don't have tags assigned to constructors!
+We need to adjust our code to keep track of the tags of the various
+constructors of a type. To do this, we add a subclass of the `type_base`
+struct, called `type_data`:
+
+{{< todo >}}Link code{{< /todo >}}
+
+When we create types from `definition_data`, we tag the corresponding constructors:
+
+{{< todo >}}Link code{{< /todo >}}
+
+Ah, but that doesn't solve the problem. Once we've performed type checking, we don't keep
+the types that we computed for an AST node in the node. And obviously, we don't want
+to go looking for them again. Furthermore, we can't just look up a constructor
+in the environment, since we may well have patterns that don't have __any__ constructors:
+
+```
+match l {
+    l -> { 0 }
+}
+```
+
+So, we want each `ast` node to store its type (well, in practice we only need this for
+`ast_case`, but we might as well store it for all nodes). We can add it, no problem:
+
+{{< todo >}}Link code{{< /todo >}}
+
+Now, we can add another, non-virtual `typecheck` method (let's call it `typecheck_common`,
+since naming is hard). This method will call `typecheck`, and store the output into
+the `node_type` field.
+
+The signature is identical to `typecheck`, except it's neither virtual nor const:
+```
+type_ptr typecheck_common(type_mgr& mgr, const type_env& env);
+```
+
+And the implementation is as simple as you'd think:
+
+{{< todo >}}Link code{{< /todo >}}
+
+In client code (`definition_defn::typecheck_first` for instance), we should now
+use `typecheck_common` instead of `typecheck`. With that done, we're almost there.
+However, we're still missing something: most likely, the initial type assigned to any
+node is a `type_var`, or a type variable. In this case, `type_var` __needs__ the information
+from `type_mgr`, which we will not be keeping around. Besides, it's cleaner to keep the actual type
+as a member of the node, not a type variable that references it. In order
+to address this, we write two conversion functions that call `resolve` on all
+types in an AST, given a type manager. After this is done, the type manager can be thrown away.
+The signatures of the functions are as follows:
+
+```
+void resolve_common(const type_mgr& mgr);
+virtual void resolve(const type_mgr& mgr) const = 0;
+```
+
+We also add the `resolve` method to `definition`, so that we can call it
+without having to run `dynamic_cast`. The implementation for `resolve_common`
+just resolves the type:
+
+{{< todo >}}Link code{{< /todo >}}
+
+The virtual `resolve` just calls `resolve_common` on all `ast` children
+of a node. Here's a sample implementation from `ast_binop`:
+
+{{< todo >}}Link code{{< /todo >}}
+
+And here's the implementation of `resolve` on `definition_defn`:
+
+{{< todo >}}Link code{{< /todo >}}
+
+Finally, we call `resolve` from inside `typecheck_program` in `main.cpp`:
+
+{{< todo >}}Link code{{< /todo >}}
+
+With that, we're ready to implement the code for compiling `ast_case`.
+
+{{< todo >}}Figure out how to keep all trees not requiring a type manager. {{< /todo >}}
+
+{{< todo >}}Backport bugfix in case's typecheck{{< /todo >}}
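
The code links in this commit are still `todo` placeholders, so here is a minimal, self-contained sketch of the `typecheck_common` / `resolve_common` pattern the post describes. It is not the series' actual code. The method names and signatures follow the prose, but every body is an assumption: `type_mgr` is reduced to a map of bindings with a one-argument `resolve`, `ast_int` returns a fresh variable bound to `Int` purely so there is something to resolve, `ast_binop` skips real unification, and `base_name` is a hypothetical helper for inspecting the result. Having `resolve_common` call the virtual `resolve` to recurse into children is also a guess at how the two functions fit together.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Simplified stand-ins for the post's types (assumptions, not the real ones).
struct type { virtual ~type() = default; };
using type_ptr = std::shared_ptr<type>;

struct type_var : type {
    std::string name;
    explicit type_var(std::string n) : name(std::move(n)) {}
};

struct type_base : type {
    std::string name;
    explicit type_base(std::string n) : name(std::move(n)) {}
};

struct type_mgr {
    int last_id = 0;
    std::map<std::string, type_ptr> bindings;

    std::shared_ptr<type_var> new_type() {
        return std::make_shared<type_var>("t" + std::to_string(last_id++));
    }
    void bind(const std::string& name, type_ptr t) { bindings[name] = std::move(t); }

    // Follow bindings until we reach a non-variable or an unbound variable.
    type_ptr resolve(type_ptr t) const {
        while (auto var = std::dynamic_pointer_cast<type_var>(t)) {
            auto it = bindings.find(var->name);
            if (it == bindings.end()) break;
            t = it->second;
        }
        return t;
    }
};

struct type_env {};

struct ast {
    type_ptr node_type;  // filled in by typecheck_common
    virtual ~ast() = default;
    virtual type_ptr typecheck(type_mgr& mgr, const type_env& env) const = 0;
    virtual void resolve(const type_mgr& mgr) const = 0;

    // Non-virtual wrapper: run typecheck and remember the result in the node.
    type_ptr typecheck_common(type_mgr& mgr, const type_env& env) {
        node_type = typecheck(mgr, env);
        return node_type;
    }
    // Replace the stored (possibly variable) type with its resolved form,
    // then let the node resolve its children (assumed recursion scheme).
    void resolve_common(const type_mgr& mgr) {
        assert(node_type && "typecheck_common must run before resolve_common");
        node_type = mgr.resolve(node_type);
        resolve(mgr);
    }
};
using ast_ptr = std::unique_ptr<ast>;

struct ast_int : ast {
    type_ptr typecheck(type_mgr& mgr, const type_env&) const override {
        auto var = mgr.new_type();  // illustrative: a variable bound to Int
        mgr.bind(var->name, std::make_shared<type_base>("Int"));
        return var;
    }
    void resolve(const type_mgr&) const override {}  // leaf: no children
};

struct ast_binop : ast {
    ast_ptr left, right;
    ast_binop(ast_ptr l, ast_ptr r) : left(std::move(l)), right(std::move(r)) {}

    type_ptr typecheck(type_mgr& mgr, const type_env& env) const override {
        type_ptr ltype = left->typecheck_common(mgr, env);
        right->typecheck_common(mgr, env);
        return ltype;  // simplification: no unification in this sketch
    }
    void resolve(const type_mgr& mgr) const override {
        left->resolve_common(mgr);
        right->resolve_common(mgr);
    }
};

// Hypothetical helper: the node's base type name, or "" if it is not a base type.
std::string base_name(const ast& node) {
    auto base = std::dynamic_pointer_cast<type_base>(node.node_type);
    return base ? base->name : "";
}
```

After `typecheck_common`, each node in this sketch still holds a `type_var`. Only after `resolve_common` does `node_type` point at the concrete type, at which point the `type_mgr` is no longer needed to read it.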