From c9a7fbf6ddbc56896e87c8c5360350c1ffc69f7d Mon Sep 17 00:00:00 2001
From: Danila Fedorin
Date: Tue, 14 Apr 2020 19:06:29 -0700
Subject: [PATCH] Finalize part 10 of compiler series

---
 .../11_compiler_polymorphic_data_types.md | 25 ++++++++++---------
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/content/blog/11_compiler_polymorphic_data_types.md b/content/blog/11_compiler_polymorphic_data_types.md
index 9d12ef4..d50d43a 100644
--- a/content/blog/11_compiler_polymorphic_data_types.md
+++ b/content/blog/11_compiler_polymorphic_data_types.md
@@ -1,7 +1,6 @@
 ---
 title: Compiling a Functional Language Using C++, Part 11 - Polymorphic Data Types
-date: 2020-03-28T20:10:35-07:00
-draft: true
+date: 2020-04-14T19:05:42-07:00
 tags: ["C and C++", "Functional Languages", "Compilers"]
 ---
 [In part 10]({{< relref "10_compiler_polymorphism.md" >}}), we managed to get our
@@ -99,7 +98,7 @@ Let's now enumerate all the possible forms that (mono)types can take in our syst
 It is convenient to treat regular types (like \(\text{Bool}\)) as type constructors of arity 0
 (that is, type constructors with kind \(*\)). In effect, they take zero arguments
 and produce types (themselves).
-{{< /sidenote >}} such as \\(\\text{List} \; \\text{Int}\\) or \\(\\text{Bool}\\).
+{{< /sidenote >}} such as \\(\\text{List} \\; \\text{Int}\\) or \\(\\text{Bool}\\).
 3. A function from one type to another, like \\(\\text{List} \\; a \\rightarrow \\text{Int}\\).
 
 Polytypes (type schemes) in our system can be all of the above, but may also include a "forall"
@@ -125,8 +124,8 @@ L_U & \rightarrow \epsilon
 This grammar was actually too simple even for our monomorphically typed language!
 Since functions are not represented using a single uppercase variable, it wasn't possible for us
 to define constructors that accept as arguments anything other than integers and user-defined
-data types. Now, we also need to modify this grammar to allow for constructor applications (which can be nested!)
-To do so, we will define a new nonterminal, \\(Y\\), for types:
+data types. Now, we also need to modify this grammar to allow for constructor applications (which can be nested).
+To do all of these things, we will define a new nonterminal, \\(Y\\), for types:
 
 {{< latex >}}
 \begin{aligned}
@@ -176,7 +175,8 @@ L_Y & \rightarrow \epsilon
 {{< /latex >}}
 
 Finally, we update the rules for the data type declaration, as well as for a single
-constructor:
+constructor. In these new rules, we use \\(L\_T\\) to mean a list of type variables.
+The rules are as follows:
 
 {{< latex >}}
 \begin{aligned}
@@ -257,7 +257,7 @@ In the future, if this becomes an issue, we will likely move to
 unique type identifiers.
 {{< /sidenote >}}
 Note also the more basic fact that we added arity to our `type_base`,
-{{< sidenote "left" "base-arity-note" "since it may now be a type constructor instead." >}}
+{{< sidenote "left" "base-arity-note" "since it may now be a type constructor instead of a plain type." >}}
 You may be wondering, why did we add arity to base types, rather than data types?
 Although so far, our language can only create type constructors from data type definitions,
 it's possible (or even likely) that we will have
@@ -271,7 +271,7 @@ to include the new case of `type_app`. The adjusted function looks as follows:
 
 {{< codelines "C++" "compiler/11/type.cpp" 174 187 >}}
 
-There another adjustment that we have to make to our type code. Recall
+There is another adjustment that we have to make to our type code. Recall
 that we had code that implemented substitutions: replacing free variables with
 other types to properly implement our type schemes. There was a bug in that code,
 which becomes much more apparent when the substitution
@@ -325,7 +325,7 @@ type variable to the final return type (which is something like `List a`),
 in the order they occur.
 2. When the variables have been gathered into a set, we iterate
 over all constructors, and convert them into types by calling `to_type`
-on their arguments, and assemble the resulting argument types into a function.
+on their arguments, then assembling the resulting argument types into a function.
 This is not enough, however,
 {{< sidenote "right" "type-variables-note" "since constructors of types that accept type variables are polymorphic," >}}
 This is also not enough because without generalization using "forall", we are risking using type variables
@@ -335,10 +335,11 @@ wanted type constructors to be monomorphic (but generic, with type variables) we
 instnatiate fresh type variables for every user-defined type variable, and substitute
 them appropriately.
 {{< /sidenote >}} as we have discussed above with \\(\\text{Nil}\\) and \\(\\text{Cons}\\).
-To accomodate for this, we also add all type variables we've used to the "forall" quantifier
-of a new type scheme, whose monotype is the result of our calls to `to_type`.
+To accomodate for this, we also add all type variables to the "forall" quantifier
+of a new type scheme, whose monotype is our newly assembled function type. This
+type scheme is what we store as the type of the constructor.
 
-This is the last major change we have to perform. The rest is cleanup: we have switched
+This was the last major change we have to perform. The rest is cleanup: we have switched
 our system to dealing with type applications (sometimes with zero arguments), and we must
 bring the rest of the compiler up to speed with this change. For instance, we update
 `ast_int` to create a reference to an existing integer type during typechecking: