Finalize part 10 of compiler series

Danila Fedorin 2020-04-14 19:06:29 -07:00
parent 1f00b6a3f8
commit c9a7fbf6dd


@@ -1,7 +1,6 @@
---
title: Compiling a Functional Language Using C++, Part 11 - Polymorphic Data Types
-date: 2020-03-28T20:10:35-07:00
-draft: true
+date: 2020-04-14T19:05:42-07:00
tags: ["C and C++", "Functional Languages", "Compilers"]
---
[In part 10]({{< relref "10_compiler_polymorphism.md" >}}), we managed to get our
@@ -99,7 +98,7 @@ Let's now enumerate all the possible forms that (mono)types can take in our syst
It is convenient to treat regular types (like \(\text{Bool}\)) as
type constructors of arity 0 (that is, type constructors with kind \(*\)).
In effect, they take zero arguments and produce types (themselves).
-{{< /sidenote >}} such as \\(\\text{List} \; \\text{Int}\\) or \\(\\text{Bool}\\).
+{{< /sidenote >}} such as \\(\\text{List} \\; \\text{Int}\\) or \\(\\text{Bool}\\).
3. A function from one type to another, like \\(\\text{List} \\; a \\rightarrow \\text{Int}\\).
Polytypes (type schemes) in our system can be all of the above, but may also include a "forall"
@@ -125,8 +124,8 @@ L_U & \rightarrow \epsilon
This grammar was actually too simple even for our monomorphically typed language!
Since functions are not represented using a single uppercase variable, it wasn't possible for us
to define constructors that accept as arguments anything other than integers and user-defined
-data types. Now, we also need to modify this grammar to allow for constructor applications (which can be nested!)
-To do so, we will define a new nonterminal, \\(Y\\), for types:
+data types. Now, we also need to modify this grammar to allow for constructor applications (which can be nested).
+To do all of these things, we will define a new nonterminal, \\(Y\\), for types:
{{< latex >}}
\begin{aligned}
@@ -176,7 +175,8 @@ L_Y & \rightarrow \epsilon
{{< /latex >}}
Finally, we update the rules for the data type declaration, as well as for a single
-constructor:
+constructor. In these new rules, we use \\(L\_T\\) to mean a list of type variables.
+The rules are as follows:
{{< latex >}}
\begin{aligned}
@@ -257,7 +257,7 @@ In the future, if this becomes an issue, we will likely move to unique
type identifiers.
{{< /sidenote >}} Note also the more basic fact that we added arity
to our `type_base`,
-{{< sidenote "left" "base-arity-note" "since it may now be a type constructor instead." >}}
+{{< sidenote "left" "base-arity-note" "since it may now be a type constructor instead of a plain type." >}}
You may be wondering, why did we add arity to base types, rather than data types?
Although so far, our language can only create type constructors from data type definitions,
it's possible (or even likely) that we will have
@@ -271,7 +271,7 @@ to include the new case of `type_app`. The adjusted function looks as follows:
{{< codelines "C++" "compiler/11/type.cpp" 174 187 >}}
-There another adjustment that we have to make to our type code. Recall
+There is another adjustment that we have to make to our type code. Recall
that we had code that implemented substitutions: replacing free variables
with other types to properly implement our type schemes. There
was a bug in that code, which becomes much more apparent when the substitution
@@ -325,7 +325,7 @@ type variable to the final return type (which is something like `List a`),
in the order they occur.
2. When the variables have been gathered into a set, we iterate
over all constructors, and convert them into types by calling `to_type`
-on their arguments, and assemble the resulting argument types into a function.
+on their arguments, then assembling the resulting argument types into a function.
This is not enough, however,
{{< sidenote "right" "type-variables-note" "since constructors of types that accept type variables are polymorphic," >}}
This is also not enough because without generalization using "forall", we are risking using type variables
@@ -335,10 +335,11 @@ wanted type constructors to be monomorphic (but generic, with type variables) we
instantiate fresh type variables for every user-defined type variable, and substitute them appropriately.
{{< /sidenote >}}
as we have discussed above with \\(\\text{Nil}\\) and \\(\\text{Cons}\\).
-To accommodate for this, we also add all type variables we've used to the "forall" quantifier
-of a new type scheme, whose monotype is the result of our calls to `to_type`.
+To accommodate for this, we also add all type variables to the "forall" quantifier
+of a new type scheme, whose monotype is our newly assembled function type. This
+type scheme is what we store as the type of the constructor.
-This is the last major change we have to perform. The rest is cleanup: we have switched
+This was the last major change we have to perform. The rest is cleanup: we have switched
our system to dealing with type applications (sometimes with zero arguments), and we must
bring the rest of the compiler up to speed with this change. For instance, we update
`ast_int` to create a reference to an existing integer type during typechecking: