Finish and publish part 10 of compiler series
All checks were successful
continuous-integration/drone/push Build is passing
parent 5cccb97ede
commit c53a8ba68e
@@ -1,8 +1,7 @@
 ---
 title: Compiling a Functional Language Using C++, Part 10 - Polymorphism
-date: 2020-02-29T20:09:37-08:00
+date: 2020-03-25T17:14:20-07:00
 tags: ["C and C++", "Functional Languages", "Compilers"]
-draft: true
 ---
 
 [In part 8]({{< relref "08_compiler_llvm.md" >}}), we wrote some pretty interesting programs in our little language.
@@ -54,7 +53,13 @@ One such powerful set of rules is the [Hindley-Milner type system](https://en.wi
 which we have previously alluded to. In fact, the rules we came up
 with were already very close to Hindley-Milner, with the exception of two:
 __generalization__ and __instantiation__. It's been quite a while since the last time we worked on typechecking, so I'm going
-to present a table with these new rules, as well as all of the ones that we previously used. I will also give a quick
+to present a table with these new rules, as well as all of the ones that we
+{{< sidenote "right" "rules-note" "previously used." >}}
+The rules aren't quite the same as the ones we used earlier;
+note that \(\sigma\) is used in place of \(\tau\) in the first rule,
+for instance. These changes are slight, and we'll talk about how the
+rules work together below.
+{{< /sidenote >}} I will also give a quick
 summary of each of these rules.
 
 Rule|Name and Description
@@ -181,10 +186,10 @@ How about the following:
 
 1. To every declared function, assign the type \\(a \\rightarrow ... \\rightarrow y \\rightarrow z\\),
 where
-{{< sidenote "right" "arguments-note" "\(a\) through \(y\) are the types of the arguments to the function" >}}
+{{< sidenote "right" "arguments-note" "\(a\) through \(y\) are the types of the arguments to the function," >}}
 Of course, there can be more or less than 25 arguments to any function. This is just a generalization;
 we use as many input types as are needed.
-{{< /sidenote >}}, and \\(z\\) is the function's
+{{< /sidenote >}} and \\(z\\) is the function's
 return type.
 2. We typecheck each declared function, using the __Var__, __Case__, __App__, and __Inst__ rules.
 3. Whatever type variables we don't fill in, we assume can be filled in with any type,
@@ -367,8 +372,8 @@ to establish a topological order.
 Following these, we have three public function definitions:
 * `add_function` adds a vertex to the graph. Sometimes, a function does not
 reference any other functions, and would not appear in the list of edges.
-We will call this function to make sure that the function graph is aware
-of such functions. For convenience, this function returns the adjacency list
+We will call `add_function` to make sure that the function graph is aware
+of such independent functions. For convenience, `add_function` returns the adjacency list
 of the added function.
 * `add_edge` adds a new dependency between two functions.
 * `compute_order` method uses the internal methods described above to convert
@@ -403,7 +408,7 @@ group members were also already visited and added.
 
 Once groups have been created, we use their functions' edges
 to create edges for the groups themselves, using `create_edges`.
-We avoid creating edges from a group to itself, to avoid
+We avoid creating edges from a group to itself, to prevent
 unnecessary cycles. While constructing the edges, we also
 increment the relevant indegree counter.
 
@@ -540,7 +545,7 @@ tree node has a `type_env_ptr`. Furthermore, `typecheck` should no longer call
 
 Don't worry about `instantiate` for now; that's coming up. Similarly to
 `ast_lid`, `ast_case::typecheck` will no longer introduce new bindings,
-and unify instead:
+but unify existing types via the `pattern`:
 
 {{< codelines "C++" "compiler/10/ast.cpp" 152 169 >}}
 
@@ -572,7 +577,16 @@ steps:
 it refers to "known" types. Add valid constructors to the global environment as functions.
 
 We don't currently verify that types are "known"; A user could declare a list of `Floobs`,
-and never say what a `Floob` is. This isn't too big of an issue (good luck constructing
+and never say what a `Floob` is.
+{{< sidenote "right" "known-type-note" "This isn't too big of an issue" >}}
+Curiously, this flaw did lead to some valid programs being rejected. Since
+we had no notion of a "known" type, whenever data type constructors
+were created, every argument type was marked a "base" type;
+see <a href="https://dev.danilafe.com/Web-Projects/blog-static/src/branch/master/code/compiler/09/definition.cpp#L82">
+this line</a> if you're curious.
+This would cause pattern matching to fail on the tail of a list, with
+the "attempt to pattern match on non-data argument" error.
+{{< /sidenote >}}(good luck constructing
 a value of a non-existent type), but a mature compiler should prevent this from happening.
 
 On the other hand, here are the steps for function definitions:
@@ -663,7 +677,7 @@ The separation of data and function definitions must be reconciled with code
 going back as far as the parser. While previously, we populated a single, global
 vector of definitions called `program`, we can no longer do that. Instead, we'll
 split our program into two maps, one for data types and one for functions. We
-use maps for convenience: since the groups generated by our function graph refer
+use maps for convenience: the groups generated by our function graph refer
 to functions by name, and it would be nice to quickly look up the data
 the names refer to. Rather than returning such maps, we change our semantic
 actions to simply insert new data into one of two global maps. Below
@@ -729,9 +743,9 @@ possibility that the variable has a polymorphic type, which needs to be speciali
 (potentially differently in every occurrence of the variable).
 
 When talking about our new typechecking algorithm, we mentioned using __Gen__ to sprinkle
-polymorphism wherever possible. Whenever possible, __Gen__ will add free variables
+polymorphism into our program. If it can, __Gen__ will add free variables
 in a type to the "forall" quantifier at the front, making that type polymorphic.
-We implement this using a new `generalize` added to the `type_env`, which (as per
+We implement this using a new `generalize` method added to the `type_env`, which (as per
 convention) generalizes the type of a given variable as much as possible:
 
 {{< codelines "C++" "compiler/10/type_env.cpp" 31 41 >}}