Update "compiler: polymorphic data types" to new math delimiters
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
This commit is contained in:
parent
2fde7e5cf8
commit
6f20b17948
@ -42,11 +42,11 @@ empty.
|
||||
|
||||
Let's talk about `List` itself, now. I suggest that we ponder the following table:
|
||||
|
||||
\\(\\text{List}\\)|\\(\\text{Cons}\\)
|
||||
\(\text{List}\)|\(\text{Cons}\)
|
||||
----|----
|
||||
\\(\\text{List}\\) is not a type; it must be followed up with arguments, like \\(\\text{List} \\; \\text{Int}\\).|\\(\\text{Cons}\\) is not a list; it must be followed up with arguments, like \\(\\text{Cons} \\; 3 \\; \\text{Nil}\\).
|
||||
\\(\\text{List} \\; \\text{Int}\\) is in its simplest form.|\\(\\text{Cons} \\; 3 \\; \\text{Nil}\\) is in its simplest form.
|
||||
\\(\\text{List} \\; \\text{Int}\\) is a type.|\\(\\text{Cons} \\; 3 \\; \\text{Nil}\\) is a value of type \\(\\text{List} \\; \\text{Int}\\).
|
||||
\(\text{List}\) is not a type; it must be followed up with arguments, like \(\text{List} \; \text{Int}\).|\(\text{Cons}\) is not a list; it must be followed up with arguments, like \(\text{Cons} \; 3 \; \text{Nil}\).
|
||||
\(\text{List} \; \text{Int}\) is in its simplest form.|\(\text{Cons} \; 3 \; \text{Nil}\) is in its simplest form.
|
||||
\(\text{List} \; \text{Int}\) is a type.|\(\text{Cons} \; 3 \; \text{Nil}\) is a value of type \(\text{List} \; \text{Int}\).
|
||||
|
||||
I hope that the similarities are quite striking. I claim that
|
||||
`List` is quite similar to a constructor `Cons`, except that it occurs
|
||||
@ -74,18 +74,18 @@ for functional programming) or <a href="https://coq.inria.fr/">Coq</a> (to see h
|
||||
propositions and proofs can be encoded in a dependently typed language).
|
||||
{{< /sidenote >}}
|
||||
our type constructors will only take zero or more types as input, and produce
|
||||
a type as output. In this case, writing \\(\\text{Type}\\) becomes quite repetitive,
|
||||
and we will adopt the convention of writing \\(*\\) instead. The types of such
|
||||
a type as output. In this case, writing \(\text{Type}\) becomes quite repetitive,
|
||||
and we will adopt the convention of writing \(*\) instead. The types of such
|
||||
constructors are called [kinds](https://en.wikipedia.org/wiki/Kind_(type_theory)).
|
||||
Let's look at a few examples, just to make sure we're on the same page:
|
||||
|
||||
* The kind of \\(\\text{Bool}\\) is \\(*\\): it does not accept any
|
||||
* The kind of \(\text{Bool}\) is \(*\): it does not accept any
|
||||
type arguments, and is a type in its own right.
|
||||
* The kind of \\(\\text{List}\\) is \\(*\\rightarrow *\\): it takes
|
||||
* The kind of \(\text{List}\) is \(*\rightarrow *\): it takes
|
||||
one argument (the type of the things inside the list), and creates
|
||||
a type from it.
|
||||
* If we define a pair as `data Pair a b = { MkPair a b }`, then its
|
||||
kind is \\(* \\rightarrow * \\rightarrow *\\), because it requires
|
||||
kind is \(* \rightarrow * \rightarrow *\), because it requires
|
||||
two parameters.
|
||||
|
||||
As one final observation, we note that effectively, all we're doing is
|
||||
@ -94,24 +94,24 @@ type.
|
||||
|
||||
Let's now enumerate all the possible forms that (mono)types can take in our system:
|
||||
|
||||
1. A type can be a placeholder, like \\(a\\), \\(b\\), etc.
|
||||
1. A type can be a placeholder, like \(a\), \(b\), etc.
|
||||
2. A type can be a type constructor, applied to
|
||||
{{< sidenote "right" "zero-more-note" "zero ore more arguments," >}}
|
||||
It is convenient to treat regular types (like \(\text{Bool}\)) as
|
||||
type constructors of arity 0 (that is, type constructors with kind \(*\)).
|
||||
In effect, they take zero arguments and produce types (themselves).
|
||||
{{< /sidenote >}} such as \\(\\text{List} \\; \\text{Int}\\) or \\(\\text{Bool}\\).
|
||||
3. A function from one type to another, like \\(\\text{List} \\; a \\rightarrow \\text{Int}\\).
|
||||
{{< /sidenote >}} such as \(\text{List} \; \text{Int}\) or \(\text{Bool}\).
|
||||
3. A function from one type to another, like \(\text{List} \; a \rightarrow \text{Int}\).
|
||||
|
||||
Polytypes (type schemes) in our system can be all of the above, but may also include a "forall"
|
||||
quantifier at the front, generalizing the type (like \\(\\forall a \\; . \\; \\text{List} \\; a \\rightarrow \\text{Int}\\)).
|
||||
quantifier at the front, generalizing the type (like \(\forall a \; . \; \text{List} \; a \rightarrow \text{Int}\)).
|
||||
|
||||
Let's start implementing all of this. Why don't we start with the change to the syntax of our language?
|
||||
We have complicated the situation quite a bit. Let's take a look at the _old_ grammar
|
||||
for data type declarations (this is going back as far as [part 2]({{< relref "02_compiler_parsing.md" >}})).
|
||||
Here, \\(L\_D\\) is the nonterminal for the things that go between the curly braces of a data type
|
||||
declaration, \\(D\\) is the nonterminal representing a single constructor definition,
|
||||
and \\(L\_U\\) is a list of zero or more uppercase variable names:
|
||||
Here, \(L_D\) is the nonterminal for the things that go between the curly braces of a data type
|
||||
declaration, \(D\) is the nonterminal representing a single constructor definition,
|
||||
and \(L_U\) is a list of zero or more uppercase variable names:
|
||||
|
||||
{{< latex >}}
|
||||
\begin{aligned}
|
||||
@ -127,7 +127,7 @@ This grammar was actually too simple even for our monomorphically typed language
|
||||
Since functions are not represented using a single uppercase variable, it wasn't possible for us
|
||||
to define constructors that accept as arguments anything other than integers and user-defined
|
||||
data types. Now, we also need to modify this grammar to allow for constructor applications (which can be nested).
|
||||
To do all of these things, we will define a new nonterminal, \\(Y\\), for types:
|
||||
To do all of these things, we will define a new nonterminal, \(Y\), for types:
|
||||
|
||||
{{< latex >}}
|
||||
\begin{aligned}
|
||||
@ -136,8 +136,8 @@ Y & \rightarrow N
|
||||
\end{aligned}
|
||||
{{< /latex >}}
|
||||
|
||||
We make it right-recursive (because the \\(\\rightarrow\\) operator is right-associative). Next, we define
|
||||
a nonterminal for all types _except_ those constructed with the arrow, \\(N\\).
|
||||
We make it right-recursive (because the \(\rightarrow\) operator is right-associative). Next, we define
|
||||
a nonterminal for all types _except_ those constructed with the arrow, \(N\).
|
||||
|
||||
{{< latex >}}
|
||||
\begin{aligned}
|
||||
@ -148,15 +148,15 @@ N & \rightarrow ( Y )
|
||||
{{< /latex >}}
|
||||
|
||||
The first of the above rules allows a type to be a constructor applied to zero or more arguments
|
||||
(generated by \\(L\_Y\\)). The second rule allows a type to be a placeholder type variable. Finally,
|
||||
(generated by \(L_Y\)). The second rule allows a type to be a placeholder type variable. Finally,
|
||||
the third rule allows for any type (including functions, again) to occur between parentheses.
|
||||
This is so that higher-order functions, like \\((a \rightarrow b) \rightarrow a \rightarrow a \\),
|
||||
This is so that higher-order functions, like \((a \rightarrow b) \rightarrow a \rightarrow a \),
|
||||
can be represented.
|
||||
|
||||
Unfortunately, the definition of \\(L\_Y\\) is not as straightforward as we imagine. We could define
|
||||
it as just a list of \\(Y\\) nonterminals, but this would make the grammar ambigous: something
|
||||
Unfortunately, the definition of \(L_Y\) is not as straightforward as we imagine. We could define
|
||||
it as just a list of \(Y\) nonterminals, but this would make the grammar ambigous: something
|
||||
like `List Maybe Int` could be interpreted as "`List`, applied to types `Maybe` and `Int`", and
|
||||
"`List`, applied to type `Maybe Int`". To avoid this, we define a "type list element" \\(Y'\\),
|
||||
"`List`, applied to type `Maybe Int`". To avoid this, we define a "type list element" \(Y'\),
|
||||
which does not take arguments:
|
||||
|
||||
{{< latex >}}
|
||||
@ -167,7 +167,7 @@ Y' & \rightarrow ( Y )
|
||||
\end{aligned}
|
||||
{{< /latex >}}
|
||||
|
||||
We then make \\(L\_Y\\) a list of \\(Y'\\):
|
||||
We then make \(L_Y\) a list of \(Y'\):
|
||||
|
||||
{{< latex >}}
|
||||
\begin{aligned}
|
||||
@ -177,7 +177,7 @@ L_Y & \rightarrow \epsilon
|
||||
{{< /latex >}}
|
||||
|
||||
Finally, we update the rules for the data type declaration, as well as for a single
|
||||
constructor. In these new rules, we use \\(L\_T\\) to mean a list of type variables.
|
||||
constructor. In these new rules, we use \(L_T\) to mean a list of type variables.
|
||||
The rules are as follows:
|
||||
|
||||
{{< latex >}}
|
||||
@ -336,7 +336,7 @@ it will be once the type manager generates its first type variable, and things w
|
||||
wanted type constructors to be monomorphic (but generic, with type variables) we'd need to internally
|
||||
instnatiate fresh type variables for every user-defined type variable, and substitute them appropriately.
|
||||
{{< /sidenote >}}
|
||||
as we have discussed above with \\(\\text{Nil}\\) and \\(\\text{Cons}\\).
|
||||
as we have discussed above with \(\text{Nil}\) and \(\text{Cons}\).
|
||||
To accomodate for this, we also add all type variables to the "forall" quantifier
|
||||
of a new type scheme, whose monotype is our newly assembled function type. This
|
||||
type scheme is what we store as the type of the constructor.
|
||||
|
Loading…
Reference in New Issue
Block a user