From 10a44357600226797fa492bc3321b620d5a83e14 Mon Sep 17 00:00:00 2001 From: Danila Fedorin Date: Sat, 2 Jul 2022 17:30:31 -0700 Subject: [PATCH] Continue work on the type theory draft --- content/blog/01_types_basics.md | 98 ++++++++++++++++++++++++++++++++- 1 file changed, 97 insertions(+), 1 deletion(-) diff --git a/content/blog/01_types_basics.md b/content/blog/01_types_basics.md index 2c6e3c2..88c5593 100644 --- a/content/blog/01_types_basics.md +++ b/content/blog/01_types_basics.md @@ -229,4 +229,100 @@ const x: number = 1.1 + 1; // just fine! ``` That concludes the second round of real-world examples. Let's take a look at formalizing -all of this mathematically. +all of this mathematically. As a starting point, we can look at a rule that matches the TypeScript +view of having only a single number type, \\(\\text{number}\\). This rule needs a little +bit "more" than the ones we've seen so far; we can't just blindly give things in the +form \\(a+b\\) the type \\(\\text{number}\\) (what if we're adding strings?). For our +rule to behave in the way we have in mind, it's necessary for us to add _premises_. +Before I explain any further, let me show you the rule. + +{{< latex >}} +\frac{e_1:\text{number}\quad e_2:\text{number}}{e_1+e_2:\text{number}} +{{< /latex >}} + +In the above (and elsewhere) we will use the metavariable \\(e\\) as a stand-in for +any _expression_ in our source language. Expressions are things such as `1`, +`x`, `1.0+someFunction(y)`, and so on. In other words, they're things we can evaluate +to a value. For the moment, we will avoid rules for checking _statements_ (like `let x = 5;`). + +Rules like the above consist of premises (above the line) and conclusions (below the line). +The conclusion is the claim / fact that we can determine from the rule. In this specific case, +the conclusion is that \\(e_1+e_2\\) has type \\(\\text{number}\\). +For this to be true, however, some conditions must be met; specifically, the sub-expressions +\\(e_1\\) and \\(e_2\\) must themselves be of type \\(\\text{number}\\). These are the premises. +Reading in plain English, we could pronounce this rule as: + +> If \\(e_1\\) and \\(e_2\\) have type \\(\\text{number}\\), then \\(e_1+e_2\\) has type \\(\\text{number}\\). + +Notice that we don't care what the left and right operands are (we say they can be any expression). +We need not concern ourselves with how to compute _their_ type in this specific rule. Thus, the rule +would work for expressions like `1+1`, `(1+2)+(3+4)`, `1+x`, and so on, provided other rules +take care of figuring out the types of expressions like `1` and `x`. In this way, when we add +a feature to our language, we typically only need to add one or two associated rules; the +ones for other language features are typically unaffected. + +Just to get some more practice, let's take a look at a rule for adding strings. + +{{< latex >}} +\frac{e_1:\text{string}\quad e_2:\text{string}}{e_1+e_2:\text{string}} +{{< /latex >}} + +This rule is read as follows: + +> If \\(e_1\\) and \\(e_2\\) have type \\(\\text{string}\\), then \\(e_1+e_2\\) has type \\(\\text{string}\\). + +These rules generally work in other languages. Things get more complicated in languages like Java and Rust, +where types for numbers are more precise (\\(\\text{int}\\) and \\(\\text{float}\\) instead of +\\(\\text{number}\\)). In these languages, we need rules for both. + +{{< latex >}} +\frac{e_1:\text{int}\quad e_2:\text{int}}{e_1+e_2:\text{int}} +\quad\quad +\frac{e_1:\text{float}\quad e_2:\text{float}}{e_1+e_2:\text{float}} +{{< /latex >}} + +But then, we also saw that Java is perfectly fine with adding a float to an integer, +and an integer to a float (the result is a float). Thus, we also add the following +two rules (but only in the Java case, since, as we have seen, Rust disallows +adding a float to an integer). + +{{< latex >}} +\frac{e_1:\text{int}\quad e_2:\text{float}}{e_1+e_2:\text{float}} +\quad\quad +\frac{e_1:\text{float}\quad e_2:\text{int}}{e_1+e_2:\text{float}} +{{< /latex >}} + +You might find this process of adding rules for each combination of operand +types tedious, and I would agree. In general, if a language +provides a lot of automatic conversions between types, we do _not_ explicitly +provide rules for all of the possible scenarios. Rather, we'd introduce a general +framework of subtypes, and then have a small number of rules that are responsible +for converting expressions from one type to another. That way, we'd only need to list +the integer-integer and the float-float rules. The rest would follow +from the conversion rules. + +{{< todo >}}Cite/ reference subtype stuff {{< /todo >}} + +Subtyping, however, is quite a bit beyond the scope of a "basics" +post. For the moment, we shall content ourselves with the tedious approach. + +Another thing to note is that we haven't yet seen rules for what programs are _incorrect_, +and we never will. When formalizing type systems we rarely (if ever) explicitly enumerate +cases that produce errors. Rather, we interpret the absence of matching rules to indicate +that something is wrong. Since no rule has premises that match \\(e_1:\\text{float}\\) and \\(e_2:\\text{string}\\), +we can infer that + {{< sidenote "right" "float-string-note" "given the rules so far," >}} +I'm trying to be careful here, since adding a float to a string +is perfectly valid in Java (the float is automatically converted to +a string, and the two are concatenated). +{{< /sidenote >}} +`1.0+"hello"` is invalid. The task of adding good type error messages, then, +is usually a more practical concern, and is undertaken by compiler developers +rather than type theorists. There are, of course, exceptions; check out, +for instance, these papers on improving type error messages, as well +as this tool that showed up only a week or two before I started writing +this article. + +{{< todo >}}Cite/ reference type error stuff {{< /todo >}} + +I think this is all I wanted to cover in this part.