Continue work on the type theory draft

2022-07-02 17:30:31 -07:00 · 2022-07-02 17:30:31 -07:00 · 10a4435760
commit 10a4435760
parent bc52c2c685
1 changed files with 97 additions and 1 deletions
--- a/content/blog/01_types_basics.md
+++ b/content/blog/01_types_basics.md
@ -229,4 +229,100 @@ const x: number = 1.1 + 1; // just fine!
 ```

 That concludes the second round of real-world examples. Let's take a look at formalizing
-all of this mathematically.
+all of this mathematically. As a starting point, we can look at a rule that matches the TypeScript
+view of having only a single number type, \\(\\text{number}\\). This rule needs a little
+bit "more" than the ones we've seen so far; we can't just blindly give things in the
+form \\(a+b\\) the type \\(\\text{number}\\) (what if we're adding strings?). For our
+rule to behave in the way we have in mind, it's necessary for us to add _premises_.
+Before I explain any further, let me show you the rule.
+
+{{< latex >}}
+\frac{e_1:\text{number}\quad e_2:\text{number}}{e_1+e_2:\text{number}}
+{{< /latex >}}
+
+In the above (and elsewhere) we will use the metavariable \\(e\\) as a stand-in for
+any _expression_ in our source language. Expressions are things such as `1`,
+`x`, `1.0+someFunction(y)`, and so on. In other words, they're things we can evaluate
+to a value. For the moment, we will avoid rules for checking _statements_ (like `let x = 5;`).
+
+Rules like the above consist of premises (above the line) and conclusions (below the line).
+The conclusion is the claim / fact that we can determine from the rule. In this specific case,
+the conclusion is that \\(e_1+e_2\\) has type \\(\\text{number}\\). 
+For this to be true, however, some conditions must be met; specifically, the sub-expressions
+\\(e_1\\) and \\(e_2\\) must themselves be of type \\(\\text{number}\\). These are the premises.
+Reading in plain English, we could pronounce this rule as:
+
+> If \\(e_1\\) and \\(e_2\\) have type \\(\\text{number}\\), then \\(e_1+e_2\\) has type \\(\\text{number}\\).
+
+Notice that we don't care what the left and right operands are (we say they can be any expression).
+We need not concern ourselves with how to compute _their_ type in this specific rule. Thus, the rule
+would work for expressions like `1+1`, `(1+2)+(3+4)`, `1+x`, and so on, provided other rules
+take care of figuring out the types of expressions like `1` and `x`. In this way, when we add
+a feature to our language, we typically only need to add one or two associated rules; the
+ones for other language features are typically unaffected.
+
+Just to get some more practice, let's take a look at a rule for adding strings.
+
+{{< latex >}}
+\frac{e_1:\text{string}\quad e_2:\text{string}}{e_1+e_2:\text{string}}
+{{< /latex >}}
+
+This rule is read as follows:
+
+> If \\(e_1\\) and \\(e_2\\) have type \\(\\text{string}\\), then \\(e_1+e_2\\) has type \\(\\text{string}\\).
+
+These rules generally work in other languages. Things get more complicated in languages like Java and Rust,
+where types for numbers are more precise (\\(\\text{int}\\) and \\(\\text{float}\\) instead of
+\\(\\text{number}\\)). In these languages, we need rules for both.
+
+{{< latex >}}
+\frac{e_1:\text{int}\quad e_2:\text{int}}{e_1+e_2:\text{int}}
+\quad\quad
+\frac{e_1:\text{float}\quad e_2:\text{float}}{e_1+e_2:\text{float}}
+{{< /latex >}}
+
+But then, we also saw that Java is perfectly fine with adding a float to an integer,
+and an integer to a float (the result is a float). Thus, we also add the following
+two rules (but only in the Java case, since, as we have seen, Rust disallows
+adding a float to an integer).
+
+{{< latex >}}
+\frac{e_1:\text{int}\quad e_2:\text{float}}{e_1+e_2:\text{float}}
+\quad\quad
+\frac{e_1:\text{float}\quad e_2:\text{int}}{e_1+e_2:\text{float}}
+{{< /latex >}}
+
+You might find this process of adding rules for each combination of operand
+types tedious, and I would agree. In general, if a language
+provides a lot of automatic conversions between types, we do _not_ explicitly
+provide rules for all of the possible scenarios. Rather, we'd introduce a general
+framework of subtypes, and then have a small number of rules that are responsible
+for converting expressions from one type to another. That way, we'd only need to list
+the integer-integer and the float-float rules. The rest would follow
+from the conversion rules.
+
+{{< todo >}}Cite/ reference subtype stuff {{< /todo >}}
+
+Subtyping, however, is quite a bit beyond the scope of a "basics"
+post. For the moment, we shall content ourselves with the tedious approach.
+
+Another thing to note is that we haven't yet seen rules for what programs are _incorrect_,
+and we never will. When formalizing type systems we rarely (if ever) explicitly enumerate
+cases that produce errors. Rather, we interpret the absence of matching rules to indicate
+that something is wrong. Since no rule has premises that match \\(e_1:\\text{float}\\) and \\(e_2:\\text{string}\\),
+we can infer that
+ {{< sidenote "right" "float-string-note" "given the rules so far," >}}
+I'm trying to be careful here, since adding a float to a string
+is perfectly valid in Java (the float is automatically converted to
+a string, and the two are concatenated).
+{{< /sidenote >}}
+`1.0+"hello"` is invalid. The task of adding good type error messages, then,
+is usually a more practical concern, and is undertaken by compiler developers
+rather than type theorists. There are, of course, exceptions; check out,
+for instance, these papers on improving type error messages, as well
+as this tool that showed up only a week or two before I started writing
+this article.
+
+{{< todo >}}Cite/ reference type error stuff {{< /todo >}}
+
+I think this is all I wanted to cover in this part.