Switch 'types: basics' to new math delimiters

Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
This commit is contained in:
Danila Fedorin 2024-05-13 18:28:36 -07:00
parent 291a1f0178
commit d9d5c8bf14

View File

@ -89,7 +89,7 @@ fixed set of types, and we are required to stick to names in that set. We have n
duty when using mathematical notation. The main goal of a mathematical definition
is not to run the code, or check if it's correct; it's to communicate something
to others. As long as others understand what you mean, you can do whatever you want.
I _chose_ to use the word \\(\\text{number}\\) to represent the type
I _chose_ to use the word \(\text{number}\) to represent the type
of numbers, mainly because it's _very_ clear what that means. A theorist writing
a paper might cringe at the verbosity of such a convention. My goal, however, is
to communicate things to _you_, dear reader, and I think it's best to settle for
@ -120,8 +120,8 @@ TNumber @ type(lit(?n), number) <- num(?n);
{{< bergamot_exercise label="bergamot; tweaking notation" preset="notation-preset" id="exercise-1" >}}
Bergamot, the interactive tool I've developed for doing exercises, supports
customizing the notation for rules. Try changing the \\(:\\) symbol to
the \\(\\sim\\) symbol (denoted in latex as `\sim`).
customizing the notation for rules. Try changing the \(:\) symbol to
the \(\sim\) symbol (denoted in latex as `\sim`).
To change the way that rules are rendered, click the "Presentation Rules"
tab in the "Rules" section. There will be a lot there: I've added rules for
@ -129,7 +129,7 @@ pretty-printing a fair amount of the standard programming languages notation.
Scroll down to `LatexTypeBin`, and change `:` to
`\\sim` on that line (the extra backslash is to handle string
escaping). Now try typing numbers into the input box; you should see
something like \\(1 \\sim \text{number} \\)
something like \(1 \sim \text{number} \)
{{< /bergamot_exercise >}}
One more thing. So far, we've only written down one claim: the value 1 is a number.
@ -148,8 +148,8 @@ This is exactly what is done in PL. We'd write the following.
n:\text{number}
{{< /latex >}}
What's this \\(n\\)? First, recall that notation is up to us. I'm choosing to use the letter
\\(n\\) to stand for "any value that is a number". We write a symbol, say what we want it to mean,
What's this \(n\)? First, recall that notation is up to us. I'm choosing to use the letter
\(n\) to stand for "any value that is a number". We write a symbol, say what we want it to mean,
and we're done.
{{< dialog >}}
@ -172,21 +172,21 @@ by \(n\)) the type \(\text{number}\).
{{< /dialog >}}
Actually, to be extra precise, we might want to be explicit about our claim
that \\(n\\) is a number, rather than resorting to notational conventions.
that \(n\) is a number, rather than resorting to notational conventions.
To do so, we'd need to write something like the following:
{{< latex >}}
\cfrac{n \in \texttt{Num}}{n : \text{number}}
{{< /latex >}}
Where \\(\\texttt{Num}\\) denotes the set of numbers in our syntax (`1`, `3.14`, etc.)
Where \(\texttt{Num}\) denotes the set of numbers in our syntax (`1`, `3.14`, etc.)
The stuff about the line is called a premise, and it's a simply a condition
required for the rule to hold. The rule then says that \\(n\\) has type number --
but only if \\(n\\) is a numeric symbol in our language. We'll talk about premises
required for the rule to hold. The rule then says that \(n\) has type number --
but only if \(n\) is a numeric symbol in our language. We'll talk about premises
in more detail later on.
Having introduced this variable-like thing \\(n\\), we need to be careful.
It's important to note that the letter \\(n\\) is
Having introduced this variable-like thing \(n\), we need to be careful.
It's important to note that the letter \(n\) is
not a variable like `x` in our code snippets above. In fact, it's not at all part of the programming
language we're discussing. Rather, it's kind of like a variable in our _rules_.
@ -194,26 +194,26 @@ This distinction comes up a lot. The thing is, the notation we're building up to
kind of language. It's not meant for a computer to execute, mind you, but that's not a requirement
for something to be language (ever heard of English?). The bottom line is, we have symbols with
particular meanings, and there are rules to how they have to be written. The statement "1 is a number"
must be written by first writing 1, then a colon, then \\(\text{number}\\). It's a language.
must be written by first writing 1, then a colon, then \(\text{number}\). It's a language.
Really, then, we have two languages to think about:
* The _object language_ is the programming language we're trying to describe and mathematically
formalize. This is the language that has variables like `x`, keywords like `let` and `const`, and so on.
Some examples of our object language that we've seen so far are `1` and `2+3`.
In our mathematical notation, they look like \\(1\\) and \\(2+3\\).
In our mathematical notation, they look like \(1\) and \(2+3\).
* The _meta language_ is the notation we use to talk about our object language. It consists of
the various symbols we define, and is really just a system for communicating various things
(like type rules) to others.
Expressions like \\(n \\in \\texttt{Num}\\) and \\(1 : \\text{number}\\)
Expressions like \(n \in \texttt{Num}\) and \(1 : \text{number}\)
are examples of our meta language.
Using this terminology, \\(n\\) is a variable in our meta language; this is commonly called
a _metavariable_. A rule such as \\(n:\\text{number}\\) that contains metavariables isn't
Using this terminology, \(n\) is a variable in our meta language; this is commonly called
a _metavariable_. A rule such as \(n:\text{number}\) that contains metavariables isn't
really a rule by itself; rather, it stands for a whole bunch of rules, one for each possible
number that \\(n\\) can be. We call this a _rule schema_.
number that \(n\) can be. We call this a _rule schema_.
Alright, that's enough theory for now. Let's go back to the real world. Working with
plain old values like `1` gets boring quickly. There's not many programs you can write
@ -273,9 +273,9 @@ const x: number = 1.1 + 1; // just fine!
That concludes the second round of real-world examples. Let's take a look at formalizing
all of this mathematically. As a starting point, we can look at a rule that matches the TypeScript
view of having only a single number type, \\(\\text{number}\\). This rule needs a little
view of having only a single number type, \(\text{number}\). This rule needs a little
bit "more" than the ones we've seen so far; we can't just blindly give things in the
form \\(a+b\\) the type \\(\\text{number}\\) (what if we're adding strings?). For our
form \(a+b\) the type \(\text{number}\) (what if we're adding strings?). For our
rule to behave in the way we have in mind, it's necessary for us to add _premises_.
Before I explain any further, let me show you the rule.
@ -283,7 +283,7 @@ Before I explain any further, let me show you the rule.
\frac{e_1:\text{number}\quad e_2:\text{number}}{e_1+e_2:\text{number}}
{{< /latex >}}
In the above (and elsewhere) we will use the metavariable \\(e\\) as a stand-in for
In the above (and elsewhere) we will use the metavariable \(e\) as a stand-in for
any _expression_ in our source language. In general, expressions are things such as `1`,
`x`, `1.0+someFunction(y)`, and so on. In other words, they're things we can evaluate
to a value. For the purposes of this article, though, we're only looking at basic
@ -293,12 +293,12 @@ For the moment, we will avoid rules for checking _statements_ (like `let x = 5;`
Rules like the above consist of premises (above the line) and conclusions (below the line).
The conclusion is the claim / fact that we can determine from the rule. In this specific case,
the conclusion is that \\(e_1+e_2\\) has type \\(\\text{number}\\).
the conclusion is that \(e_1+e_2\) has type \(\text{number}\).
For this to be true, however, some conditions must be met; specifically, the sub-expressions
\\(e_1\\) and \\(e_2\\) must themselves be of type \\(\\text{number}\\). These are the premises.
\(e_1\) and \(e_2\) must themselves be of type \(\text{number}\). These are the premises.
Reading in plain English, we could pronounce this rule as:
> If \\(e_1\\) and \\(e_2\\) have type \\(\\text{number}\\), then \\(e_1+e_2\\) has type \\(\\text{number}\\).
> If \(e_1\) and \(e_2\) have type \(\text{number}\), then \(e_1+e_2\) has type \(\text{number}\).
Notice that we don't care what the left and right operands are (we say they can be any expression).
We need not concern ourselves with how to compute _their_ type in this specific rule. Thus, the rule
@ -315,7 +315,7 @@ Just to get some more practice, let's take a look at a rule for adding strings.
This rule is read as follows:
> If \\(e_1\\) and \\(e_2\\) have type \\(\\text{string}\\), then \\(e_1+e_2\\) has type \\(\\text{string}\\).
> If \(e_1\) and \(e_2\) have type \(\text{string}\), then \(e_1+e_2\) has type \(\text{string}\).
{{< bergamot_preset name="string-preset" prompt="PromptConverter @ prompt(type(?term, ?t)) <- input(?term);" query="\"hello\"+\"world\"">}}
TNumber @ type(lit(?n), number) <- num(?n);
@ -336,15 +336,15 @@ Thus, the rule
TNumber @ type(lit(?n), number) <- num(?n);
```
Has one premise, that the term \\(n\\) is a number, and the conclusion is that
a number literal has type \\(\\text{number}\\). The `num` condition
is a Bergamot builtin, corresponding to our earlier notation of \\(n \\in \\texttt{Num}\\).
Has one premise, that the term \(n\) is a number, and the conclusion is that
a number literal has type \(\text{number}\). The `num` condition
is a Bergamot builtin, corresponding to our earlier notation of \(n \in \texttt{Num}\).
It holds for all numbers: it's always true that `num(1)`, `num(2)`,
etc. The equivalent builtin for something being a string is `str`.
To edit the rules in Bergamot, click the "Editor" button in the "Rules"
section. You will need to add two rules, just like we did for numbers:
a rule for string literals (something like \\(\\texttt{\"Hello\"} : \\text{string}\\),
a rule for string literals (something like \(\texttt{"Hello"} : \text{string}\),
but more general) and for adding two strings together. I suggest naming
these two rules `TString` and `TPlusS` respectively.
@ -353,8 +353,8 @@ expressions such as `"Hello"` and `"Hello" + "World"`.
{{< /bergamot_exercise >}}
These rules generally work in other languages. Things get more complicated in languages like Java and Rust,
where types for numbers are more precise (\\(\\text{int}\\) and \\(\\text{float}\\) instead of
\\(\\text{number}\\)). In these languages, we need rules for both.
where types for numbers are more precise (\(\text{int}\) and \(\text{float}\) instead of
\(\text{number}\)). In these languages, we need rules for both.
{{< latex >}}
\frac{e_1:\text{int}\quad e_2:\text{int}}{e_1+e_2:\text{int}}
@ -412,12 +412,12 @@ allow users to (for example) write numbers where the language expects strings, w
understanding that the number will be automatically turned into a string.
To avoid having an explosion of various rules, we instead define the "converts to"
relation, \\(\\tau_1 \\preceq \\tau_2\\), where \\(\\tau_1\\) and \\(\\tau_2\\)
relation, \(\tau_1 \preceq \tau_2\), where \(\tau_1\) and \(\tau_2\)
are types. To say that an integer can be automatically converted to a floating
pointer number, we can write \\(\\text{integer} \\preceq \\text{float}\\).
pointer number, we can write \(\text{integer} \preceq \text{float}\).
Then, we add only a single additional rule to our language: `TConverts`.
This rule says that we can treat an expression of type \\(\\tau_1\\) as
an expression of type \\(\\tau_2\\), if the former can be converted to the
This rule says that we can treat an expression of type \(\tau_1\) as
an expression of type \(\tau_2\), if the former can be converted to the
latter.
I have written some rules using these concepts. Input some expressions into
@ -439,7 +439,7 @@ post. For the moment, we shall content ourselves with the tedious approach.
Another thing to note is that we haven't yet seen rules for what programs are _incorrect_,
and we never will. When formalizing type systems we rarely (if ever) explicitly enumerate
cases that produce errors. Rather, we interpret the absence of matching rules to indicate
that something is wrong. Since no rule has premises that match \\(e_1:\\text{float}\\) and \\(e_2:\\text{string}\\),
that something is wrong. Since no rule has premises that match \(e_1:\text{float}\) and \(e_2:\text{string}\),
we can infer that
{{< sidenote "right" "float-string-note" "given the rules so far," >}}
I'm trying to be careful here, since adding a float to a string
@ -494,16 +494,16 @@ Here's a quick summary of what we've covered:
or TypeScript. The _meta language_ is the language that we use
to reason and talk about the object language. Typically, this is
the language we use for writing down our rules.
3. The common type-theoretic notation for "expression \\(x\\)
has type \\(\\tau\\)" is \\(x : \\tau\\).
3. The common type-theoretic notation for "expression \(x\)
has type \(\tau\)" is \(x : \tau\).
4. In writing more complicated rules, we will frequently make use
of the inference rule notation, which looks something like
the following.
{{< latex >}}
\frac{P_1 \quad P_2 \quad ... \quad P_n}{P}
{{< /latex >}}
The above is read as "if \\(P_1\\) through \\(P_n\\) are
true, then \\(P\\) is also true."
The above is read as "if \(P_1\) through \(P_n\) are
true, then \(P\) is also true."
5. To support operators like `+` that can work on, say, both numbers
and strings, we provide inference rules for each such case. If this
gets cumbersome, we can introduce a system of _subtypes_ into our
@ -539,9 +539,9 @@ and already be up-to-speed on a big chunk of the content.
#### Metavariables
| Symbol | Meaning | Syntactic Category |
|---------|--------------|-----------------------|
| \\(n\\) | Numbers | \\(\\texttt{Num}\\) |
| \\(s\\) | Strings | \\(\\texttt{Str}\\) |
| \\(e\\) | Expressions | \\(\\texttt{Expr}\\) |
| \(n\) | Numbers | \(\texttt{Num}\) |
| \(s\) | Strings | \(\texttt{Str}\) |
| \(e\) | Expressions | \(\texttt{Expr}\) |
#### Grammar
{{< block >}}
@ -558,8 +558,8 @@ and already be up-to-speed on a big chunk of the content.
{{< foldtable >}}
| Rule | Description |
|--------------|-------------|
| {{< latex >}}\frac{n \in \texttt{Num}}{n : \text{number}} {{< /latex >}}| Number literals have type \\(\\text{number}\\) |
| {{< latex >}}\frac{s \in \texttt{Str}}{s : \text{string}} {{< /latex >}}| String literals have type \\(\\text{string}\\) |
| {{< latex >}}\frac{n \in \texttt{Num}}{n : \text{number}} {{< /latex >}}| Number literals have type \(\text{number}\) |
| {{< latex >}}\frac{s \in \texttt{Str}}{s : \text{string}} {{< /latex >}}| String literals have type \(\text{string}\) |
| {{< latex >}}\frac{e_1 : \text{string}\quad e_2 : \text{string}}{e_1+e_2 : \text{string}} {{< /latex >}}| Adding strings gives a string |
| {{< latex >}}\frac{e_1 : \text{number}\quad e_2 : \text{number}}{e_1+e_2 : \text{number}} {{< /latex >}}| Adding numbers gives a number |