183 lines
9.9 KiB
Markdown
183 lines
9.9 KiB
Markdown
---
|
|
title: "Everything I Know About Types: Introduction"
|
|
date: 2022-06-26T18:36:01-07:00
|
|
tags: ["Type Systems", "Programming Languages"]
|
|
series: "Everything I Know About Types"
|
|
draft: true
|
|
---
|
|
|
|
{{< katex_component_js >}}
|
|
{{< bergamot_js_css >}}
|
|
|
|
I am in love with types and type systems. They are, quite probably,
|
|
my favorite concept in the world. Most of us mere
|
|
mortals use types as a way to make sure we aren't writing
|
|
bad code - they help us catch things that are certainly wrong,
|
|
such as dividing a string by a number, or calling a method
|
|
on an object that doesn't exist. Types, however, can do
|
|
a lot more than that. In languages with particularly
|
|
powerful type systems, like Idris or Coq, you can prove
|
|
facts about your program, confirming, for instance,
|
|
that a function doesn't loop infinitely and eventually
|
|
returns a result. More generally, some powerful systems
|
|
can be used in place of Zermelo-Frankel set theory as
|
|
foundations for _all of mathematics_. There's a lot of depth to types.
|
|
|
|
A lot of the fascinating ideas in type systems are contained
|
|
in academic writing, rather than "real-world" programming
|
|
languages. This, unfortunately, makes them harder to access -
|
|
the field of programming languages isn't exactly known
|
|
for being friendly to newbies. I want to introduce these fascinating
|
|
ideas to you, and I want to prove to you that these ideas are _not_
|
|
"academic autofellatio" -- they have practical use. At the
|
|
same time, I want to give you the tools to explore programming
|
|
languages literature yourself. The notation in the field seemed
|
|
to me to be scary and overwhelming at first, but nowadays my eyes
|
|
go straight to the equations in PL papers. It's not that scary
|
|
once you get used to it, and it really does save a lot of ambiguity.
|
|
|
|
So I am going to try to opt for a hybrid approach. My aim is to explore
|
|
various ideas in programming languages, starting from the absolute
|
|
basics and working my way up to the limits of what I know. I will
|
|
try to motivate each exploration with examples from the real world,
|
|
using well-known typed languages, perhaps TypeScript, Java, C++, and Haskell.
|
|
At the same time, I will build up the tools to describe type systems
|
|
mathematically, the way it's done in papers.
|
|
|
|
Before all of that, though, I want to go slightly deeper into the
|
|
motivation behind type systems, and some reasons why I find them
|
|
so interesting.
|
|
|
|
### The Impossibility of Knowing
|
|
Wouldn't it be nice if there was a programming language that could
|
|
know, always and with complete certainty, if you wrote bad code?
|
|
Maybe it screams at you, like the [skull of regret](https://imgur.com/gallery/vQm8O).
|
|
Forgot to increment a loop counter? _AAAA!_ Flipped the arguments to a function?
|
|
_AAAA!_ Caused an infinite loop? _AAAAAAA!_ No mistake escapes its all-seeing
|
|
eye, and if your code compiles, it's always completely correct. You
|
|
might think that this compiler only exists in your mind, but guess what? You're right.
|
|
|
|
The reason, broadly speaking, is [Rice's theorem](https://en.wikipedia.org/wiki/Rice%27s_theorem).
|
|
It states that any interesting question about a program's behavior is impossible to answer.
|
|
We can't determine for sure whether or not any program will run infinitely, or if it will
|
|
throw an exception, or do something silly. It's just not possible. We're out of luck.
|
|
|
|
In software, though, there's a bit of discrepancy between theory and practice. They
|
|
say that computer scientists love dismissing problems as NP-hard, while working
|
|
software engineers write code that, while not fast in the general case, is perfectly
|
|
adequate in the real world. What I'm trying to say is: a theoretical restriction,
|
|
no matter how stringent, shouldn't stop us from trying to find something "good enough in practice".
|
|
|
|
This is where types come in. They are decidedly not magic - you can't do anything
|
|
with a type system that violates Rice's theorem. And yet, modern type systems
|
|
can reject a huge number of incorrect programs, and at the same time allow
|
|
most "correct-looking" programs through. They can't eliminate _all_ errors,
|
|
but they can eliminate _a lot_ of them.
|
|
|
|
Moreover, this sort of checking is _cheap_. For example, [Algorithm J](https://en.wikipedia.org/wiki/Hindley%E2%80%93Milner_type_system#Algorithm_J),
|
|
which is used to typecheck polymorphic functional programs, has a "close to linear"
|
|
complexity in terms of program size. This is a huge jump, taking us from "undecidable" to
|
|
"efficient". Once again, approximations prevail. What we have is a fast and
|
|
practical way of catching our mistakes.
|
|
|
|
### Types for Documentation
|
|
To me, type systems aren't only about checking if your program is correct;
|
|
they're also about documentation. This way of viewing programs comes to
|
|
me from experience with Haskell. In Haskell, the compiler is usually able to
|
|
check your program without you ever having to write down any types. You
|
|
can write your programs, like `length xs + sum xs`, and feel assured that what
|
|
you wrote isn't complete nonsense. And yet, most Haskell programmers
|
|
(myself included) _do_ write the types for all of their functions, by hand.
|
|
How come?
|
|
|
|
The answer is that types can tell you what your code does. Even nontrivial functions
|
|
can often have types that fit on a single line (and that are quite easy to read).
|
|
For example, given a function `typecheck`, you could try figure out what it returns by reading
|
|
the code, or by meditating on its name. Or, you could read its type:
|
|
|
|
```Haskell
|
|
typecheck :: Expr -> Either Error Type
|
|
```
|
|
|
|
This can be read "given an expression, `typecheck` returns either an error or a type".
|
|
Now you know what to expect when calling the function, what arguments it needs,
|
|
and you have some idea of what it does. You get all this without ever looking
|
|
at the code. As another example, the function `partition` has the following type:
|
|
|
|
```Haskell
|
|
partition :: (a -> Bool) -> [a] -> ([a], [a])
|
|
```
|
|
|
|
This one reads, "given a function that returns true or false, and a list, `partition` returns two lists".
|
|
It's not too hard to infer that `partition` divides the given list into two, one list
|
|
for items that made the input function true, and one for those that made it false. I say this
|
|
without having read a word of `partition`'s documentation, and having never seen its code.
|
|
The type is enough to convey that information.
|
|
|
|
There's more. Types, when viewed in this way, don't just help you understand code you already
|
|
have. They can also be used to _find_ code that does what you want. This is the idea behind
|
|
[Hoogle](https://hoogle.haskell.org/). Here's an example.
|
|
|
|
Suppose I have a list of people's middle names. A middle name is just a string (`String` in Haskell).
|
|
Not everyone has a middle name, though. To represent a value that may or may not be present,
|
|
we can use the `Maybe` type constructor. A list of people's middle names, then, would have
|
|
the type `[Maybe String]`, a list of (potentially absent) strings. Now, suppose I wanted to take
|
|
this list, and filter out all of the absent middle names, resulting in just a list of strings, `[String]`.
|
|
Do I have to do this myself, or is there a function for it? I put a type signature into Hoogle:
|
|
|
|
```Haskell
|
|
[Maybe String] -> [String]
|
|
```
|
|
|
|
Hoogle responds with a function that fits the search, in the `Data.Maybe` module:
|
|
|
|
```Haskell
|
|
catMaybes : [Maybe a] -> [a]
|
|
```
|
|
|
|
Looks like I don't have to implement it myself after all! When working with Haskell,
|
|
I consult Hoogle a lot of figure out the names of existing functions that match my needs.
|
|
|
|
I singled out Haskell in this section, but types in a variety of other languages
|
|
make for great documentation. When I wrote Rust code, the types were very useful to
|
|
understanding how to use a crate; check out the
|
|
[documentation page for `Vec`](https://doc.rust-lang.org/std/vec/struct.Vec.html), for instance.
|
|
Documentation of Elm packages always lists functions' types (see [documentation for `Parser`](https://package.elm-lang.org/packages/elm/parser/latest/Parser), for example). Even C++ type signatures
|
|
listed by Doxygen can be quite useful; I, for one, got a lot out of the [LLVM documentation](https://llvm.org/doxygen/classllvm_1_1IRBuilderBase.html).
|
|
|
|
### Exercises, Bergamot and You
|
|
One of the reasons I love the [Software Foundations](https://softwarefoundations.cis.upenn.edu/)
|
|
series are the exercises. They are within the text, and they are machine-checked:
|
|
if you use a computer tool to work through the tasks, and verify that you did
|
|
them correctly. I hope to do something similar. Exercises will look something
|
|
like this:
|
|
|
|
{{< bergamot_exercise label="sample exercise" id="exercise-1" >}}
|
|
Here I tell you to do something that I believe would be instructive.
|
|
{{< /bergamot_exercise >}}
|
|
|
|
To achieve my ideal of interactive exercises, I developed a tool called Bergamot.
|
|
It's a tiny little programming language for writing _inference rules_, which are an invaluable tool
|
|
in a type theorist's toolbox. I introduced the tool in [a separate post on this
|
|
site]({{< relref "bergamot" >}}). Throughout this series, I'll be using Bergamot
|
|
for exercises and mild amounts of interactive content. This is completely optional:
|
|
My aim is to make everything I write self-contained and useful without various
|
|
tools. However, I think that having a way to interactively play with inference
|
|
rules is conducive to learning the concepts.
|
|
|
|
The unfortunate problem with making a tool for exercises is that you also need
|
|
to teach others how to use the tool. Some exercises will be more specific
|
|
to the tool than to type theory itself; I will denote these exercises as such where
|
|
possible. Also, whenever the context of the exercise can be loaded into
|
|
Bergamot, I will denote this with a play button.
|
|
|
|
{{< bergamot_preset name="intro-preset" prompt="PromptConverter @ prompt(type(?term, ?t)) <- input(?term);" >}}
|
|
TNumber @ type(lit(?n), number) <- num(?n);
|
|
TPlusI @ type(plus(?e_1, ?e_2), number) <-
|
|
type(?e_1, number), type(?e_2, number);
|
|
{{< /bergamot_preset >}}
|
|
|
|
{{< bergamot_exercise label="bergamot; sample exercise" preset="intro-preset" id="exercise-2" >}}
|
|
Try typing `1+1` into the input field!
|
|
{{< /bergamot_exercise >}}
|