Switch more posts to work with KaTeX and the latex macro

Danila Fedorin 2020-03-04 14:07:05 -08:00
parent 80d722568e
commit 67ecc741d0
5 changed files with 174 additions and 123 deletions


@@ -152,18 +152,21 @@ the rules in detail in paragraphs of text.
Let's start with inference rules. An inference rule is an expression in
the form:
$$
\\frac{A\_1 \\ldots A\_n} {B\_1 \\ldots B\_m}
$$
{{< latex >}}
\frac{A_1 \ldots A_n} {B_1 \ldots B_m}
{{< /latex >}}
This reads, "given that the premises \\(A\_1\\) through \\(A\_n\\) are true,
it holds that the conclusions \\(B\_1\\) through \\(B\_m\\) are true".
For example, we can have the following inference rule:
$$
\\frac
{\\text{if it's cold, I wear a jacket} \\quad \\text{it's cold}}
{\\text{I wear a jacket}}
$$
{{< latex >}}
\frac
{\text{if it's cold, I wear a jacket} \quad \text{it's cold}}
{\text{I wear a jacket}}
{{< /latex >}}
Since you wear a jacket when it's cold, and it's cold, we can conclude
that you will wear a jacket.
@@ -182,20 +185,22 @@ Alright, this is enough to get us started with some typing rules.
Let's start with one for numbers. If we define \\(n\\) to mean
"any expression that is a just a number, like 3, 2, 6, etc.",
we can write the typing rule as follows:
$$
\\frac{}{n : \\text{Int}}
$$
{{< latex >}}
\frac{}{n : \text{Int}}
{{< /latex >}}
There's nothing above the line -
there are no premises that are needed for a number to
have the type `Int`.
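Just to make this concrete, instantiating this rule with the literal 3 gives us an axiom with no premises at all:
{{< latex >}}
\frac{}{3 : \text{Int}}
{{< /latex >}}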
Now, let's move on to the rule for function application:
$$
\\frac
{e_1 : \\tau\_1 \\rightarrow \\tau\_2 \\quad e_2 : \\tau_1}
{e_1 \\; e_2 : \\tau\_2}
$$
{{< latex >}}
\frac
{e_1 : \tau_1 \rightarrow \tau_2 \quad e_2 : \tau_1}
{e_1 \; e_2 : \tau_2}
{{< /latex >}}
This rule includes everything we've said before:
the thing being applied has to have a function type
@@ -227,21 +232,24 @@ has the type \\(\\tau\\).
Let's update our rules with this new addition.
The integer rule just needs a slight adjustment:
$$
\frac{}{\\Gamma \\vdash n : \\text{Int}}
$$
{{< latex >}}
\frac{}{\Gamma \vdash n : \text{Int}}
{{< /latex >}}
The same is true for the application rule:
$$
{{< latex >}}
\frac
{\\Gamma \\vdash e_1 : \\tau\_1 \\rightarrow \\tau\_2 \\quad \\Gamma \\vdash e_2 : \\tau_1}
{\\Gamma \\vdash e_1 \\; e_2 : \\tau\_2}
$$
{\Gamma \vdash e_1 : \tau_1 \rightarrow \tau_2 \quad \Gamma \vdash e_2 : \tau_1}
{\Gamma \vdash e_1 \; e_2 : \tau_2}
{{< /latex >}}
And finally, we can represent the variable rule:
$$
\\frac{x : \\tau \\in \\Gamma}{\\Gamma \\vdash x : \\tau}
$$
{{< latex >}}
\frac{x : \tau \in \Gamma}{\Gamma \vdash x : \tau}
{{< /latex >}}
In these three expressions, we've captured all of the rules
that we've seen so far. It's important to know
@@ -263,11 +271,13 @@ where we will introduce new variables into the context, and
also a place where we will need several rules.
Let's first take a look at the whole case expression rule:
$$
\\frac
{\\Gamma \\vdash e : \\tau \\quad \\text{matcht}(\\tau, p\_i) = b\_i \\quad \\Gamma,b\_i \\vdash e\_i : \\tau\_c}
{\\Gamma \\vdash \\text{case} \\; e \\; \\text{of} \; \\\{ (p\_1,e\_1) \\ldots (p\_n, e\_n) \\\} : \\tau\_c }
$$
{{< latex >}}
\frac
{\Gamma \vdash e : \tau \quad \text{matcht}(\tau, p_i) = b_i \quad \Gamma,b_i \vdash e_i : \tau_c}
{\Gamma \vdash \text{case} \; e \; \text{of} \; \{ (p_1,e_1) \ldots (p_n, e_n) \} : \tau_c }
{{< /latex >}}
This is a lot more complicated than the other rules we've seen, and we've used some notation
that we haven't seen before. Let's take this step by step:
@@ -284,18 +294,20 @@ the "basic" pattern, which always matches the value and binds a variable to it,
for a constructor pattern, that matches a constructor and its parameters.
Let's define \\(v\\) to be a variable name in the context of a pattern. For the basic pattern:
$$
\\frac
{{< latex >}}
\frac
{}
{\\text{matcht}(\\tau, v) = \\\{v : \\tau \\\}}
$$
{\text{matcht}(\tau, v) = \{v : \tau \}}
{{< /latex >}}
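So, for example, matching a plain variable `x` against a value of type Int simply binds `x` to Int:
{{< latex >}}
\text{matcht}(\text{Int}, x) = \{ x : \text{Int} \}
{{< /latex >}}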
For the next rule, let's define \\(c\\) to be a constructor name. The rule for the constructor pattern, then, is:
$$
\\frac
{\\Gamma \\vdash c : \\tau\_1 \\rightarrow ... \\rightarrow \\tau\_n \\rightarrow \\tau}
{\\text{matcht}(\\tau, c \; v\_1 ... v\_n) = \\{ v\_1 : \\tau\_1, ..., v\_n : \\tau\_n \\}}
$$
{{< latex >}}
\frac
{\Gamma \vdash c : \tau_1 \rightarrow ... \rightarrow \tau_n \rightarrow \tau}
{\text{matcht}(\tau, c \; v_1 ... v_n) = \{ v_1 : \tau_1, ..., v_n : \tau_n \}}
{{< /latex >}}
This rule means that whenever we have a pattern in the form of a constructor applied to
\\(n\\) variable names, if the constructor takes \\(n\\) arguments of types \\(\\tau\_1\\)


@@ -277,14 +277,14 @@ order to conserve space, let's use \\(\\text{G}\\) for PushGlobal,
Let's say we want to construct a graph for `double 326`. We'll
use the instructions \\(\\text{I} \; 326\\), \\(\\text{G} \; \\text{double}\\),
and \\(\\text{A}\\). Let's watch these instructions play out:
$$
\\begin{align}
[\\text{I} \; 326, \\text{G} \; \text{double}, \\text{A}] & \\quad s \\quad & d \\quad & h \\quad & m[\\text{double} : a\_d] \\\\\\
[\\text{G} \; \text{double}, \\text{A}] & \\quad a\_1 : s \\quad & d \\quad & h[a\_1 : \\text{NInt} \; 326] \\quad & m[\\text{double} : a\_d] \\\\\\
[\\text{A}] & \\quad a\_d, a\_1 : s \\quad & d \\quad & h[a\_1 : \\text{NInt} \; 326] \\quad & m[\\text{double} : a\_d] \\\\\\
[] & \\quad a\_2 : s \\quad & d \\quad & h[\\substack{\\begin{aligned}a\_1 & : \\text{NInt} \; 326 \\\\ a\_2 & : \\text{NApp} \; a\_d \; a\_1 \\end{aligned}}] \\quad & m[\\text{double} : a\_d] \\\\\\
\\end{align}
$$
{{< latex >}}
\begin{aligned}
[\text{I} \; 326, \text{G} \; \text{double}, \text{A}] & \quad s \quad & d \quad & h \quad & m[\text{double} : a_d] \\
[\text{G} \; \text{double}, \text{A}] & \quad a_1 : s \quad & d \quad & h[a_1 : \text{NInt} \; 326] \quad & m[\text{double} : a_d] \\
[\text{A}] & \quad a_d, a_1 : s \quad & d \quad & h[a_1 : \text{NInt} \; 326] \quad & m[\text{double} : a_d] \\
[] & \quad a_2 : s \quad & d \quad & h[\substack{\begin{aligned}a_1 & : \text{NInt} \; 326 \\ a_2 & : \text{NApp} \; a_d \; a_1 \end{aligned}}] \quad & m[\text{double} : a_d] \\
\end{aligned}
{{< /latex >}}
How did we come up with these instructions? We'll answer this question with
more generality later, but let's take a look at this particular expression
right now. We know it's an application, so we'll be using MkApp eventually.


@ -17,9 +17,10 @@ compiles to the instructions \\(i\\)".
To follow our route from the typechecking, let's start
with compiling expressions that are numbers. It's pretty easy:
$$
\\mathcal{C} ⟦n⟧ = [\\text{PushInt} \\; n]
$$
{{< latex >}}
\mathcal{C} ⟦n⟧ = [\text{PushInt} \; n]
{{< /latex >}}
Here, we compiled a number expression to a list of
instructions with only one element - PushInt.
@@ -30,9 +31,9 @@ we informally stated in the previous chapter, since
the thing we're applying has to be on top,
we want to compile it last:
$$
\\mathcal{C} ⟦e\_1 \\; e\_2⟧ = \\mathcal{C} ⟦e\_2⟧ ⧺ \\mathcal{C} ⟦e\_1⟧ ⧺ [\\text{MkApp}]
$$
{{< latex >}}
\mathcal{C} ⟦e_1 \; e_2⟧ = \mathcal{C} ⟦e_2⟧ ⧺ \mathcal{C} ⟦e_1⟧ ⧺ [\text{MkApp}]
{{< /latex >}}
Here, we used the \\(⧺\\) operator to represent the concatenation of two
lists. Otherwise, this should be pretty intuitive - we first run the instructions
@@ -59,14 +60,15 @@ but their addresses are incremented by \\(n\\)". We now pass \\(\\rho\\)
in to \\(\\mathcal{C}\\) together with the expression \\(e\\). Let's
rewrite our first two rules. For numbers:
$$
\\mathcal{C} ⟦n⟧ \\; \\rho = [\\text{PushInt} \\; n]
$$
{{< latex >}}
\mathcal{C} ⟦n⟧ \; \rho = [\text{PushInt} \; n]
{{< /latex >}}
For function application:
$$
\\mathcal{C} ⟦e\_1 \\; e\_2⟧ \\; \\rho = \\mathcal{C} ⟦e\_2⟧ \\; \\rho ⧺ \\mathcal{C} ⟦e\_1⟧ \\; \\rho^{+1} ⧺ [\\text{MkApp}]
$$
{{< latex >}}
\mathcal{C} ⟦e_1 \; e_2⟧ \; \rho = \mathcal{C} ⟦e_2⟧ \; \rho \; ⧺ \;\mathcal{C} ⟦e_1⟧ \; \rho^{+1} \; ⧺ \; [\text{MkApp}]
{{< /latex >}}
Notice how in that last rule, we passed in \\(\\rho^{+1}\\) when compiling the function's expression. This is because
the result of running the instructions for \\(e\_2\\) will have left on the stack the function's parameter. Whatever
@@ -74,17 +76,18 @@ was at the top of the stack (and thus, had index 0), is now the second element f
same is true for all other things that were on the stack. So, we increment the environment accordingly.
With the environment, the variable rule is simple:
$$
\\mathcal{C} ⟦x⟧ \\; \\rho = [\\text{Push} \\; (\\rho \\; x)]
$$
{{< latex >}}
\mathcal{C} ⟦x⟧ \; \rho = [\text{Push} \; (\rho \; x)]
{{< /latex >}}
One more thing. If we run across a function name, we want to
use PushGlobal rather than Push. Defining \\(f\\) to be a name
of a global function, we capture this using the following rule:
$$
\\mathcal{C} ⟦f⟧ \\; \\rho = [\\text{PushGlobal} \\; f]
$$
{{< latex >}}
\mathcal{C} ⟦f⟧ \; \rho = [\text{PushGlobal} \; f]
{{< /latex >}}
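Putting these rules together, compiling `double 326` - the expression whose graph we built by hand earlier - produces exactly the instruction sequence we used there:
{{< latex >}}
\begin{aligned}
\mathcal{C} ⟦\text{double} \; 326⟧ \; \rho & = \mathcal{C} ⟦326⟧ \; \rho \; ⧺ \; \mathcal{C} ⟦\text{double}⟧ \; \rho^{+1} \; ⧺ \; [\text{MkApp}] \\
& = [\text{PushInt} \; 326, \text{PushGlobal} \; \text{double}, \text{MkApp}]
\end{aligned}
{{< /latex >}}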
Now it's time for us to compile case expressions, but there's a bit of
an issue - our case expression branches don't map one-to-one with
@@ -118,13 +121,13 @@ It's helpful to define compiling a single branch of a case expression
separately. For a branch in the form \\(t \\; x\_1 \\; x\_2 \\; ... \\; x\_n \\rightarrow \text{body}\\),
we define a compilation scheme \\(\\mathcal{A}\\) as follows:
$$
\\begin{align}
\\mathcal{A} ⟦t \\; x\_1 \\; ... \\; x\_n \\rightarrow \text{body}⟧ \\; \\rho & =
t \\rightarrow [\\text{Split} \\; n] \\; ⧺ \\; \\mathcal{C}⟦\\text{body}⟧ \\; \\rho' \\; ⧺ \\; [\\text{Slide} \\; n] \\\\\\
\text{where} \\; \\rho' &= \\rho^{+n}[x\_1 \\rightarrow 0, ..., x\_n \\rightarrow n - 1]
\\end{align}
$$
{{< latex >}}
\begin{aligned}
\mathcal{A} ⟦t \; x_1 \; ... \; x_n \rightarrow \text{body}⟧ \; \rho & =
t \rightarrow [\text{Split} \; n] \; ⧺ \; \mathcal{C}⟦\text{body}⟧ \; \rho' \; ⧺ \; [\text{Slide} \; n] \\
\text{where} \; \rho' &= \rho^{+n}[x_1 \rightarrow 0, ..., x_n \rightarrow n - 1]
\end{aligned}
{{< /latex >}}
First, we run Split - the node on the top of the stack is a packed constructor,
and we want access to its member variables, since they can be referenced by
@@ -145,10 +148,10 @@ Next, we want to evaluate it (since we need a packed value, not a graph,
to read the tag). Finally, we perform a jump depending on the tag. This
is captured by the following rule:
$$
\\mathcal{C} ⟦\\text{case} \\; e \\; \\text{of} \\; \\text{alt}_1 ... \\text{alt}_n⟧ \\; \\rho =
\\mathcal{C} ⟦e⟧ \\; \\rho \\; ⧺ [\\text{Eval}, \\text{Jump} \\; [\\mathcal{A} ⟦\\text{alt}_1⟧ \; \\rho, ..., \\mathcal{A} ⟦\\text{alt}_n⟧ \; \\rho]]
$$
{{< latex >}}
\mathcal{C} ⟦\text{case} \; e \; \text{of} \; \text{alt}_1 ... \text{alt}_n⟧ \; \rho =
\mathcal{C} ⟦e⟧ \; \rho \; ⧺ [\text{Eval}, \text{Jump} \; [\mathcal{A} ⟦\text{alt}_1⟧ \; \rho, ..., \mathcal{A} ⟦\text{alt}_n⟧ \; \rho]]
{{< /latex >}}
This works because \\(\\mathcal{A}\\) creates not only instructions,
but also a tag mapping. We simply populate our Jump instruction such mappings


@@ -36,12 +36,13 @@ if (if True False True) 11 3
This is because, for this to work, both of the following would need to hold (borrowing
some of our notation from the [typechecking]({{< relref "03_compiler_typechecking.md" >}}) post):
$$
\\text{if} : \\text{Int} \\rightarrow \\text{Int}
$$
$$
\\text{if} : \\text{Bool} \\rightarrow \\text{Bool}
$$
{{< latex >}}
\text{if} : \text{Int} \rightarrow \text{Int}
{{< /latex >}}
{{< latex >}}
\text{if} : \text{Bool} \rightarrow \text{Bool}
{{< /latex >}}
But using our rules so far, such a thing is impossible, since there is no way for
\\(\text{Int}\\) to be unified with \\(\text{Bool}\\). We need a more powerful


@@ -29,50 +29,69 @@ add l r = l + r
```
Lambda calculus is all about abstraction into functions, and the application of these functions. In lambda calculus, abstraction looks like this:
$$
{{< latex >}}
\lambda \ x \ . \ t
$$
{{< /latex >}}
This reads: a function that, when given a variable `x`, evaluates to `t`. Usually, `t` is an expression using `x`. For example, suppose that there exists a function `double`. If we wanted to create a function that multiplies a number by 4, we can write it as follows:
$$
{{< latex >}}
\lambda \ x \ . \ \text{double} \ (\text{double} \ x)
$$
{{< /latex >}}
Here, we see function application: the function `double`, which takes a number and returns a number, is given the parameter `x`. The result of evaluating `double x` is then
given as a parameter to `double` again, thereby multiplying `x` by 4. Let's use this new function to multiply some number `y` by 4:
$$
{{< latex >}}
(\lambda \ x . \ \text{double} \ (\text{double} \ x)) \ y
$$
{{< /latex >}}
Here, we see both abstraction and application: we abstract the process of multiplying a number by 4, and then apply that process to a variable y. Application works by simply replacing the variable before the dot in the abstraction with the value that's being passed in. This is called __binding__. In our case, we're binding the variable `x` to the value of `y`. This results in the following:
$$
{{< latex >}}
\text{double} \ (\text{double} \ y)
$$
{{< /latex >}}
Since we have no idea what the body (the expression after the dot) of `double` looks like, we cannot simplify this any further.
### Currying
This is nice and all, but what about functions of more than one variable? What if we want to add numbers, like in our original example of abstraction? If lambda abstraction has only one variable on the left of the dot, how can we represent such functions? Surely there has to be a way, since we claim this little language can represent any computation that can be done on a computer. The answer is fairly simple. We use abstraction twice:
$$
{{< latex >}}
\lambda \ x \ . \ \lambda \ y \ . \ x + y
$$
{{< /latex >}}
The second abstraction is inside the first. It can be equivalently written:
$$
{{< latex >}}
\lambda \ x \ . \ (\lambda \ y \ . \ x + y)
$$
{{< /latex >}}
I've done something here that you might've accepted without questioning, but that I can't ignore without feeling guilty. I've assumed that our lambda calculus has numbers and addition. This __isn't needed__ for the language to be Turing complete. However, it makes for much nicer examples, so we'll stick with it here. Let's move on to an example of using our function to add two numbers:
$$
{{< latex >}}
(\lambda \ x \ . \ \lambda \ y \ . \ x + y) \ 1 \ 2
$$
{{< /latex >}}
Function application is left associative. This is equivalently written as:
$$
{{< latex >}}
((\lambda \ x \ . \ \lambda \ y \ . \ x + y) \ 1) \ 2
$$
{{< /latex >}}
First, we bind `x` to 1:
$$
{{< latex >}}
(\lambda \ y \ . \ 1 + y) \ 2
$$
{{< /latex >}}
Then, we bind `y` in the resulting expression to 2:
$$
{{< latex >}}
1 + 2
$$
{{< /latex >}}
Well, there it is. We can extend this logic to functions of 3, 4, and more variables.
### Partial Function Application
@@ -81,30 +100,38 @@ Our way of defining functions of more than one variable leads to a curious effec
add(1);
```
Oh no! The compiler won't like this; `add` takes two parameters, and we've only given it one. However, when we try to do the same thing in lambda calculus:
$$
{{< latex >}}
(\lambda \ x \ . \ \lambda \ y \ . \ x + y) 1
$$
{{< /latex >}}
As before, we can bind `x` to 1:
$$
{{< latex >}}
\lambda \ y \ . \ 1 + y
$$
{{< /latex >}}
This is not an error! All we've done is produce a function that takes the remaining parameter, `y`, and adds it to the parameter we've already passed in. This is an example of a partially applied function. Effectively, this new function we've produced is waiting for the remaining parameters we haven't yet given it. Once we do, it'll behave just like it would've if we passed all the parameters in together.
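For instance, handing this leftover function the argument 5 behaves exactly as if we had passed 1 and 5 to the original function together:
{{< latex >}}
(\lambda \ y \ . \ 1 + y) \ 5
{{< /latex >}}
This binds `y` to 5, leaving us with 1 + 5.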
### First Class Functions
So far, we've only been applying our functions to numbers. But it doesn't have to be this way! Functions can also serve as parameters to other functions. For example, we can imagine a function that takes another function `f`, which transforms a number somehow, and then applies it to a number two times. The first parameter of such a function would be `f`, and the other would be some number `x`. In the body, we will apply `f` to the result of applying `f` to `x`:
$$
{{< latex >}}
\lambda \ f \ . \ \lambda \ x \ . \ f (f \ x)
$$
{{< /latex >}}
This might look kind of familiar! Let's apply this function to our previously mentioned `double` function:
$$
{{< latex >}}
(\lambda \ f . \ \lambda \ x \ . \ f (f \ x)) \ double
$$
{{< /latex >}}
Just like in our previous examples, we simply replace `f` with `double` during application:
$$
{{< latex >}}
\lambda \ x \ . double \ (double \ x)
$$
{{< /latex >}}
We've now created our multiply-by-four function! We can create other functions the same way. For instance, if we had a function `halve`, we could create a function to divide a number by 4 by applying our "apply-twice" function to it.
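Spelled out, applying our apply-twice function to this hypothetical `halve` works just like it did for `double`:
{{< latex >}}
(\lambda \ f \ . \ \lambda \ x \ . \ f \ (f \ x)) \ \text{halve}
{{< /latex >}}
which reduces to
{{< latex >}}
\lambda \ x \ . \ \text{halve} \ (\text{halve} \ x)
{{< /latex >}}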
### Church Encoded Integers
@@ -112,26 +139,34 @@ We can represent numbers using just abstraction and application. This is called
Now, let's try represent addition. Addition of two numbers `a` and `b` would be done by taking a function `f` and applying it the first number of times, and then applying it the second number more times. Since addition must take in numbers `a` and `b`, which are functions of two variables, and return a number, we will end up with
something like this:
$$
{{< latex >}}
\lambda \ a \ . \ \lambda \ b \ . \ ??
$$
{{< /latex >}}
What goes in the body of the second lambda? We know that a number is a function which takes a function and a variable to apply the function to:
$$
{{< latex >}}
\lambda \ f \ . \ \lambda \ v \ . \ ??
$$
{{< /latex >}}
Since the addition of a and b must return a number, we then have the following:
$$
{{< latex >}}
\lambda \ a \ . \ \lambda \ b \ . \ \lambda \ f \ . \ \lambda \ v \ . \ ??
$$
{{< /latex >}}
So how do we apply `f` the first number of times, and then apply it the second number more times? Well, `a f` is a partially applied function that, when given a variable, will apply `f` to that variable the first number of times. Similarly, `b f` is a partially applied function that will apply `f` the second number of times. We feed `v` into `a f`, which applies `f` the first number of times and returns the result. We then feed that into `b f`. This results in:
$$
{{< latex >}}
\lambda \ a \ . \ \lambda \ b \ . \ \lambda \ f \ . \ \lambda \ v \ . \ b \ f \ (a \ f \ v)
$$
{{< /latex >}}
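To convince ourselves this works, we can try it on small numbers. Take 1 to be the function that applies `f` once and 2 to be the one that applies it twice; then `a f v` applies `f` once to `v`, and feeding that into `b f` applies `f` twice more, leaving us with:
{{< latex >}}
\lambda \ f \ . \ \lambda \ v \ . \ f \ (f \ (f \ v))
{{< /latex >}}
a function that applies `f` three times - the encoding of 3, just as we'd hope.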
Phew! Now let's try multiplication. The product of two numbers would apply the function the first number of times, the second number of times. As we already know, `a f` will get us a function that fulfills the first part of this requirement. Then we can apply _that function_ the second number of times, which would give us what we want:
$$
{{< latex >}}
\lambda \ a \ . \ \lambda \ b \ . \ \lambda \ f \ . \ \lambda \ v \ . \ b \ (a \ f) \ v
$$
{{< /latex >}}
Neat!
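The same kind of check works for multiplication: taking both `a` and `b` to be the encoding of 2, `a f` is a function that applies `f` twice, and `b (a f)` applies that function twice again, so we're left with:
{{< latex >}}
\lambda \ f \ . \ \lambda \ v \ . \ f \ (f \ (f \ (f \ v)))
{{< /latex >}}
the encoding of 4.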