|
|
@ -246,7 +246,7 @@ The following (type-correct) program compiles just fine: |
|
|
|
|
|
|
|
{{< codelines "Idris" "typesafe-imperative/TypesafeImp.idr" 36 47 >}} |
|
|
|
|
|
|
|
First, it loads a boolean (`True`, to be exact) into register `A`; then, |
|
|
|
First, it loads a boolean into register `A`; then, |
|
|
|
inside the `if/else` statement, it stores an integer into `A`. Finally, |
|
|
|
it stores another integer into `B`, and adds them into `R`. Even though |
|
|
|
`A` was a boolean at first, the type checker can deduce that it |
|
|
@ -307,3 +307,173 @@ Type mismatch between IntTy and BoolTy |
|
|
|
And so, we have an encoding of our language that allows registers to |
|
|
|
be either integers or booleans, while still preventing |
|
|
|
type-incorrect programs! |
|
|
|
|
|
|
|
### Building an Interpreter |
|
|
|
A good test of such an encoding is the implementation |
|
|
|
of an interpreter. It should be possible to convince the |
|
|
|
typechecker that our interpreter doesn't need to |
|
|
|
handle type errors in the toy language, since they |
|
|
|
cannot occur. |
|
|
|
|
|
|
|
Let's start with something simple. First of all, suppose |
|
|
|
we have an expression of type `Expr InTy`. In our toy |
|
|
|
language, it produces an integer. Our interpreter, then, |
|
|
|
will probably want to use Idris' type `Int`. Similarly, |
|
|
|
an expression of type `Expr BoolTy` will produce a boolean |
|
|
|
in our toy language, which in Idris we can represent using |
|
|
|
the built-in `Bool` type. Despite the similar naming, though, |
|
|
|
there's no connection between Idris' `Bool` and our own `BoolTy`. |
|
|
|
We need to define a conversion from our own types -- which are |
|
|
|
values of type `Ty` -- into Idris types that result from evaluating |
|
|
|
expressions. We do so as follows: |
|
|
|
|
|
|
|
{{< codelines "Idris" "typesafe-imperative/TypesafeImp.idr" 63 65 >}} |
|
|
|
|
|
|
|
Similarly, we want to convert our `TypeState` (a tuple describing the _types_ |
|
|
|
of our registers) into a tuple that actually holds the values of each |
|
|
|
register, which we will call `State`. The value of each register at |
|
|
|
any point depends on its type. My first thought was to define |
|
|
|
`State` as a function from `TypeState` to an Idris `Type`: |
|
|
|
|
|
|
|
```Idris |
|
|
|
State : TypeState -> Type |
|
|
|
State (a, b, c) = (repr a, repr b, repr c) |
|
|
|
``` |
|
|
|
|
|
|
|
Unfortunately, this doesn't quite cut it. The problem is that this |
|
|
|
function technically doesn't give Idris any guarantees that `State` |
|
|
|
will be a tuple. The most Idris knows is that `State` will be some |
|
|
|
`Type`, which could be `Int`, `Bool`, or anything else! This |
|
|
|
becomes a problem when we try to pattern match on states to get |
|
|
|
the contents of a particular register. Instead, we have to define |
|
|
|
a new data type: |
|
|
|
|
|
|
|
{{< codelines "Idris" "typesafe-imperative/TypesafeImp.idr" 67 68 >}} |
|
|
|
|
|
|
|
In this snippet, `State` is still a (type level) function from `TypeState` to `Type`. |
|
|
|
However, by using a GADT, we guarantee that there's only one way to construct |
|
|
|
a `State (a,b,c)`: using a corresponding tuple. Now, Idris will accept our |
|
|
|
pattern matching: |
|
|
|
|
|
|
|
{{< codelines "Idris" "typesafe-imperative/TypesafeImp.idr" 70 78 >}} |
|
|
|
|
|
|
|
The `getReg` function will take out the value of the corresponding |
|
|
|
register, returning `Int` or `Bool` depending on the `TypeState`. |
|
|
|
What's important is that if the `TypeState` is known, then so |
|
|
|
is the type of `getReg`: no `Either` is involved her, and we |
|
|
|
can directly use the integer or boolean stored in the |
|
|
|
register. This is exactly what we do: |
|
|
|
|
|
|
|
{{< codelines "Idris" "typesafe-imperative/TypesafeImp.idr" 80 85 >}} |
|
|
|
|
|
|
|
This is pretty concise. Idris knows that `Lit i` is of type `Expr IntTy`, |
|
|
|
and it knows that `repr IntTy = Int`, so it also knows that |
|
|
|
`eval (Lit i)` produces an `Int`. Similarly, we wrote |
|
|
|
`Reg r` to have type `Expr s (getRegTy r s)`. Since `getReg` |
|
|
|
returns `repr (getRegTy r s)`, things check out here, too. |
|
|
|
A similar logic applies to the rest of the cases. |
|
|
|
|
|
|
|
The situation with statements is somewhat different. As we said, a statement |
|
|
|
doesn't return a value; it changes state. A good initial guess would |
|
|
|
be that to evaluate a statement that starts in state `s` and terminates in state `n`, |
|
|
|
we would take as input `State s` and return `State n`. However, things are not |
|
|
|
quite as simple, thanks to `Break`. Not only does `Break` require |
|
|
|
special case logic to return control to the end of the `Loop`, but |
|
|
|
it also requires some additional consideration: in a statement |
|
|
|
of type `Stmt l s n`, evaluating `Break` can return `State l`. |
|
|
|
|
|
|
|
To implement this, we'll use the `Either` type. The `Left` constructor |
|
|
|
will be contain the state at the time of evaluating a `Break`, |
|
|
|
and will indicate to the interpreter that we're breaking out of a loop. |
|
|
|
On the other hand, the `Right` constructor will contain the state |
|
|
|
as produced by all other statements, which would be considered |
|
|
|
{{< sidenote "right" "left-right-note" "the \"normal\" case." >}} |
|
|
|
We use <code>Left</code> for the "abnormal" case because of |
|
|
|
Idris' (and Haskell's) convention to use it as such. For |
|
|
|
instance, the two languages define a <code>Monad</code> |
|
|
|
instance for <code>Either a</code> where <code>(>>=)</code> |
|
|
|
behaves very much like it does for <code>Maybe</code>, with |
|
|
|
<code>Left</code> being treated as <code>Nothing</code>, |
|
|
|
and <code>Right</code> as <code>Just</code>. We will |
|
|
|
use this instance to clean up some of our computations. |
|
|
|
{{< /sidenote >}} Note that this doesn't indicate an error: |
|
|
|
we need to represent the two states (breaking out of a loop |
|
|
|
and normal execution) to define our language's semantics. |
|
|
|
|
|
|
|
{{< codelines "Idris" "typesafe-imperative/TypesafeImp.idr" 88 95 >}} |
|
|
|
|
|
|
|
First, note the type. We return an `Either` value, which will |
|
|
|
contain `State l` (in the `Left` constructor) if a `Break` |
|
|
|
was evaluated, and `State n` (in the `Right` constructor) |
|
|
|
if execution went on without breaking. |
|
|
|
|
|
|
|
The `Store` case is rather simple. We use `setReg` to update the result |
|
|
|
of the register `r` with the result of evaluating `e`. Because |
|
|
|
a store doesn't cause us to start breaking out of a loop, |
|
|
|
we use `Right` to wrap the new state. |
|
|
|
|
|
|
|
The `If` case is also rather simple. Its condition is guaranteed |
|
|
|
to evaluate to a boolean, so it's sufficient for us to use |
|
|
|
Idris' `if` expression. We use the `prog` function here, which |
|
|
|
implements the evaluation of a whole program. We'll get to it |
|
|
|
momentarily. |
|
|
|
|
|
|
|
`Loop` is the most interesting case. We start by evaluating |
|
|
|
the program `p` serving as the loop body. The result of this |
|
|
|
computation will be either a state after a break, |
|
|
|
held in `Left`, or as the normal execution state, held |
|
|
|
in `Right`. The `(>>=)` operation will do nothing in |
|
|
|
the first case, and feed the resulting (normal) state |
|
|
|
to `stmt (Loop p)` in the second case. This is exactly |
|
|
|
what we want: if we broke out of the current iteration |
|
|
|
of the loop, we shouldn't continue on to the next iteration. |
|
|
|
At the end of evaluating both `p` and the recursive call to |
|
|
|
`stmt`, we'll either have exited normally, or via breaking |
|
|
|
out. We don't want to continue breaking out further, |
|
|
|
so we return the final state wrapped in `Right` in both cases. |
|
|
|
Finally, `Break` returns the current state wrapped in `Left`, |
|
|
|
beginning the process of breaking out. |
|
|
|
|
|
|
|
The task of `prog` is simply to sequence several statements |
|
|
|
together. The monadic bind operator, `(>>=)`, is again perfect |
|
|
|
for this task, since it "stops" when it hits a `Left`, but |
|
|
|
continues otherwise. This is the implementation: |
|
|
|
|
|
|
|
{{< codelines "Idris" "typesafe-imperative/TypesafeImp.idr" 97 99 >}} |
|
|
|
|
|
|
|
Awesome! Let's try it out, shall we? I defined a quick `run` function |
|
|
|
as follows: |
|
|
|
|
|
|
|
{{< codelines "Idris" "typesafe-imperative/TypesafeImp.idr" 101 102 >}} |
|
|
|
|
|
|
|
We then have: |
|
|
|
|
|
|
|
``` |
|
|
|
*TypesafeImp> run prodProg (MkState (0,0,0)) |
|
|
|
MkState (0, 9, 63) : State (IntTy, IntTy, IntTy) |
|
|
|
``` |
|
|
|
|
|
|
|
Which seems correct! The program multiplies seven by nine, |
|
|
|
and stops when register `A` reaches zero. Our test program |
|
|
|
runs, too: |
|
|
|
|
|
|
|
``` |
|
|
|
*TypesafeImp> run testProg (MkState (0,0,0)) |
|
|
|
MkState (1, 2, 3) : State (IntTy, IntTy, IntTy) |
|
|
|
``` |
|
|
|
|
|
|
|
This is the correct answer: `A` ends up being set to |
|
|
|
`1` in the `then` branch of the conditional statement, |
|
|
|
`B` is set to 2 right after, and `R`, the sum of `A` |
|
|
|
and `B`, is rightly `3`. |
|
|
|
|
|
|
|
As you can see, we didn't have to write any error handling |
|
|
|
code! This is because the typechecker _knows_ that type errors |
|
|
|
aren't possible: our programs are guaranteed to be |
|
|
|
{{< sidenote "right" "termination-note" "type correct." >}} |
|
|
|
Our programs <em>aren't</em> guaranteed to terminate: |
|
|
|
we're lucky that Idris' totality checker is turned off by default. |
|
|
|
{{< /sidenote >}} |
|
|
|
|
|
|
|
This was a fun exercise, and I hope you enjoyed reading along! |
|
|
|
I hope to see you in my future posts. |
|
|
|