Update typesafe imperative language post draft.
This commit is contained in:
parent
52abe73ef7
commit
b459e9cbfe
|
@ -2,20 +2,20 @@ data Reg = A | B | R
|
||||||
|
|
||||||
data Ty = IntTy | BoolTy
|
data Ty = IntTy | BoolTy
|
||||||
|
|
||||||
RegState : Type
|
TypeState : Type
|
||||||
RegState = (Ty, Ty, Ty)
|
TypeState = (Ty, Ty, Ty)
|
||||||
|
|
||||||
getRegTy : Reg -> RegState -> Ty
|
getRegTy : Reg -> TypeState -> Ty
|
||||||
getRegTy A (a, _, _) = a
|
getRegTy A (a, _, _) = a
|
||||||
getRegTy B (_, b, _) = b
|
getRegTy B (_, b, _) = b
|
||||||
getRegTy R (_, _, r) = r
|
getRegTy R (_, _, r) = r
|
||||||
|
|
||||||
setRegTy : Reg -> Ty -> RegState -> RegState
|
setRegTy : Reg -> Ty -> TypeState -> TypeState
|
||||||
setRegTy A a (_, b, r) = (a, b, r)
|
setRegTy A a (_, b, r) = (a, b, r)
|
||||||
setRegTy B b (a, _, r) = (a, b, r)
|
setRegTy B b (a, _, r) = (a, b, r)
|
||||||
setRegTy R r (a, b, _) = (a, b, r)
|
setRegTy R r (a, b, _) = (a, b, r)
|
||||||
|
|
||||||
data Expr : RegState -> Ty -> Type where
|
data Expr : TypeState -> Ty -> Type where
|
||||||
Lit : Int -> Expr s IntTy
|
Lit : Int -> Expr s IntTy
|
||||||
Load : (r : Reg) -> Expr s (getRegTy r s)
|
Load : (r : Reg) -> Expr s (getRegTy r s)
|
||||||
Add : Expr s IntTy -> Expr s IntTy -> Expr s IntTy
|
Add : Expr s IntTy -> Expr s IntTy -> Expr s IntTy
|
||||||
|
@ -23,17 +23,17 @@ data Expr : RegState -> Ty -> Type where
|
||||||
Not : Expr s BoolTy -> Expr s BoolTy
|
Not : Expr s BoolTy -> Expr s BoolTy
|
||||||
|
|
||||||
mutual
|
mutual
|
||||||
data Stmt : RegState -> RegState -> RegState -> Type where
|
data Stmt : TypeState -> TypeState -> TypeState -> Type where
|
||||||
Store : (r : Reg) -> Expr s t -> Stmt l s (setRegTy r t s)
|
Store : (r : Reg) -> Expr s t -> Stmt l s (setRegTy r t s)
|
||||||
If : Expr s BoolTy -> Prog l s n -> Prog l s n -> Stmt l s n
|
If : Expr s BoolTy -> Prog l s n -> Prog l s n -> Stmt l s n
|
||||||
Loop : Prog s s s -> Stmt l s s
|
Loop : Prog s s s -> Stmt l s s
|
||||||
Break : Stmt s s s
|
Break : Stmt s s s
|
||||||
|
|
||||||
data Prog : RegState -> RegState -> RegState -> Type where
|
data Prog : TypeState -> TypeState -> TypeState -> Type where
|
||||||
Nil : Prog l s s
|
Nil : Prog l s s
|
||||||
(::) : Stmt l s n -> Prog l n m -> Prog l s m
|
(::) : Stmt l s n -> Prog l n m -> Prog l s m
|
||||||
|
|
||||||
initialState : RegState
|
initialState : TypeState
|
||||||
initialState = (IntTy, IntTy, IntTy)
|
initialState = (IntTy, IntTy, IntTy)
|
||||||
|
|
||||||
testProg : Prog Main.initialState Main.initialState Main.initialState
|
testProg : Prog Main.initialState Main.initialState Main.initialState
|
||||||
|
@ -64,7 +64,7 @@ repr : Ty -> Type
|
||||||
repr IntTy = Int
|
repr IntTy = Int
|
||||||
repr BoolTy = Bool
|
repr BoolTy = Bool
|
||||||
|
|
||||||
data State : RegState -> Type where
|
data State : TypeState -> Type where
|
||||||
MkState : (repr a, repr b, repr c) -> State (a, b, c)
|
MkState : (repr a, repr b, repr c) -> State (a, b, c)
|
||||||
|
|
||||||
getReg : (r : Reg) -> State s -> repr (getRegTy r s)
|
getReg : (r : Reg) -> State s -> repr (getRegTy r s)
|
||||||
|
@ -97,3 +97,6 @@ mutual
|
||||||
prog : Prog l s n -> State s -> Either (State l) (State n)
|
prog : Prog l s n -> State s -> Either (State l) (State n)
|
||||||
prog Nil s = Right s
|
prog Nil s = Right s
|
||||||
prog (st::p) s = stmt st s >>= prog p
|
prog (st::p) s = stmt st s >>= prog p
|
||||||
|
|
||||||
|
run : Prog l s l -> State s -> State l
|
||||||
|
run p s = either id id $ prog p s
|
||||||
|
|
|
@ -246,7 +246,7 @@ The following (type-correct) program compiles just fine:
|
||||||
|
|
||||||
{{< codelines "Idris" "typesafe-imperative/TypesafeImp.idr" 36 47 >}}
|
{{< codelines "Idris" "typesafe-imperative/TypesafeImp.idr" 36 47 >}}
|
||||||
|
|
||||||
First, it loads a boolean (`True`, to be exact) into register `A`; then,
|
First, it loads a boolean into register `A`; then,
|
||||||
inside the `if/else` statement, it stores an integer into `A`. Finally,
|
inside the `if/else` statement, it stores an integer into `A`. Finally,
|
||||||
it stores another integer into `B`, and adds them into `R`. Even though
|
it stores another integer into `B`, and adds them into `R`. Even though
|
||||||
`A` was a boolean at first, the type checker can deduce that it
|
`A` was a boolean at first, the type checker can deduce that it
|
||||||
|
@ -307,3 +307,173 @@ Type mismatch between IntTy and BoolTy
|
||||||
And so, we have an encoding of our language that allows registers to
|
And so, we have an encoding of our language that allows registers to
|
||||||
be either integers or booleans, while still preventing
|
be either integers or booleans, while still preventing
|
||||||
type-incorrect programs!
|
type-incorrect programs!
|
||||||
|
|
||||||
|
### Building an Interpreter
|
||||||
|
A good test of such an encoding is the implementation
|
||||||
|
of an interpreter. It should be possible to convince the
|
||||||
|
typechecker that our interpreter doesn't need to
|
||||||
|
handle type errors in the toy language, since they
|
||||||
|
cannot occur.
|
||||||
|
|
||||||
|
Let's start with something simple. First of all, suppose
|
||||||
|
we have an expression of type `Expr InTy`. In our toy
|
||||||
|
language, it produces an integer. Our interpreter, then,
|
||||||
|
will probably want to use Idris' type `Int`. Similarly,
|
||||||
|
an expression of type `Expr BoolTy` will produce a boolean
|
||||||
|
in our toy language, which in Idris we can represent using
|
||||||
|
the built-in `Bool` type. Despite the similar naming, though,
|
||||||
|
there's no connection between Idris' `Bool` and our own `BoolTy`.
|
||||||
|
We need to define a conversion from our own types -- which are
|
||||||
|
values of type `Ty` -- into Idris types that result from evaluating
|
||||||
|
expressions. We do so as follows:
|
||||||
|
|
||||||
|
{{< codelines "Idris" "typesafe-imperative/TypesafeImp.idr" 63 65 >}}
|
||||||
|
|
||||||
|
Similarly, we want to convert our `TypeState` (a tuple describing the _types_
|
||||||
|
of our registers) into a tuple that actually holds the values of each
|
||||||
|
register, which we will call `State`. The value of each register at
|
||||||
|
any point depends on its type. My first thought was to define
|
||||||
|
`State` as a function from `TypeState` to an Idris `Type`:
|
||||||
|
|
||||||
|
```Idris
|
||||||
|
State : TypeState -> Type
|
||||||
|
State (a, b, c) = (repr a, repr b, repr c)
|
||||||
|
```
|
||||||
|
|
||||||
|
Unfortunately, this doesn't quite cut it. The problem is that this
|
||||||
|
function technically doesn't give Idris any guarantees that `State`
|
||||||
|
will be a tuple. The most Idris knows is that `State` will be some
|
||||||
|
`Type`, which could be `Int`, `Bool`, or anything else! This
|
||||||
|
becomes a problem when we try to pattern match on states to get
|
||||||
|
the contents of a particular register. Instead, we have to define
|
||||||
|
a new data type:
|
||||||
|
|
||||||
|
{{< codelines "Idris" "typesafe-imperative/TypesafeImp.idr" 67 68 >}}
|
||||||
|
|
||||||
|
In this snippet, `State` is still a (type level) function from `TypeState` to `Type`.
|
||||||
|
However, by using a GADT, we guarantee that there's only one way to construct
|
||||||
|
a `State (a,b,c)`: using a corresponding tuple. Now, Idris will accept our
|
||||||
|
pattern matching:
|
||||||
|
|
||||||
|
{{< codelines "Idris" "typesafe-imperative/TypesafeImp.idr" 70 78 >}}
|
||||||
|
|
||||||
|
The `getReg` function will take out the value of the corresponding
|
||||||
|
register, returning `Int` or `Bool` depending on the `TypeState`.
|
||||||
|
What's important is that if the `TypeState` is known, then so
|
||||||
|
is the type of `getReg`: no `Either` is involved her, and we
|
||||||
|
can directly use the integer or boolean stored in the
|
||||||
|
register. This is exactly what we do:
|
||||||
|
|
||||||
|
{{< codelines "Idris" "typesafe-imperative/TypesafeImp.idr" 80 85 >}}
|
||||||
|
|
||||||
|
This is pretty concise. Idris knows that `Lit i` is of type `Expr IntTy`,
|
||||||
|
and it knows that `repr IntTy = Int`, so it also knows that
|
||||||
|
`eval (Lit i)` produces an `Int`. Similarly, we wrote
|
||||||
|
`Reg r` to have type `Expr s (getRegTy r s)`. Since `getReg`
|
||||||
|
returns `repr (getRegTy r s)`, things check out here, too.
|
||||||
|
A similar logic applies to the rest of the cases.
|
||||||
|
|
||||||
|
The situation with statements is somewhat different. As we said, a statement
|
||||||
|
doesn't return a value; it changes state. A good initial guess would
|
||||||
|
be that to evaluate a statement that starts in state `s` and terminates in state `n`,
|
||||||
|
we would take as input `State s` and return `State n`. However, things are not
|
||||||
|
quite as simple, thanks to `Break`. Not only does `Break` require
|
||||||
|
special case logic to return control to the end of the `Loop`, but
|
||||||
|
it also requires some additional consideration: in a statement
|
||||||
|
of type `Stmt l s n`, evaluating `Break` can return `State l`.
|
||||||
|
|
||||||
|
To implement this, we'll use the `Either` type. The `Left` constructor
|
||||||
|
will be contain the state at the time of evaluating a `Break`,
|
||||||
|
and will indicate to the interpreter that we're breaking out of a loop.
|
||||||
|
On the other hand, the `Right` constructor will contain the state
|
||||||
|
as produced by all other statements, which would be considered
|
||||||
|
{{< sidenote "right" "left-right-note" "the \"normal\" case." >}}
|
||||||
|
We use <code>Left</code> for the "abnormal" case because of
|
||||||
|
Idris' (and Haskell's) convention to use it as such. For
|
||||||
|
instance, the two languages define a <code>Monad</code>
|
||||||
|
instance for <code>Either a</code> where <code>(>>=)</code>
|
||||||
|
behaves very much like it does for <code>Maybe</code>, with
|
||||||
|
<code>Left</code> being treated as <code>Nothing</code>,
|
||||||
|
and <code>Right</code> as <code>Just</code>. We will
|
||||||
|
use this instance to clean up some of our computations.
|
||||||
|
{{< /sidenote >}} Note that this doesn't indicate an error:
|
||||||
|
we need to represent the two states (breaking out of a loop
|
||||||
|
and normal execution) to define our language's semantics.
|
||||||
|
|
||||||
|
{{< codelines "Idris" "typesafe-imperative/TypesafeImp.idr" 88 95 >}}
|
||||||
|
|
||||||
|
First, note the type. We return an `Either` value, which will
|
||||||
|
contain `State l` (in the `Left` constructor) if a `Break`
|
||||||
|
was evaluated, and `State n` (in the `Right` constructor)
|
||||||
|
if execution went on without breaking.
|
||||||
|
|
||||||
|
The `Store` case is rather simple. We use `setReg` to update the result
|
||||||
|
of the register `r` with the result of evaluating `e`. Because
|
||||||
|
a store doesn't cause us to start breaking out of a loop,
|
||||||
|
we use `Right` to wrap the new state.
|
||||||
|
|
||||||
|
The `If` case is also rather simple. Its condition is guaranteed
|
||||||
|
to evaluate to a boolean, so it's sufficient for us to use
|
||||||
|
Idris' `if` expression. We use the `prog` function here, which
|
||||||
|
implements the evaluation of a whole program. We'll get to it
|
||||||
|
momentarily.
|
||||||
|
|
||||||
|
`Loop` is the most interesting case. We start by evaluating
|
||||||
|
the program `p` serving as the loop body. The result of this
|
||||||
|
computation will be either a state after a break,
|
||||||
|
held in `Left`, or as the normal execution state, held
|
||||||
|
in `Right`. The `(>>=)` operation will do nothing in
|
||||||
|
the first case, and feed the resulting (normal) state
|
||||||
|
to `stmt (Loop p)` in the second case. This is exactly
|
||||||
|
what we want: if we broke out of the current iteration
|
||||||
|
of the loop, we shouldn't continue on to the next iteration.
|
||||||
|
At the end of evaluating both `p` and the recursive call to
|
||||||
|
`stmt`, we'll either have exited normally, or via breaking
|
||||||
|
out. We don't want to continue breaking out further,
|
||||||
|
so we return the final state wrapped in `Right` in both cases.
|
||||||
|
Finally, `Break` returns the current state wrapped in `Left`,
|
||||||
|
beginning the process of breaking out.
|
||||||
|
|
||||||
|
The task of `prog` is simply to sequence several statements
|
||||||
|
together. The monadic bind operator, `(>>=)`, is again perfect
|
||||||
|
for this task, since it "stops" when it hits a `Left`, but
|
||||||
|
continues otherwise. This is the implementation:
|
||||||
|
|
||||||
|
{{< codelines "Idris" "typesafe-imperative/TypesafeImp.idr" 97 99 >}}
|
||||||
|
|
||||||
|
Awesome! Let's try it out, shall we? I defined a quick `run` function
|
||||||
|
as follows:
|
||||||
|
|
||||||
|
{{< codelines "Idris" "typesafe-imperative/TypesafeImp.idr" 101 102 >}}
|
||||||
|
|
||||||
|
We then have:
|
||||||
|
|
||||||
|
```
|
||||||
|
*TypesafeImp> run prodProg (MkState (0,0,0))
|
||||||
|
MkState (0, 9, 63) : State (IntTy, IntTy, IntTy)
|
||||||
|
```
|
||||||
|
|
||||||
|
Which seems correct! The program multiplies seven by nine,
|
||||||
|
and stops when register `A` reaches zero. Our test program
|
||||||
|
runs, too:
|
||||||
|
|
||||||
|
```
|
||||||
|
*TypesafeImp> run testProg (MkState (0,0,0))
|
||||||
|
MkState (1, 2, 3) : State (IntTy, IntTy, IntTy)
|
||||||
|
```
|
||||||
|
|
||||||
|
This is the correct answer: `A` ends up being set to
|
||||||
|
`1` in the `then` branch of the conditional statement,
|
||||||
|
`B` is set to 2 right after, and `R`, the sum of `A`
|
||||||
|
and `B`, is rightly `3`.
|
||||||
|
|
||||||
|
As you can see, we didn't have to write any error handling
|
||||||
|
code! This is because the typechecker _knows_ that type errors
|
||||||
|
aren't possible: our programs are guaranteed to be
|
||||||
|
{{< sidenote "right" "termination-note" "type correct." >}}
|
||||||
|
Our programs <em>aren't</em> guaranteed to terminate:
|
||||||
|
we're lucky that Idris' totality checker is turned off by default.
|
||||||
|
{{< /sidenote >}}
|
||||||
|
|
||||||
|
This was a fun exercise, and I hope you enjoyed reading along!
|
||||||
|
I hope to see you in my future posts.
|
||||||
|
|
Loading…
Reference in New Issue
Block a user