Make the typesafe imperative language work properly.
This commit is contained in:
parent
f0fe481bcf
commit
52abe73ef7
|
@ -23,20 +23,20 @@ data Expr : RegState -> Ty -> Type where
|
|||
Not : Expr s BoolTy -> Expr s BoolTy
|
||||
|
||||
mutual
|
||||
data Stmt : RegState -> RegState -> Type where
|
||||
Store : (r : Reg) -> Expr s t -> Stmt s (setRegTy r t s)
|
||||
If : Expr s BoolTy -> Prog s n -> Prog s n -> Stmt s n
|
||||
Loop : Prog s s -> Stmt s s
|
||||
data Stmt : RegState -> RegState -> RegState -> Type where
|
||||
Store : (r : Reg) -> Expr s t -> Stmt l s (setRegTy r t s)
|
||||
If : Expr s BoolTy -> Prog l s n -> Prog l s n -> Stmt l s n
|
||||
Loop : Prog s s s -> Stmt l s s
|
||||
Break : Stmt s s s
|
||||
|
||||
data Prog : RegState -> RegState -> Type where
|
||||
Nil : Prog s s
|
||||
Break : Prog s s
|
||||
(::) : Stmt s n -> Prog n m -> Prog s m
|
||||
data Prog : RegState -> RegState -> RegState -> Type where
|
||||
Nil : Prog l s s
|
||||
(::) : Stmt l s n -> Prog l n m -> Prog l s m
|
||||
|
||||
initialState : RegState
|
||||
initialState = (IntTy, IntTy, IntTy)
|
||||
|
||||
testProg : Prog Main.initialState Main.initialState
|
||||
testProg : Prog Main.initialState Main.initialState Main.initialState
|
||||
testProg =
|
||||
[ Store A (Lit 1 `Leq` Lit 2)
|
||||
, If (Load A)
|
||||
|
@ -45,3 +45,55 @@ testProg =
|
|||
, Store B (Lit 2)
|
||||
, Store R (Add (Load A) (Load B))
|
||||
]
|
||||
|
||||
prodProg : Prog Main.initialState Main.initialState Main.initialState
|
||||
prodProg =
|
||||
[ Store A (Lit 7)
|
||||
, Store B (Lit 9)
|
||||
, Store R (Lit 0)
|
||||
, Loop
|
||||
[ If (Load A `Leq` Lit 0)
|
||||
[ Break ]
|
||||
[ Store R (Load R `Add` Load B)
|
||||
, Store A (Load A `Add` Lit (-1))
|
||||
]
|
||||
]
|
||||
]
|
||||
|
||||
repr : Ty -> Type
|
||||
repr IntTy = Int
|
||||
repr BoolTy = Bool
|
||||
|
||||
data State : RegState -> Type where
|
||||
MkState : (repr a, repr b, repr c) -> State (a, b, c)
|
||||
|
||||
getReg : (r : Reg) -> State s -> repr (getRegTy r s)
|
||||
getReg A (MkState (a, _, _)) = a
|
||||
getReg B (MkState (_, b, _)) = b
|
||||
getReg R (MkState (_, _, r)) = r
|
||||
|
||||
setReg : (r : Reg) -> repr t -> State s -> State (setRegTy r t s)
|
||||
setReg A a (MkState (_, b, r)) = MkState (a, b, r)
|
||||
setReg B b (MkState (a, _, r)) = MkState (a, b, r)
|
||||
setReg R r (MkState (a, b, _)) = MkState (a, b, r)
|
||||
|
||||
expr : Expr s t -> State s -> repr t
|
||||
expr (Lit i) _ = i
|
||||
expr (Load r) s = getReg r s
|
||||
expr (Add l r) s = expr l s + expr r s
|
||||
expr (Leq l r) s = expr l s <= expr r s
|
||||
expr (Not e) s = not $ expr e s
|
||||
|
||||
mutual
|
||||
stmt : Stmt l s n -> State s -> Either (State l) (State n)
|
||||
stmt (Store r e) s = Right $ setReg r (expr e s) s
|
||||
stmt (If c t e) s = if expr c s then prog t s else prog e s
|
||||
stmt (Loop p) s =
|
||||
case prog p s >>= stmt (Loop p) of
|
||||
Right s => Right s
|
||||
Left s => Right s
|
||||
stmt Break s = Left s
|
||||
|
||||
prog : Prog l s n -> State s -> Either (State l) (State n)
|
||||
prog Nil s = Right s
|
||||
prog (st::p) s = stmt st s >>= prog p
|
||||
|
|
|
@ -49,7 +49,7 @@ representation then, wouldn't it?
|
|||
I am pretty certain that a similar encoding in Haskell is possible. However,
|
||||
Haskell wasn't originally created for that kind of abuse of its type system,
|
||||
so it would probably not look very good.
|
||||
{{< /sidenote >}} I am pretty certain that it _is_ possible to do this
|
||||
{{< /sidenote >}} I am sure that it _is_ possible to do this
|
||||
in Idris, a dependently typed programming language. In this post I will
|
||||
talk about how to do that.
|
||||
|
||||
|
@ -109,12 +109,12 @@ types of each register. To do so, we add the state as a parameter to
|
|||
our `Expr` type. This would lead to types like the following:
|
||||
|
||||
```Idris
|
||||
-- An expression that produces a boolean
|
||||
-- when all the registers are integers.
|
||||
-- An expression that produces a boolean when all the registers
|
||||
-- are integers.
|
||||
Expr (IntTy, IntTy, IntTy) BoolTy
|
||||
|
||||
-- An expression that produces an integer
|
||||
-- when A and B are integers, and R is a boolean.
|
||||
-- An expression that produces an integer when A and B are integers,
|
||||
-- and R is a boolean.
|
||||
Expr (IntTy, IntTy, BoolTy) IntTy
|
||||
```
|
||||
|
||||
|
@ -129,54 +129,115 @@ creates an expression that has `r`'s type in the current state `s`.
|
|||
Statements are a bit different. Unlike expressions, they don't evaluate to
|
||||
anything; rather, they do something. That "something" may very well be changing
|
||||
the current state. We could, for instance, set `A` to be a boolean, while it was
|
||||
previously an integer. So, the `Stmt` type will take two arguments: the initial
|
||||
state and the final state. This leads to the following definition:
|
||||
previously an integer. This suggests equipping our `Stmt` type with two
|
||||
arguments: the initial state (before the statement's execution), and the final
|
||||
state (after the statement's execution). This would lead to types like this:
|
||||
|
||||
{{< codelines "Idris" "typesafe-imperative/TypesafeImp.idr" 26 29 >}}
|
||||
```Idris
|
||||
-- Statement that, when run while all registers contain integers,
|
||||
-- terminates with registers B and R having been assigned boolean values.
|
||||
Stmt (IntTy, IntTy, IntTy) (IntTy, BoolTy, BoolTy)
|
||||
```
|
||||
|
||||
However, there's a problem with `loop` and `break`. When we run a loop,
|
||||
we will require that the state at the end of one iteration is the
|
||||
same as the state at its beginning. Otherwise, it would be possible
|
||||
for a loop to keep changing the types of registers every iteration,
|
||||
and it would become impossible for us to infer the final state
|
||||
without actually running the program. In itself, this restriction
|
||||
isn't a problem; most static type systems require both branches
|
||||
of an `if/else` expression to be of the same type for a similar
|
||||
reason. The problem comes from the interaction with `break`.
|
||||
|
||||
By itself, the would-be type of `break` seems innocent enough. It
|
||||
doesn't change any registers, so we could call it `Stmt s s`.
|
||||
But consider the following program:
|
||||
|
||||
```
|
||||
A := 0
|
||||
B := 0
|
||||
R := 0
|
||||
do
|
||||
if 5 <= A then
|
||||
B := 1 <= 1
|
||||
break
|
||||
B := 0
|
||||
else
|
||||
A := A + 1
|
||||
end
|
||||
end
|
||||
```
|
||||
|
||||
The loop starts with all registers having integer values.
|
||||
As per our aforementioned loop requirement, the body
|
||||
of the loop must terminate with all registers _still_ having
|
||||
integer values. For the first five iterations that's exactly
|
||||
what will happen. However, after we increment `A` the fifth time,
|
||||
we will set `B` to a boolean value -- using a valid statement --
|
||||
and then `break`. The `break` statement will be accepted by
|
||||
the typechecker, and so will the whole `then` branch. After all,
|
||||
it seems as though we reset `B` back to an integer value.
|
||||
But that won't be the case. We will have jumped to the end
|
||||
of the loop, where we are expected to have an all-integer type,
|
||||
which we will not have.
|
||||
|
||||
The solution I came up with to address this issue was to
|
||||
add a _third_ argument to `Stmt`, which contains the "context"
|
||||
type. That is, it contains the type of the innermost loop surrounding
|
||||
the statement. A `break` statement would only be permissible
|
||||
if the current type matches the loop type. With this, we finally
|
||||
write down a definition of `Stmt`:
|
||||
|
||||
{{< codelines "Idris" "typesafe-imperative/TypesafeImp.idr" 26 30 >}}
|
||||
|
||||
The `Store` constructor takes a register `r` and an expression producing some type `t` in state `s`.
|
||||
From these, it creates a statement that starts in `s`, and finishes
|
||||
in a state similar to `s`, but with `r` now having type `t`.
|
||||
in a state similar to `s`, but with `r` now having type `t`. The loop
|
||||
type `l` remains unaffected and unused; we are free to assign any register
|
||||
any value.
|
||||
|
||||
The `If` constructor takes a condition, which starts in state `s` and _must_ produce
|
||||
The `If` constructor takes a condition `Expr`, which starts in state `s` and _must_ produce
|
||||
a boolean. It also takes two programs (sequences of statements), each of which
|
||||
start in `s` and finishes in another state `n`. Then, the `If` constructor creates
|
||||
starts in `s` and finishes in another state `n`. This results in
|
||||
a statement that starts in state `s`, and finishes in state `n`. Conceptually,
|
||||
each branch of the `if/else` statement must result in the same final state (in terms of types);
|
||||
otherwise, we wouldn't know which of the states to pick when deciding the final
|
||||
state of the `If` itself.
|
||||
state of the `If` itself. As with `Store`, the loop type `l` is untouched here.
|
||||
Individual statements are free to modify the state however they wish.
|
||||
|
||||
The `Loop` constructor is even more restrictive: it takes a single program (the sequence
|
||||
of instructions that it will be repeating). This program starts _and_ ends in state `s`;
|
||||
since the loop can repeat many times, and since we're repeating the same program,
|
||||
we want to make sure that program is always run from the same initial state.
|
||||
The `Loop` constructor is very restrictive. It takes a single program (the sequence
|
||||
of instructions that it will be repeating). As we discussed above, this program
|
||||
must start _and_ end in the same state `s`. Furthermore, this program's loop
|
||||
type must also be `s`, since the loop we're constructing will be surrounding the
|
||||
program. The resulting loop itself still has an arbitrary loop type `l`, since
|
||||
it doesn't surround itself.
|
||||
|
||||
I chose __not__ to encode `Break` as a statement. This is because we don't want `Break`s occurring
|
||||
in the middle of a program! Otherwise, it would be possible to write a program
|
||||
that _seems_ like it will terminate in one state, but, because of a break in the middle,
|
||||
terminates in another! Instead, we'll encode `Break` as a part of the `Prog` encoding.
|
||||
Finally, `Break` can only be constructed when the loop state matches the current
|
||||
state. Since we'll be jumping to the end of the innermost loop, the final state
|
||||
is also the same as the loop state.
|
||||
|
||||
These are all the constructors we'll be needing. It's time to move on to
|
||||
whole programs!
|
||||
|
||||
### Programs
|
||||
A program is basically a list of statements. However, we can't use a regular Idris list for two
|
||||
reasons:
|
||||
A program is simply a list of statements. However, we can't use a regular Idris list,
|
||||
because a regular list wouldn't be able to represent the relationship between
|
||||
each successive statement. In our program, we want the final state of one
|
||||
statement to be the initial state of the following one, since they'll
|
||||
be executed in sequence. To represent this, we have to define our own
|
||||
list-like GADT. The definition of the type turns out fairly straightforward:
|
||||
|
||||
1. Our type is not as simple as `[Stmt]`. We want each statement to begin in the state that the
|
||||
previous statement ended in; we will have to do some work to ensure that.
|
||||
2. We have two ways of ending the sequence of statements: either with or without a `break`.
|
||||
Thus, instead of having a single `Nil` constructor, we'll have two.
|
||||
|
||||
The definition of the type turns out fairly straightforward:
|
||||
|
||||
{{< codelines "Idris" "typesafe-imperative/TypesafeImp.idr" 31 34 >}}
|
||||
{{< codelines "Idris" "typesafe-imperative/TypesafeImp.idr" 32 34 >}}
|
||||
|
||||
The `Nil` constructor represents an empty program (much like the built-in `Nil` represents an empty list).
|
||||
Since no actions are done, it creates a `Prog` that starts and ends in the same state: `s`.
|
||||
The `Break` constructor is similar; however, it represents a `break` instruction, and thus,
|
||||
must be distinct from the regular `End` constructor. Finally, the `(::)` constructor, much like
|
||||
the built-in `(::)` constructor, takes a statement and another program. The statement begins in
|
||||
state `s` and ends in state `n`; the program after that statement must then start in state `n`,
|
||||
and end in some other state `m`. The combination of the statement and the program starts in state `s`,
|
||||
and finishes in state `m`; thus, `(::)` yields `Prog s m`.
|
||||
The `(::)` constructor, much like the built-in `(::)` constructor, takes a statement
|
||||
and another program. The statement begins in state `s` and ends in state `n`; the program after
|
||||
that statement must then start in state `n`, and end in some other state `m`.
|
||||
The combination of the statement and the program starts in state `s`, and finishes in state `m`.
|
||||
Thus, `(::)` yields `Prog s m`. None of the constructors affect the loop type `l`: we
|
||||
are free to sequence any statements that we want, and it is impossible for us
|
||||
to construct statements using `l` that cause runtime errors.
|
||||
|
||||
This should be all! Let's try out some programs.
|
||||
|
||||
|
@ -192,7 +253,7 @@ it stores another integer into `B`, and adds them into `R`. Even though
|
|||
was reset back to an integer after the `if/else`, and the program is accepted.
|
||||
On the other hand, had we forgotten to set `A` to a boolean first:
|
||||
|
||||
```
|
||||
```Idris
|
||||
[ If (Load A)
|
||||
[ Store A (Lit 1) ]
|
||||
[ Store A (Lit 2) ]
|
||||
|
@ -204,17 +265,45 @@ On the other hand, had we forgotten to set `A` to a boolean first:
|
|||
We would get a type error:
|
||||
|
||||
```
|
||||
Type mismatch between
|
||||
getRegTy A (IntTy, IntTy, IntTy)
|
||||
and
|
||||
BoolTy
|
||||
Type mismatch between getRegTy A (IntTy, IntTy, IntTy) and BoolTy
|
||||
```
|
||||
|
||||
The type of register `A` (that is, `IntTy`) is incompatible
|
||||
with `BoolTy`. Our `initialState` says that `A` starts out as
|
||||
an integer, so it can't be used in an `if/else` right away!
|
||||
Similar errors occur if we make one of the branches of
|
||||
the `if/else` empty, or if we set `B` to a boolean. And so,
|
||||
we have an encoding of our language that allows registers to
|
||||
the `if/else` empty, or if we set `B` to a boolean.
|
||||
|
||||
We can also encode the example program from the beginning
|
||||
of this post:
|
||||
|
||||
{{< codelines "Idris" "typesafe-imperative/TypesafeImp.idr" 49 61 >}}
|
||||
|
||||
This program compiles just fine, too! It is a little reminiscent of
|
||||
the program we used to demonstrate how `break` could break things
|
||||
if we weren't careful. So, let's go ahead and try `break` in an invalid
|
||||
state:
|
||||
|
||||
```Idris
|
||||
[ Store A (Lit 7)
|
||||
, Store B (Lit 9)
|
||||
, Store R (Lit 0)
|
||||
, Loop
|
||||
[ If (Load A `Leq` Lit 0)
|
||||
[ Store B (Lit 1 `Leq` Lit 1), Break, Store B (Lit 0) ]
|
||||
[ Store R (Load R `Add` Load B)
|
||||
, Store A (Load A `Add` Lit (-1))
|
||||
]
|
||||
]
|
||||
]
|
||||
```
|
||||
|
||||
Again, the type checker complains:
|
||||
|
||||
```
|
||||
Type mismatch between IntTy and BoolTy
|
||||
```
|
||||
|
||||
And so, we have an encoding of our language that allows registers to
|
||||
be either integers or booleans, while still preventing
|
||||
type-incorrect programs!
|
||||
|
|
Loading…
Reference in New Issue
Block a user