Compare commits
No commits in common. "f0fe481bcf6c90aba07a9a01b00e9cfa4513a93c" and "e7edd43034ce95d6ef7f5b8d89a612d09a2ca7a4" have entirely different histories.
f0fe481bcf
...
e7edd43034
@ -1,47 +0,0 @@
|
|||||||
data Reg = A | B | R
|
|
||||||
|
|
||||||
data Ty = IntTy | BoolTy
|
|
||||||
|
|
||||||
RegState : Type
|
|
||||||
RegState = (Ty, Ty, Ty)
|
|
||||||
|
|
||||||
getRegTy : Reg -> RegState -> Ty
|
|
||||||
getRegTy A (a, _, _) = a
|
|
||||||
getRegTy B (_, b, _) = b
|
|
||||||
getRegTy R (_, _, r) = r
|
|
||||||
|
|
||||||
setRegTy : Reg -> Ty -> RegState -> RegState
|
|
||||||
setRegTy A a (_, b, r) = (a, b, r)
|
|
||||||
setRegTy B b (a, _, r) = (a, b, r)
|
|
||||||
setRegTy R r (a, b, _) = (a, b, r)
|
|
||||||
|
|
||||||
data Expr : RegState -> Ty -> Type where
|
|
||||||
Lit : Int -> Expr s IntTy
|
|
||||||
Load : (r : Reg) -> Expr s (getRegTy r s)
|
|
||||||
Add : Expr s IntTy -> Expr s IntTy -> Expr s IntTy
|
|
||||||
Leq : Expr s IntTy -> Expr s IntTy -> Expr s BoolTy
|
|
||||||
Not : Expr s BoolTy -> Expr s BoolTy
|
|
||||||
|
|
||||||
mutual
|
|
||||||
data Stmt : RegState -> RegState -> Type where
|
|
||||||
Store : (r : Reg) -> Expr s t -> Stmt s (setRegTy r t s)
|
|
||||||
If : Expr s BoolTy -> Prog s n -> Prog s n -> Stmt s n
|
|
||||||
Loop : Prog s s -> Stmt s s
|
|
||||||
|
|
||||||
data Prog : RegState -> RegState -> Type where
|
|
||||||
Nil : Prog s s
|
|
||||||
Break : Prog s s
|
|
||||||
(::) : Stmt s n -> Prog n m -> Prog s m
|
|
||||||
|
|
||||||
initialState : RegState
|
|
||||||
initialState = (IntTy, IntTy, IntTy)
|
|
||||||
|
|
||||||
testProg : Prog Main.initialState Main.initialState
|
|
||||||
testProg =
|
|
||||||
[ Store A (Lit 1 `Leq` Lit 2)
|
|
||||||
, If (Load A)
|
|
||||||
[ Store A (Lit 1) ]
|
|
||||||
[ Store A (Lit 2) ]
|
|
||||||
, Store B (Lit 2)
|
|
||||||
, Store R (Add (Load A) (Load B))
|
|
||||||
]
|
|
@ -1,220 +0,0 @@
|
|||||||
---
|
|
||||||
title: "A Typesafe Representation of an Imperative Language"
|
|
||||||
date: 2020-10-30T17:19:59-07:00
|
|
||||||
tags: ["Idris"]
|
|
||||||
draft: true
|
|
||||||
---
|
|
||||||
|
|
||||||
A recent homework assignment for my university's programming languages
|
|
||||||
course was to encode the abstract syntax for a small imperative language
|
|
||||||
into Haskell data types. The language consisted of very few constructs, and was very much a "toy".
|
|
||||||
On the expression side of things, it had three registers (`A`, `B`, and `R`),
|
|
||||||
numbers, addition, comparison using "less than", and logical negation. It also
|
|
||||||
included a statement for storing the result of an expression into
|
|
||||||
a register, `if/else`, and an infinite loop construct with an associated `break` operation.
|
|
||||||
A sample program in the language which computes the product of two
|
|
||||||
numbers is as follows:
|
|
||||||
|
|
||||||
```
|
|
||||||
A := 7
|
|
||||||
B := 9
|
|
||||||
R := 0
|
|
||||||
do
|
|
||||||
if A <= 0 then
|
|
||||||
break
|
|
||||||
else
|
|
||||||
R := R + B;
|
|
||||||
A := A + -1;
|
|
||||||
end
|
|
||||||
end
|
|
||||||
```
|
|
||||||
|
|
||||||
The homework notes that type errors may arise in the little imperative language.
|
|
||||||
We could, for instance, try to add a boolean to a number: `3 + (1 < 2)`. Alternatively,
|
|
||||||
we could try use a number in the condition of an `if/else` expression. A "naive"
|
|
||||||
encoding of the abstract syntax would allow for such errors.
|
|
||||||
|
|
||||||
However, assuming that registers could only store integers and not booleans, it is fairly easy to
|
|
||||||
separate the expression grammar into two nonterminals, yielding boolean
|
|
||||||
and integer expressions respectively. Since registers can only store integers,
|
|
||||||
the `(:=)` operation will always require an integer expression, and an `if/else`
|
|
||||||
statement will always require a boolean expression. A matching Haskell encoding
|
|
||||||
would not allow "invalid" programs to compile. That is, the programs would be
|
|
||||||
type-correct by construction.
|
|
||||||
|
|
||||||
Then, a question arose in the ensuing discussion: what if registers _could_
|
|
||||||
contain booleans? It would be impossible to create such a "correct-by-construction"
|
|
||||||
representation then, wouldn't it?
|
|
||||||
{{< sidenote "right" "haskell-note" "Although I don't know about Haskell," >}}
|
|
||||||
I am pretty certain that a similar encoding in Haskell is possible. However,
|
|
||||||
Haskell wasn't originally created for that kind of abuse of its type system,
|
|
||||||
so it would probably not look very good.
|
|
||||||
{{< /sidenote >}} I am pretty certain that it _is_ possible to do this
|
|
||||||
in Idris, a dependently typed programming language. In this post I will
|
|
||||||
talk about how to do that.
|
|
||||||
|
|
||||||
### Registers and Expressions
|
|
||||||
Let's start by encoding registers. Since we only have three registers, we
|
|
||||||
can encode them using a simple data type declaration, much the same as we
|
|
||||||
would in Haskell:
|
|
||||||
|
|
||||||
{{< codelines "Idris" "typesafe-imperative/TypesafeImp.idr" 1 1 >}}
|
|
||||||
|
|
||||||
Now that registers can store either integers or booleans (and only those two),
|
|
||||||
we need to know which one is which. For this purpose, we can declare another
|
|
||||||
data type:
|
|
||||||
|
|
||||||
{{< codelines "Idris" "typesafe-imperative/TypesafeImp.idr" 3 3 >}}
|
|
||||||
|
|
||||||
At any point in the (hypothetical) execution of our program, each
|
|
||||||
of the registers will have a type, either boolean or integer. The
|
|
||||||
combined state of the three registers would then be the combination
|
|
||||||
of these three states; we can represent this using a 3-tuple:
|
|
||||||
|
|
||||||
{{< codelines "Idris" "typesafe-imperative/TypesafeImp.idr" 5 6 >}}
|
|
||||||
|
|
||||||
Let's say that the first element of the tuple will be the type of the register
|
|
||||||
`A`, the second the type of `B`, and the third the type of `R`. Then,
|
|
||||||
we can define two helper functions, one for retrieving the type of
|
|
||||||
a register, and one for changing it:
|
|
||||||
|
|
||||||
{{< codelines "Idris" "typesafe-imperative/TypesafeImp.idr" 8 16 >}}
|
|
||||||
|
|
||||||
Now, it's time to talk about expressions. We know now that an expression
|
|
||||||
can evaluate to either a boolean or an integer value (because a register
|
|
||||||
can contain either of those types of values). Perhaps we can specify
|
|
||||||
the type that an expression evaluates to in the expression's own type:
|
|
||||||
`Expr IntTy` would evaluate to integers, and `Expr BoolTy` would evaluate
|
|
||||||
to booleans. Then, we could constructors as follows:
|
|
||||||
|
|
||||||
```Idris
|
|
||||||
Lit : Int -> Expr IntTy
|
|
||||||
Not : Expr BoolTy -> Expr BoolTy
|
|
||||||
```
|
|
||||||
|
|
||||||
Sounds good! But what about loading a register?
|
|
||||||
|
|
||||||
```Idris
|
|
||||||
Load : Reg -> Expr IntTy -- no; what if the register is a boolean?
|
|
||||||
Load : Reg -> Expr BoolTy -- no; what if the register is an integer?
|
|
||||||
Load : Reg -> Expr a -- no; a register access can't be either!
|
|
||||||
```
|
|
||||||
|
|
||||||
The type of an expression that loads a register depends on the current
|
|
||||||
state of the program! If we last stored an integer into a register,
|
|
||||||
then loading from that register would give us an integer. But if we
|
|
||||||
last stored a boolean into a register, then reading from it would
|
|
||||||
give us a boolean. Our expressions need to be aware of the current
|
|
||||||
types of each register. To do so, we add the state as a parameter to
|
|
||||||
our `Expr` type. This would lead to types like the following:
|
|
||||||
|
|
||||||
```Idris
|
|
||||||
-- An expression that produces a boolean
|
|
||||||
-- when all the registers are integers.
|
|
||||||
Expr (IntTy, IntTy, IntTy) BoolTy
|
|
||||||
|
|
||||||
-- An expression that produces an integer
|
|
||||||
-- when A and B are integers, and R is a boolean.
|
|
||||||
Expr (IntTy, IntTy, BoolTy) IntTy
|
|
||||||
```
|
|
||||||
|
|
||||||
In Idris, the whole definition becomes:
|
|
||||||
|
|
||||||
{{< codelines "Idris" "typesafe-imperative/TypesafeImp.idr" 18 23 >}}
|
|
||||||
|
|
||||||
The only "interesting" constructor is `Load`, which, given a register `r`,
|
|
||||||
creates an expression that has `r`'s type in the current state `s`.
|
|
||||||
|
|
||||||
### Statements
|
|
||||||
Statements are a bit different. Unlike expressions, they don't evaluate to
|
|
||||||
anything; rather, they do something. That "something" may very well be changing
|
|
||||||
the current state. We could, for instance, set `A` to be a boolean, while it was
|
|
||||||
previously an integer. So, the `Stmt` type will take two arguments: the initial
|
|
||||||
state and the final state. This leads to the following definition:
|
|
||||||
|
|
||||||
{{< codelines "Idris" "typesafe-imperative/TypesafeImp.idr" 26 29 >}}
|
|
||||||
|
|
||||||
The `Store` constructor takes a register `r` and an expression producing some type `t` in state `s`.
|
|
||||||
From these, it creates a statement that starts in `s`, and finishes
|
|
||||||
in a state similar to `s`, but with `r` now having type `t`.
|
|
||||||
|
|
||||||
The `If` constructor takes a condition, which starts in state `s` and _must_ produce
|
|
||||||
a boolean. It also takes two programs (sequences of statements), each of which
|
|
||||||
start in `s` and finishes in another state `n`. Then, the `If` constructor creates
|
|
||||||
a statement that starts in state `s`, and finishes in state `n`. Conceptually,
|
|
||||||
each branch of the `if/else` statement must result in the same final state (in terms of types);
|
|
||||||
otherwise, we wouldn't know which of the states to pick when deciding the final
|
|
||||||
state of the `If` itself.
|
|
||||||
|
|
||||||
The `Loop` constructor is even more restrictive: it takes a single program (the sequence
|
|
||||||
of instructions that it will be repeating). This program starts _and_ ends in state `s`;
|
|
||||||
since the loop can repeat many times, and since we're repeating the same program,
|
|
||||||
we want to make sure that program is always run from the same initial state.
|
|
||||||
|
|
||||||
I chose __not__ to encode `Break` as a statement. This is because we don't want `Break`s occurring
|
|
||||||
in the middle of a program! Otherwise, it would be possible to write a program
|
|
||||||
that _seems_ like it will terminate in one state, but, because of a break in the middle,
|
|
||||||
terminates in another! Instead, we'll encode `Break` as a part of the `Prog` encoding.
|
|
||||||
|
|
||||||
### Programs
|
|
||||||
A program is basically a list of statements. However, we can't use a regular Idris list for two
|
|
||||||
reasons:
|
|
||||||
|
|
||||||
1. Our type is not as simple as `[Stmt]`. We want each statement to begin in the state that the
|
|
||||||
previous statement ended in; we will have to do some work to ensure that.
|
|
||||||
2. We have two ways of ending the sequence of statements: either with or without a `break`.
|
|
||||||
Thus, instead of having a single `Nil` constructor, we'll have two.
|
|
||||||
|
|
||||||
The definition of the type turns out fairly straightforward:
|
|
||||||
|
|
||||||
{{< codelines "Idris" "typesafe-imperative/TypesafeImp.idr" 31 34 >}}
|
|
||||||
|
|
||||||
The `Nil` constructor represents an empty program (much like the built-in `Nil` represents an empty list).
|
|
||||||
Since no actions are done, it creates a `Prog` that starts and ends in the same state: `s`.
|
|
||||||
The `Break` constructor is similar; however, it represents a `break` instruction, and thus,
|
|
||||||
must be distinct from the regular `End` constructor. Finally, the `(::)` constructor, much like
|
|
||||||
the built-in `(::)` constructor, takes a statement and another program. The statement begins in
|
|
||||||
state `s` and ends in state `n`; the program after that statement must then start in state `n`,
|
|
||||||
and end in some other state `m`. The combination of the statement and the program starts in state `s`,
|
|
||||||
and finishes in state `m`; thus, `(::)` yields `Prog s m`.
|
|
||||||
|
|
||||||
This should be all! Let's try out some programs.
|
|
||||||
|
|
||||||
### Trying it Out
|
|
||||||
The following (type-correct) program compiles just fine:
|
|
||||||
|
|
||||||
{{< codelines "Idris" "typesafe-imperative/TypesafeImp.idr" 36 47 >}}
|
|
||||||
|
|
||||||
First, it loads a boolean (`True`, to be exact) into register `A`; then,
|
|
||||||
inside the `if/else` statement, it stores an integer into `A`. Finally,
|
|
||||||
it stores another integer into `B`, and adds them into `R`. Even though
|
|
||||||
`A` was a boolean at first, the type checker can deduce that it
|
|
||||||
was reset back to an integer after the `if/else`, and the program is accepted.
|
|
||||||
On the other hand, had we forgotten to set `A` to a boolean first:
|
|
||||||
|
|
||||||
```
|
|
||||||
[ If (Load A)
|
|
||||||
[ Store A (Lit 1) ]
|
|
||||||
[ Store A (Lit 2) ]
|
|
||||||
, Store B (Lit 2)
|
|
||||||
, Store R (Add (Load A) (Load B))
|
|
||||||
]
|
|
||||||
```
|
|
||||||
|
|
||||||
We would get a type error:
|
|
||||||
|
|
||||||
```
|
|
||||||
Type mismatch between
|
|
||||||
getRegTy A (IntTy, IntTy, IntTy)
|
|
||||||
and
|
|
||||||
BoolTy
|
|
||||||
```
|
|
||||||
|
|
||||||
The type of register `A` (that is, `IntTy`) is incompatible
|
|
||||||
with `BoolTy`. Our `initialState` says that `A` starts out as
|
|
||||||
an integer, so it can't be used in an `if/else` right away!
|
|
||||||
Similar errors occur if we make one of the branches of
|
|
||||||
the `if/else` empty, or if we set `B` to a boolean. And so,
|
|
||||||
we have an encoding of our language that allows registers to
|
|
||||||
be either integers or booleans, while still preventing
|
|
||||||
type-incorrect programs!
|
|
@ -71,10 +71,6 @@ pre code {
|
|||||||
.hl {
|
.hl {
|
||||||
display: block;
|
display: block;
|
||||||
background-color: #fffd99;
|
background-color: #fffd99;
|
||||||
|
|
||||||
.lnt::before {
|
|
||||||
content: "*";
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
Loading…
Reference in New Issue
Block a user