Compare commits
2 Commits
9e399ebe3c
...
ae9805e4f2
Author | SHA1 | Date | |
---|---|---|---|
ae9805e4f2 | |||
33b1457e91 |
|
@ -43,7 +43,7 @@ typecheck (BinOp o l r) = do
|
|||
(ot ** f) <- typecheckOp o lt rt
|
||||
pure (_ ** BinOperation f le re)
|
||||
|
||||
eval : {t : ExprType} -> SafeExpr t -> repr t
|
||||
eval : SafeExpr t -> repr t
|
||||
eval (IntLiteral i) = i
|
||||
eval (BoolLiteral b) = b
|
||||
eval (StringLiteral s) = s
|
||||
|
@ -58,7 +58,7 @@ tryEval : Expr -> String
|
|||
tryEval ex =
|
||||
case typecheck ex of
|
||||
Left err => "Type error: " ++ err
|
||||
Right (t ** e) => resultStr $ eval {t=t} e
|
||||
Right (t ** e) => resultStr $ eval {t} e
|
||||
|
||||
main : IO ()
|
||||
main = putStrLn $ tryEval $ BinOp Add (IntLit 6) (BinOp Multiply (IntLit 160) (IntLit 2))
|
||||
|
|
|
@ -1,7 +1,6 @@
|
|||
---
|
||||
title: Meaningfully Typechecking a Language in Idris
|
||||
date: 2020-02-27T21:58:55-08:00
|
||||
draft: true
|
||||
tags: ["Haskell", "Idris"]
|
||||
---
|
||||
|
||||
|
@ -87,10 +86,10 @@ a more safe `Expr'`:
|
|||
typecheck :: Expr -> Either TypeError (Expr', ExprType)
|
||||
```
|
||||
|
||||
However, we typically also want the branches of an if expression to both have the same
|
||||
However, we typically also want the branches of an if-expression to both have the same
|
||||
type - `if x then 3 else False` may work sometimes, but not always, depending of the
|
||||
value of `x`. How do we encode this? Do we have two constructors, `IfElseBool` and
|
||||
`IfElseInt`, with one `BoolExpr` and the other in `ArithExpr`? What if we add strings?
|
||||
`IfElseInt`, with one in `BoolExpr` and the other in `ArithExpr`? What if we add strings?
|
||||
We'll be copying functionality back and forth, and our code will suffer. Wouldn't it be
|
||||
nice if we could somehow tag our expressions with the type they produce? Instead of
|
||||
`BoolExpr` and `ArithExpr`, we would be able to have `Expr BoolType` and `Expr IntType`,
|
||||
|
@ -149,3 +148,122 @@ the actual type that `eval` would yield when running our expression:
|
|||
|
||||
The power of dependent types allows us to run `repr` inside the type
|
||||
of `BinOp` to compute the type of the function it must accept.
|
||||
|
||||
Now, we have a way to represent expressions that are guaranteed to be type safe.
|
||||
With this, we can make our `typeheck` function convert an `Expr` to a `SafeExpr`.
|
||||
Wait a minute, though! We can't _just_ return `SafeExpr`: it's a type constructor!
|
||||
We need to somehow return `SafeExpr a`, where `a` is a value of type `ExprType`. But
|
||||
it doesn't make sense for the return type to have a new type variable that didn't
|
||||
occur in the rest of the type signature. It would be ideal if we could return both
|
||||
the type of the expression, and a `SafeExpr` of that type.
|
||||
|
||||
In fact, we can!
|
||||
|
||||
Idris has something called _dependent pairs_, which are like normal pairs, but
|
||||
in which the type of the second element depends on the value of the first element.
|
||||
The canonical example of this is a pair of (list length, list of that many elements).
|
||||
For instance, in Idris, we can write:
|
||||
|
||||
```Idris
|
||||
listPair : (n : Nat ** Vec n Int)
|
||||
```
|
||||
|
||||
In the above snippet, we declare the type for a pair of a natural number (`n : Nat`) and a list of
|
||||
integers (`Vect n Int`), where the number of elements in the list is equal to the natural number. Let's
|
||||
try applying this to our problem. We want to return an `ExprType`, and a `SafeExpr` which
|
||||
depends on that `ExprType`. How about this:
|
||||
|
||||
{{< codelines "Idris" "typesafe-interpreter/TypesafeIntr.idr" 36 36 >}}
|
||||
|
||||
Given an expression, we return either an error (`String`) or a dependent pair, which
|
||||
contains some `ExprType` `n` and a `SafeExpr` that evaluates to a value of type `n`.
|
||||
We can even start implementing this function, starting with literals:
|
||||
|
||||
{{< codelines "Idris" "typesafe-interpreter/TypesafeIntr.idr" 37 39 >}}
|
||||
|
||||
Note the use of `_`. Since `IntLiteral` always produces `SafeExpr IntType`, we allow
|
||||
Idris to infer that `IntType` must be the first element of the tuple. This is easy
|
||||
enough, because a boolean, integer, or string literal can never be type-incorrect.
|
||||
|
||||
The interesting case is that of a binary operation. Is `"hello" * 3` invalid? It might
|
||||
be, but some languages evaluate the multiplication of a string by a number as repeating
|
||||
the string that many times: `"hellohellohello"`. It is up to us, the language designers,
|
||||
to specify the set of valid operations. Furthermore, observe that `BinOperation`
|
||||
takes a _function_ as its first argument, not an `Op`. To guarantee that we can,
|
||||
in fact, evaluate a `BinOperation` to the promised type, we require that the means
|
||||
of performing the evaluation is included in the expression. Thus, when we convert `Expr`
|
||||
to `SafeExpr`, we need to convert an `Op` to a corresponding function. As we can see
|
||||
with `"hello"*3` and `163*2`, an `Op` can correspond to a different function
|
||||
depending on the types of its inputs. To deal with this, we define a new function `typecheckOp`,
|
||||
which takes an `Op` and two expression types, and returns either an error (if the `Op` can't
|
||||
be applied to those types) or a dependent pair containing the output type of the operation
|
||||
and a function that performs the required computation. That's a mouthful; let's look at the code:
|
||||
|
||||
{{< codelines "Idris" "typesafe-interpreter/TypesafeIntr.idr" 29 34 >}}
|
||||
|
||||
When `(+)` is applied to two integers, this is not an error, and the result is also
|
||||
an integer. To perform addition, we use Idris' built-in function `(+)`. The
|
||||
same is true for all other arithmetic operations in this example. In all other
|
||||
cases, we simply return an error. We can now use `typecheckOp` in our `typecheck`
|
||||
function:
|
||||
|
||||
{{< codelines "Idris" "typesafe-interpreter/TypesafeIntr.idr" 40 44 >}}
|
||||
|
||||
Here, we use do-notation to first type check first the left, then the right subexpression.
|
||||
Since the result of type checking the subexpressions gives us their output types,
|
||||
we can feed these types, together with `o`, to `typecheckOp` to determine the output
|
||||
type and the applicable evaluation function. Finally, we assemble the new `SafeExpr`
|
||||
from the function and the two translated subexpressions.
|
||||
|
||||
Alright, we've done all this work. Is it worth it? Let's try implementing `eval`:
|
||||
|
||||
{{< codelines "Idris" "typesafe-interpreter/TypesafeIntr.idr" 46 50 >}}
|
||||
|
||||
That's it! No `Maybe`, no error cases. `eval` is completely total, but doesn't require
|
||||
error handling because it __knows that the expression it is evaluating is type-correct__!
|
||||
|
||||
Let's run all of this. We'll need some code to print the result of evaluating an expression.
|
||||
Here's all that:
|
||||
|
||||
{{< codelines "Idris" "typesafe-interpreter/TypesafeIntr.idr" 52 64 >}}
|
||||
|
||||
And the output is:
|
||||
|
||||
```
|
||||
>>> idris TypesafeIntr.idr -o typesafe
|
||||
>>> ./typesafe
|
||||
326
|
||||
```
|
||||
|
||||
That's right! What about a type-incorrect example?
|
||||
|
||||
```Idris
|
||||
BinOp Add (IntLit 6) (BinOp Multiply (IntLit 160) (StringLit "hi"))
|
||||
```
|
||||
|
||||
The program reports:
|
||||
|
||||
```
|
||||
Type error: Invalid binary operator application
|
||||
```
|
||||
|
||||
Awesome!
|
||||
|
||||
### Wrapping Up
|
||||
In this post, we learned that type checking can be used to translate an expression into
|
||||
a more strongly-typed data type, which can be (more) safe to evaluate. To help
|
||||
strengthen the types of
|
||||
{{< sidenote "right" "if-else-note" "our expression language," >}}
|
||||
You may be thinking, "but where did the if-expressions go?". It turns out
|
||||
that making sure that the branches of an if-expression are of the same
|
||||
type is actually a fairly difficult task; the best way I found was
|
||||
enumerating all the possible "valid" combinations of types in a case-expression.
|
||||
Since this is obviously not the right solution, I decided to publish what I have,
|
||||
and look for an alternative. If I find a better solution, I will write a follow-up
|
||||
post.
|
||||
{{< /sidenote >}} we used the Idris language and
|
||||
its support for dependent types and Generalized Algebraic Data Types (GADTs).
|
||||
I hope this was interesting!
|
||||
|
||||
As usual, you can find the code for this post in this website's Git repository. The
|
||||
source file we went through today is found [here](https://dev.danilafe.com/Web-Projects/blog-static/src/branch/master/code/typesafe-interpreter/TypesafeIntr.idr).
|
||||
|
|
Loading…
Reference in New Issue
Block a user