Add initial draft of typesafe interpreter post
All checks were successful
continuous-integration/drone/push Build is passing
All checks were successful
continuous-integration/drone/push Build is passing
This commit is contained in:
parent
eac1151616
commit
9e399ebe3c
64
code/typesafe-interpreter/TypesafeIntr.idr
Normal file
64
code/typesafe-interpreter/TypesafeIntr.idr
Normal file
|
@ -0,0 +1,64 @@
|
||||||
|
data ExprType
|
||||||
|
= IntType
|
||||||
|
| BoolType
|
||||||
|
| StringType
|
||||||
|
|
||||||
|
repr : ExprType -> Type
|
||||||
|
repr IntType = Int
|
||||||
|
repr BoolType = Bool
|
||||||
|
repr StringType = String
|
||||||
|
|
||||||
|
data Op
|
||||||
|
= Add
|
||||||
|
| Subtract
|
||||||
|
| Multiply
|
||||||
|
| Divide
|
||||||
|
|
||||||
|
data Expr
|
||||||
|
= IntLit Int
|
||||||
|
| BoolLit Bool
|
||||||
|
| StringLit String
|
||||||
|
| BinOp Op Expr Expr
|
||||||
|
|
||||||
|
data SafeExpr : ExprType -> Type where
|
||||||
|
IntLiteral : Int -> SafeExpr IntType
|
||||||
|
BoolLiteral : Bool -> SafeExpr BoolType
|
||||||
|
StringLiteral : String -> SafeExpr StringType
|
||||||
|
BinOperation : (repr a -> repr b -> repr c) -> SafeExpr a -> SafeExpr b -> SafeExpr c
|
||||||
|
|
||||||
|
typecheckOp : Op -> (a : ExprType) -> (b : ExprType) -> Either String (c : ExprType ** repr a -> repr b -> repr c)
|
||||||
|
typecheckOp Add IntType IntType = Right (IntType ** (+))
|
||||||
|
typecheckOp Subtract IntType IntType = Right (IntType ** (-))
|
||||||
|
typecheckOp Multiply IntType IntType = Right (IntType ** (*))
|
||||||
|
typecheckOp Divide IntType IntType = Right (IntType ** div)
|
||||||
|
typecheckOp _ _ _ = Left "Invalid binary operator application"
|
||||||
|
|
||||||
|
typecheck : Expr -> Either String (n : ExprType ** SafeExpr n)
|
||||||
|
typecheck (IntLit i) = Right (_ ** IntLiteral i)
|
||||||
|
typecheck (BoolLit b) = Right (_ ** BoolLiteral b)
|
||||||
|
typecheck (StringLit s) = Right (_ ** StringLiteral s)
|
||||||
|
typecheck (BinOp o l r) = do
|
||||||
|
(lt ** le) <- typecheck l
|
||||||
|
(rt ** re) <- typecheck r
|
||||||
|
(ot ** f) <- typecheckOp o lt rt
|
||||||
|
pure (_ ** BinOperation f le re)
|
||||||
|
|
||||||
|
eval : {t : ExprType} -> SafeExpr t -> repr t
|
||||||
|
eval (IntLiteral i) = i
|
||||||
|
eval (BoolLiteral b) = b
|
||||||
|
eval (StringLiteral s) = s
|
||||||
|
eval (BinOperation f l r) = f (eval l) (eval r)
|
||||||
|
|
||||||
|
resultStr : {t : ExprType} -> repr t -> String
|
||||||
|
resultStr {t=IntType} i = show i
|
||||||
|
resultStr {t=BoolType} b = show b
|
||||||
|
resultStr {t=StringType} s = show s
|
||||||
|
|
||||||
|
tryEval : Expr -> String
|
||||||
|
tryEval ex =
|
||||||
|
case typecheck ex of
|
||||||
|
Left err => "Type error: " ++ err
|
||||||
|
Right (t ** e) => resultStr $ eval {t=t} e
|
||||||
|
|
||||||
|
main : IO ()
|
||||||
|
main = putStrLn $ tryEval $ BinOp Add (IntLit 6) (BinOp Multiply (IntLit 160) (IntLit 2))
|
151
content/blog/typesafe_interpreter.md
Normal file
151
content/blog/typesafe_interpreter.md
Normal file
|
@ -0,0 +1,151 @@
|
||||||
|
---
|
||||||
|
title: Meaningfully Typechecking a Language in Idris
|
||||||
|
date: 2020-02-27T21:58:55-08:00
|
||||||
|
draft: true
|
||||||
|
tags: ["Haskell", "Idris"]
|
||||||
|
---
|
||||||
|
|
||||||
|
This term, I'm a TA for Oregon State University's Programming Languages course.
|
||||||
|
The students in the course are tasked with using Haskell to implement a programming
|
||||||
|
language of their own design. One of the things they can do to gain points for the
|
||||||
|
project is implement type checking, rejecting
|
||||||
|
{{< sidenote "right" "ill-typed-note" "ill-typed programs or expressions" >}}
|
||||||
|
Whether or not the below example is ill-typed actually depends on your language.
|
||||||
|
Many languages (even those with a static type system, like C++ or Crystal)
|
||||||
|
have a notion of "truthy" and "falsy" values. These values can be used
|
||||||
|
in the condition of an if-expression, and will be equivalent to "true" or "false",
|
||||||
|
respectively. However, for simplicity, I will avoid including
|
||||||
|
truthy and falsy values into the languages in this post. For the same reason, I will avoid
|
||||||
|
reasoning about
|
||||||
|
<a href="https://developer.mozilla.org/en-US/docs/Glossary/Type_coercion">type coercions</a>,
|
||||||
|
which make expressions like <code>"Hello"+3</code> valid.
|
||||||
|
{{< /sidenote >}} such as:
|
||||||
|
|
||||||
|
```Haskell
|
||||||
|
if "Hello" then 0 else 1
|
||||||
|
```
|
||||||
|
|
||||||
|
For instance, a student may have a function `typecheck`, with the following
|
||||||
|
signature (in Haskell):
|
||||||
|
|
||||||
|
```Haskell
|
||||||
|
typecheck :: Expr -> Either TypeError ExprType
|
||||||
|
```
|
||||||
|
|
||||||
|
The function will return an error if something goes wrong, or, if everything
|
||||||
|
goes well, the type of the given expression. So far, so good.
|
||||||
|
|
||||||
|
A student asked, however:
|
||||||
|
|
||||||
|
> Now that I ran type checking on my program, surely I don't need to include errors
|
||||||
|
in my {{< sidenote "right" "valuation-function-note" "valuation function!" >}}
|
||||||
|
I'm using "valuation function" here in the context of
|
||||||
|
<a href="https://en.wikibooks.org/wiki/Haskell/Denotational_semantics">denotational semantics</a>.
|
||||||
|
In short, a
|
||||||
|
<a href="http://www.inf.ed.ac.uk/teaching/courses/inf2a/readings/semantics-note.pdf">valuation function</a>
|
||||||
|
takes an expression and assigns to it some
|
||||||
|
representation of its meaning. For a language of arithmetic expression, the
|
||||||
|
"meaning" of an expression is just a number (the result of simplifying the expression).
|
||||||
|
For a language of booleans, <code>and</code>, and <code>or</code>, the "meaning" is a boolean
|
||||||
|
for the same reason. Since an expression in the language can be ill-formed (like
|
||||||
|
<code>list(5)</code> in Python), the "meaning" (<em>semantic domain</em>) of a
|
||||||
|
complicated language tends to include the possibility of errors.
|
||||||
|
{{< /sidenote >}} I should be able to make my function be of type `Expr -> Val`, and not
|
||||||
|
`Expr -> Maybe Val`!
|
||||||
|
|
||||||
|
Unfortunately, this is not quite true. It is true that if the student's type checking
|
||||||
|
function is correct, then there will be no way for a type error to occur during
|
||||||
|
the evaluation of an expression "validated" by said function. The issue is, though,
|
||||||
|
that __the type system does not know about the expression's type-correctness__. Haskell
|
||||||
|
doesn't know that an expression has been type checked; worse, since the function's type
|
||||||
|
indicates that it accepts `Expr`, it must handle invalid expressions to avoid being [partial](https://wiki.haskell.org/Partial_functions). In short, even if we __know__ that the
|
||||||
|
expressions we give to a function are type safe, we have no way of enforcing this.
|
||||||
|
|
||||||
|
A potential solution offered in class was to separate the expressions into several
|
||||||
|
data types, `BoolExpr`, `ArithExpr`, and finally, a more general `Expr'` that can
|
||||||
|
be constructed from the first two. Operations such as `and` and `or`
|
||||||
|
will then only be applicable to boolean expressions:
|
||||||
|
|
||||||
|
```Haskell
|
||||||
|
data BoolExpr = BoolLit Bool | And BoolExpr BoolExpr | Or BoolExpr BoolExpr
|
||||||
|
```
|
||||||
|
|
||||||
|
It will be a type error to represent an expression such as `true or 5`. Then,
|
||||||
|
`Expr'` may have a constructor such as `IfElse` that only accepts a boolean
|
||||||
|
expression as the first argument:
|
||||||
|
|
||||||
|
```Haskell
|
||||||
|
data Expr' = IfElse BoolExpr Expr' Expr' | ...
|
||||||
|
```
|
||||||
|
|
||||||
|
All seems well. Now, it's impossible to have a non-boolean condition, and thus,
|
||||||
|
this error has been eliminated from the evaluator. Maybe we can even have
|
||||||
|
our type checking function translate an unsafe, potentially incorrect `Expr` into
|
||||||
|
a more safe `Expr'`:
|
||||||
|
|
||||||
|
```Haskell
|
||||||
|
typecheck :: Expr -> Either TypeError (Expr', ExprType)
|
||||||
|
```
|
||||||
|
|
||||||
|
However, we typically also want the branches of an if expression to both have the same
|
||||||
|
type - `if x then 3 else False` may work sometimes, but not always, depending of the
|
||||||
|
value of `x`. How do we encode this? Do we have two constructors, `IfElseBool` and
|
||||||
|
`IfElseInt`, with one `BoolExpr` and the other in `ArithExpr`? What if we add strings?
|
||||||
|
We'll be copying functionality back and forth, and our code will suffer. Wouldn't it be
|
||||||
|
nice if we could somehow tag our expressions with the type they produce? Instead of
|
||||||
|
`BoolExpr` and `ArithExpr`, we would be able to have `Expr BoolType` and `Expr IntType`,
|
||||||
|
which would share the `IfElse` constructor...
|
||||||
|
|
||||||
|
It's not easy to do this in canonical Haskell, but it can be done in Idris!
|
||||||
|
|
||||||
|
### Enter Dependent Types
|
||||||
|
Idris is a language with support for [dependent types](https://en.wikipedia.org/wiki/Dependent_type). Wikipedia gives the following definition for "dependent type":
|
||||||
|
|
||||||
|
> In computer science and logic, a dependent type is a type whose definition depends on a value.
|
||||||
|
|
||||||
|
This is exactly what we want. In Idris, we can define the possible set of types in our
|
||||||
|
language:
|
||||||
|
|
||||||
|
{{< codelines "Idris" "typesafe-interpreter/TypesafeIntr.idr" 1 4>}}
|
||||||
|
|
||||||
|
Then, we can define a `SafeExpr` type family, which is indexed by `ExprType`.
|
||||||
|
Here's the
|
||||||
|
{{< sidenote "right" "gadt-note" "code," >}}
|
||||||
|
I should probably note that the definition of <code>SafeExpr</code> is that of
|
||||||
|
a
|
||||||
|
<a href="https://en.wikipedia.org/wiki/Generalized_algebraic_data_type">Generalized Algebraic Data Type</a>,
|
||||||
|
or GADT for short. This is what allows each of our constructors to produce
|
||||||
|
values of a different type: <code>IntLiteral</code> builds <code>SafeExpr IntType</code>,
|
||||||
|
while <code>BoolLiteral</code> builds <code>SafeExpr BoolType</code>.
|
||||||
|
{{</ sidenote >}} which we will discuss below:
|
||||||
|
|
||||||
|
{{< codelines "Idris" "typesafe-interpreter/TypesafeIntr.idr" 23 27 >}}
|
||||||
|
|
||||||
|
The first line of the above snippet says, "`SafeExpr` is a type constructor
|
||||||
|
that requires a value of type `ExprType`". For example, we can have
|
||||||
|
`SafeExpr IntType`, or `SafeExpr BoolType`. Next, we have to define constructors
|
||||||
|
for `SafeExpr`. One such constructor is `IntLiteral`, which takes a value of
|
||||||
|
type `Int` (which represents the value of the integer literal), and builds
|
||||||
|
a value of `SafeExpr IntType`, that is, an expression that __we know evaluates
|
||||||
|
to an integer__.
|
||||||
|
|
||||||
|
The same is the case for `BoolLiteral` and `StringLiteral`, only they build
|
||||||
|
values of type `SafeExpr BoolType` and `SafeExpr StringType`, respectively.
|
||||||
|
|
||||||
|
The more complicated case is that of `BinOperation`. Put simply, it takes
|
||||||
|
a binary function of type `a->b->c` (kind of), two `SafeExpr`s producing `a` and `b`,
|
||||||
|
and combines the values of those expressions using the function to generate
|
||||||
|
a value of type `c`. Since the whole expression returns `c`, `BinOperation`
|
||||||
|
builds a value of type `SafeExpr c`.
|
||||||
|
|
||||||
|
That's almost it. Except, what's up with `repr`? We need it because `SafeExpr`
|
||||||
|
is parameterized by a __value__ of type `ExprType`. Thus, `a`, `b`, and `c` are
|
||||||
|
all values in the definition of `BinOperation`. However, in a function
|
||||||
|
`input->output`, both `input` and `output` have to be __types__, not values.
|
||||||
|
Thus, we define a function `repr` which converts values such as `IntType` into
|
||||||
|
the actual type that `eval` would yield when running our expression:
|
||||||
|
|
||||||
|
{{< codelines "Idris" "typesafe-interpreter/TypesafeIntr.idr" 6 9 >}}
|
||||||
|
|
||||||
|
The power of dependent types allows us to run `repr` inside the type
|
||||||
|
of `BinOp` to compute the type of the function it must accept.
|
Loading…
Reference in New Issue
Block a user