Add initial draft of typesafe interpreter post
All checks were successful
continuous-integration/drone/push Build is passing
parent
eac1151616
commit
9e399ebe3c
64
code/typesafeinterpreter/TypesafeIntr.idr
Normal file

@@ -0,0 +1,64 @@


data ExprType
  = IntType
  | BoolType
  | StringType




repr : ExprType -> Type


repr IntType = Int


repr BoolType = Bool


repr StringType = String




data Op
  = Add
  | Subtract
  | Multiply
  | Divide




data Expr
  = IntLit Int
  | BoolLit Bool
  | StringLit String
  | BinOp Op Expr Expr




data SafeExpr : ExprType -> Type where
    IntLiteral : Int -> SafeExpr IntType
    BoolLiteral : Bool -> SafeExpr BoolType
    StringLiteral : String -> SafeExpr StringType
    BinOperation : (repr a -> repr b -> repr c) -> SafeExpr a -> SafeExpr b -> SafeExpr c




typecheckOp : Op -> (a : ExprType) -> (b : ExprType) -> Either String (c : ExprType ** repr a -> repr b -> repr c)


typecheckOp Add IntType IntType = Right (IntType ** (+))


typecheckOp Subtract IntType IntType = Right (IntType ** (-))


typecheckOp Multiply IntType IntType = Right (IntType ** (*))


typecheckOp Divide IntType IntType = Right (IntType ** div)


typecheckOp _ _ _ = Left "Invalid binary operator application"




typecheck : Expr -> Either String (n : ExprType ** SafeExpr n)


typecheck (IntLit i) = Right (_ ** IntLiteral i)


typecheck (BoolLit b) = Right (_ ** BoolLiteral b)


typecheck (StringLit s) = Right (_ ** StringLiteral s)


typecheck (BinOp o l r) = do
    (lt ** le) <- typecheck l
    (rt ** re) <- typecheck r
    (ot ** f) <- typecheckOp o lt rt
    pure (_ ** BinOperation f le re)




eval : {t : ExprType} -> SafeExpr t -> repr t


eval (IntLiteral i) = i


eval (BoolLiteral b) = b


eval (StringLiteral s) = s


eval (BinOperation f l r) = f (eval l) (eval r)




resultStr : {t : ExprType} -> repr t -> String


resultStr {t=IntType} i = show i


resultStr {t=BoolType} b = show b


resultStr {t=StringType} s = show s




tryEval : Expr -> String
tryEval ex =
    case typecheck ex of
        Left err => "Type error: " ++ err
        Right (t ** e) => resultStr $ eval {t=t} e




main : IO ()


main = putStrLn $ tryEval $ BinOp Add (IntLit 6) (BinOp Multiply (IntLit 160) (IntLit 2))

151
content/blog/typesafe_interpreter.md
Normal file

@@ -0,0 +1,151 @@





---
title: Meaningfully Typechecking a Language in Idris
date: 2020-02-27T21:58:55-08:00
draft: true
tags: ["Haskell", "Idris"]
---







This term, I'm a TA for Oregon State University's Programming Languages course.


The students in the course are tasked with using Haskell to implement a programming


language of their own design. One of the things they can do to gain points for the


project is implement type checking, rejecting


{{< sidenote "right" "illtypednote" "ill-typed programs or expressions" >}}
Whether or not the below example is ill-typed actually depends on your language.
Many languages (even those with a static type system, like C++ or Crystal)
have a notion of "truthy" and "falsy" values. These values can be used
in the condition of an if-expression, and will be equivalent to "true" or "false",
respectively. However, for simplicity, I will avoid including
truthy and falsy values in the languages in this post. For the same reason, I will avoid
reasoning about
<a href="https://developer.mozilla.org/en-US/docs/Glossary/Type_coercion">type coercions</a>,
which make expressions like <code>"Hello"+3</code> valid.
{{< /sidenote >}} such as:




```Haskell


if "Hello" then 0 else 1


```




For instance, a student may have a function `typecheck`, with the following


signature (in Haskell):




```Haskell


typecheck :: Expr -> Either TypeError ExprType


```




The function will return an error if something goes wrong, or, if everything


goes well, the type of the given expression. So far, so good.
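As a rough illustration, a checker with this signature might look like the sketch below. The constructors of `Expr` and `ExprType`, and the choice of `String` for `TypeError`, are assumptions made for this example; a student's actual language will differ.

```Haskell
-- Hypothetical types for illustration only.
data ExprType = IntType | BoolType | StringType
  deriving (Eq, Show)

data Expr
  = IntLit Int
  | BoolLit Bool
  | StringLit String
  | IfElse Expr Expr Expr
  deriving Show

-- Assumption: errors are plain messages.
type TypeError = String

typecheck :: Expr -> Either TypeError ExprType
typecheck (IntLit _) = Right IntType
typecheck (BoolLit _) = Right BoolType
typecheck (StringLit _) = Right StringType
typecheck (IfElse c t e) = do
  ct <- typecheck c
  tt <- typecheck t
  et <- typecheck e
  if ct /= BoolType
    then Left "condition is not a boolean"
    else if tt /= et
      then Left "branches have different types"
      else Right tt
```

With these definitions, the example above, written as `IfElse (StringLit "Hello") (IntLit 0) (IntLit 1)`, is rejected because its condition is a string.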




A student asked, however:




> Now that I ran type checking on my program, surely I don't need to include errors


in my {{< sidenote "right" "valuationfunctionnote" "valuation function!" >}}


I'm using "valuation function" here in the context of


<a href="https://en.wikibooks.org/wiki/Haskell/Denotational_semantics">denotational semantics</a>.


In short, a


<a href="http://www.inf.ed.ac.uk/teaching/courses/inf2a/readings/semanticsnote.pdf">valuation function</a>


takes an expression and assigns to it some


representation of its meaning. For a language of arithmetic expressions, the
"meaning" of an expression is just a number (the result of simplifying the expression).
For a language of booleans, <code>and</code>, and <code>or</code>, the "meaning" is a boolean
for the same reason. Since an expression in the language can be ill-formed (like


<code>list(5)</code> in Python), the "meaning" (<em>semantic domain</em>) of a


complicated language tends to include the possibility of errors.


{{< /sidenote >}} I should be able to make my function be of type `Expr -> Val`, and not
`Expr -> Maybe Val`!




Unfortunately, this is not quite true. It is true that if the student's type checking


function is correct, then there will be no way for a type error to occur during


the evaluation of an expression "validated" by said function. The issue is, though,


that __the type system does not know about the expression's type-correctness__. Haskell


doesn't know that an expression has been type checked; worse, since the function's type


indicates that it accepts `Expr`, it must handle invalid expressions to avoid being [partial](https://wiki.haskell.org/Partial_functions). In short, even if we __know__ that the


expressions we give to a function are type safe, we have no way of enforcing this.
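To make the problem concrete, here is a minimal sketch of such an evaluator. The `Expr` and `Val` types are hypothetical stand-ins for whatever the student defined. Even if every expression passed to `eval` has already been checked, the last case cannot be dropped without making the function partial:

```Haskell
-- Hypothetical untyped syntax tree and value type.
data Expr
  = IntLit Int
  | BoolLit Bool
  | IfElse Expr Expr Expr

data Val = IntVal Int | BoolVal Bool
  deriving (Eq, Show)

eval :: Expr -> Maybe Val
eval (IntLit i) = Just (IntVal i)
eval (BoolLit b) = Just (BoolVal b)
eval (IfElse c t e) =
  case eval c of
    Just (BoolVal True) -> eval t
    Just (BoolVal False) -> eval e
    -- Nothing in the type `Expr` rules out a condition like `IntLit 1`,
    -- so the evaluator still has to account for it.
    _ -> Nothing
```

The `Maybe` in the result type exists only to cover inputs that a correct type checker would never let through, but the compiler has no way of knowing that.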




A potential solution offered in class was to separate the expressions into several


data types, `BoolExpr`, `ArithExpr`, and finally, a more general `Expr'` that can


be constructed from the first two. Operations such as `and` and `or`


will then only be applicable to boolean expressions:




```Haskell


data BoolExpr = BoolLit Bool | And BoolExpr BoolExpr | Or BoolExpr BoolExpr


```




It will be a type error to represent an expression such as `true or 5`. Then,


`Expr'` may have a constructor such as `IfElse` that only accepts a boolean


expression as the first argument:




```Haskell


data Expr' = IfElse BoolExpr Expr' Expr' | ...


```




All seems well. Now, it's impossible to have a non-boolean condition, and thus,


this error has been eliminated from the evaluator. Maybe we can even have


our type checking function translate an unsafe, potentially incorrect `Expr` into


a safer `Expr'`:




```Haskell


typecheck :: Expr -> Either TypeError (Expr', ExprType)


```




However, we typically also want the branches of an if expression to both have the same
type: `if x then 3 else False` may work sometimes, but not always, depending on the
value of `x`. How do we encode this? Do we have two constructors, `IfElseBool` and
`IfElseInt`, with one in `BoolExpr` and the other in `ArithExpr`? What if we add strings?


We'll be copying functionality back and forth, and our code will suffer. Wouldn't it be


nice if we could somehow tag our expressions with the type they produce? Instead of


`BoolExpr` and `ArithExpr`, we would be able to have `Expr BoolType` and `Expr IntType`,


which would share the `IfElse` constructor...




It's not easy to do this in canonical Haskell, but it can be done in Idris!




### Enter Dependent Types


Idris is a language with support for [dependent types](https://en.wikipedia.org/wiki/Dependent_type). Wikipedia gives the following definition for "dependent type":




> In computer science and logic, a dependent type is a type whose definition depends on a value.




This is exactly what we want. In Idris, we can define the possible set of types in our


language:




{{< codelines "Idris" "typesafeinterpreter/TypesafeIntr.idr" 1 4>}}




Then, we can define a `SafeExpr` type family, which is indexed by `ExprType`.


Here's the


{{< sidenote "right" "gadtnote" "code," >}}


I should probably note that the definition of <code>SafeExpr</code> is that of


a


<a href="https://en.wikipedia.org/wiki/Generalized_algebraic_data_type">Generalized Algebraic Data Type</a>,


or GADT for short. This is what allows each of our constructors to produce


values of a different type: <code>IntLiteral</code> builds <code>SafeExpr IntType</code>,


while <code>BoolLiteral</code> builds <code>SafeExpr BoolType</code>.


{{</ sidenote >}} which we will discuss below:




{{< codelines "Idris" "typesafeinterpreter/TypesafeIntr.idr" 23 27 >}}




The first line of the above snippet says, "`SafeExpr` is a type constructor


that requires a value of type `ExprType`". For example, we can have


`SafeExpr IntType`, or `SafeExpr BoolType`. Next, we have to define constructors


for `SafeExpr`. One such constructor is `IntLiteral`, which takes a value of


type `Int` (which represents the value of the integer literal), and builds


a value of `SafeExpr IntType`, that is, an expression that __we know evaluates


to an integer__.




The same is the case for `BoolLiteral` and `StringLiteral`, only they build


values of type `SafeExpr BoolType` and `SafeExpr StringType`, respectively.




The more complicated case is that of `BinOperation`. Put simply, it takes


a binary function of type `a->b->c` (kind of), two `SafeExpr`s producing `a` and `b`,


and combines the values of those expressions using the function to generate


a value of type `c`. Since the whole expression returns `c`, `BinOperation`


builds a value of type `SafeExpr c`.




That's almost it. Except, what's up with `repr`? We need it because `SafeExpr`


is parameterized by a __value__ of type `ExprType`. Thus, `a`, `b`, and `c` are


all values in the definition of `BinOperation`. However, in a function


`input->output`, both `input` and `output` have to be __types__, not values.


Thus, we define a function `repr` which converts values such as `IntType` into


the actual type that `eval` would yield when running our expression:




{{< codelines "Idris" "typesafeinterpreter/TypesafeIntr.idr" 6 9 >}}




The power of dependent types allows us to run `repr` inside the type


of `BinOperation` to compute the type of the function it must accept.
