blog-static/typesafe_interpreter_revisited.md at master

16 KiB

Raw Permalink Blame History

title

date

What Stumped Me: Equality

Typechecking if-expressions is where things get interesting. First, we want to require that the condition of the expression evaluates to a boolean. For this, we can write a function requireBool, that takes a dependent pair produced by typecheck. This function does one of two things:

If the dependent pair contains a BoolType, and therefore also an expression of type SafeExpr BoolType, requireBool succeeds, and returns the expression.
If the dependent pair contains any type other than BoolType, requireBool fails with an error message. Since we're using Either for error handling, this amounts to using the Left constructor.

Such a function is quite easy to write:

{{< codelines "Idris" "typesafe-interpreter/TypesafeIntrV2.idr" 58 60 >}}

We can then write all of the recursive calls to typecheck as follows:

{{< codelines "Idris" "typesafe-interpreter/TypesafeIntrV2.idr" 71 75 >}}

Alright, so we have the types of the t and e branches. All we have to do now is use (==). We could implement (==) as follows:

implementation Eq ExprType where
  IntType == IntType = True
  BoolType == BoolType = True
  StringType == StringType = True
  _ == _ = False

Now we're golden, right? We can just write the following:



76
77
78


    if tt == et
       then pure (_ ** IfThenElse ce te ee)
       else Left "Incompatible branch types."

No, this is not quire right. Idris complains:

Type mismatch between et and tt

Huh? But we just saw that et == tt! What's the problem? The problem is, in fact, that (==) is meaningless as far as the Idris typechecker is concerned. We could have just as well written,

implementation Eq ExprType where
  _ == _ = True

This would tell us that IntType == BoolType. But of course, SafeExpr IntType is not the same as SafeExpr BoolType; I would be very worried if the typechecker allowed me to assert otherwise. There is, however, a kind of equality that we can use to convince the Idris typechecker that two types are the same. This equality, too, is a type.

Curry-Howard Correspondence

Spend enough time learning about Programming Language Theory, and you will hear the term Curry Howard Correspondence. If you're the paper kind of person, I suggest reading Philip Wadler's Propositions as Types paper. Alternatively, you can take a look at Logical Foundations' Proof Objects chapter. I will give a very brief explanation here, too, for the sake of completeness. The general gist is as follows: propositions (the logical kind) correspond to program types, and proofs of the propositions correspond to values of the types.

To get settled into this idea, let's look at a few 'well-known' examples:

(A,B), the tuple of two types A and B is equivalent to the proposition \(A \land B\), which means \(A\) and \(B\). Intuitively, to provide a proof of \(A \land B\), we have to provide the proofs of \(A\) and \(B\).
Either A B, which contains one of A or B, is equivalent to the proposition \(A \lor B\), which means \(A\) or \(B\). Intuitively, to provide a proof that either \(A\) or \(B\) is true, we need to provide one of them.
A -> B, the type of a function from A to B, is equivalent to the proposition \(A \rightarrow B\), which reads \(A\) implies \(B\). We can think of a function A -> B as creating a proof of B given a proof of A.

Now, consider Idris' unit type ():

data () = ()

This type takes no arguments, and there's only one way to construct it. We can create a value of type () at any time, by just writing (). This type is equivalent to \(\text{true}\): only one proof of it exists, and it requires no premises. It just is.

Consider also the type Void, which too is present in Idris:

-- Note: this is probably not valid code.
data Void = -- Nothing

The type Void has no constructors: it's impossible to create a value of this type, and therefore, it's impossible to create a proof of Void. Thus, as you may have guessed, Void is equivalent to \(\text{false}\).

Finally, we get to a more complicated example:

data (=) : a -> b -> Type where
   Refl : x = x

This defines a = b as a type, equivalent to the proposition that a is equal to b. The only way to construct such a type is to give it a single value x, creating the proof that x = x. This makes sense: equality is reflexive.

This definition isn't some loosey-goosey boolean-based equality! If we can construct a value of type a = b, we can prove to Idris' typechecker that a and b are equivalent. In fact, Idris' standard library gives us the following function:

replace : {a:_} -> {x:_} -> {y:_} -> {P : a -> Type} -> x = y -> P x -> P y

This reads, given a type a, and values x and y of type a, if we know that x = y, then we can rewrite any proposition in terms of x into another, also valid proposition in terms of y. Let's make this concrete. Suppose a is Int, and P (the type of which is now Int -> Type), is Even, a proposition that claims that its argument is even. {{< sidenote "right" "specialize-note" "Then, we have:" >}} I'm only writing type signatures for replace' to avoid overloading. There's no need to define a new function; replace' is just a specialization of replace, so we can use the former anywhere we can use the latter. {{< /sidenote >}}

replace' : {x : Int} -> {y : Int} -> x = y -> Even x -> Even y

That is, if we know that x is equal to y, and we know that x is even, it follows that y is even too. After all, they're one and the same! We can take this further. Recall:

{{< codelines "Idris" "typesafe-interpreter/TypesafeIntrV2.idr" 44 44 >}}

We can therefore write:

replace'' : {x : ExprType} -> {y : ExprType} -> x = y -> SafeExpr x -> SafeExpr y

This is exactly what we want! Given a proof that one ExprType, x, is equal to another ExprType, y, we can safely convert SafeExpr x to SafeExpr y. We will use this to convince the Idris typechecker to accept our program.

First Attempt: `Eq` implies Equality

It's pretty trivial to see that we did define (==) correctly (IntType is equal to IntType, StringType is equal to StringType, and so on). Thus, if we know that x == y is True, it should follow that x = y. We can thus define the following proposition:

eqCorrect : {a : ExprType} -> {b : ExprType} -> (a == b = True) -> a = b

We will see shortly why this is not the best solution, and thus, I won't bother creating a proof / implementation for this proposition / function. It reads:

If we have a proof that (==) returned true for some ExprTypes a and b, it must be that a is the same as b.

We can then define a function to cast a SafeExpr a to SafeExpr b, given that (==) returned True for some a and b:

safeCast : {a : ExprType} -> {b : ExprType} -> (a == b = True) -> SafeExpr a -> SafeExpr b
safeCast h e = replace (eqCorrect h) e

Awesome! All that's left now is to call safeCast from our typecheck function:



76
77
78


    if tt == et
       then pure (_ ** IfThenElse ce te (safeCast ?uhOh ee))
       else Left "Incompatible branch types."

No, this doesn't work after all. What do we put for ?uhOh? We need to have a value of type tt == et = True, but we don't have one. Idris' own if-then-else expressions do not provide us with such proofs about their conditions. The awesome people at #idris pointed out that the with clause can provide such a proof. We could therefore write:

createIfThenElse ce (tt ** et) (et ** ee) with (et == tt) proof p
  | True = pure (tt ** IfThenElse ce te (safeCast p ee))
  | False = Left "Incompatible branch types."

Here, the with clause effectively adds another argument equal to (et == tt) to createIfThenElse, and tries to pattern match on its value. When we combine this with the proof keyword, Idris will give us a handle to a proof, named p, that asserts the new argument evaluates to the value in the pattern match. In our case, this is exactly the proof we need to give to safeCast.

However, this is ugly. Idris' with clause only works at the top level of a function, so we have to define a function just to use it. It also shows that we're losing information when we call (==), and we have to reconstruct or recapture it using some other means.

Second Attempt: Decidable Propositions

More awesome folks over at #idris pointed out that the whole deal with (==) is inelegant; they suggested I use decidable propositions, using the Dec type. The type is defined as follows:

data Dec : Type -> Type where
  Yes : (prf : prop) -> Dec prop
  No : (contra : prop -> Void) -> Dec prop

There are two ways to construct a value of type Dec prop:

We use the Yes constructor, which means that the proposition prop is true. To use this constructor, we have to give it a proof of prop, called prf in the constructor.
We use the No constructor, which means that the proposition prop is false. We need a proof of type prop -> Void to represent this: if we have a proof of prop, we arrive at a contradiction.

This combines the nice True and False of Bool, with the 'real' proofs of the truthfulness or falsity. At the moment that we would have been creating a boolean, we also create a proof of that boolean's value. Thus, we don't lose information. Here's how we can go about this:

{{< codelines "Idris" "typesafe-interpreter/TypesafeIntrV2.idr" 20 29 >}}

We pattern match on the input expression types. If the types are the same, we return Yes, and couple it with Refl (since we've pattern matched on the types in the left-hand side of the function definition, the typechecker has enough information to create that Refl). On the other hand, if the expression types do not match, we have to provide a proof that their equality would be absurd. For this we use helper functions / theorems like intBoolImpossible:

{{< codelines "Idris" "typesafe-interpreter/TypesafeIntrV2.idr" 11 12 >}}

I'm not sure if there's a better way of doing this than using impossible. This does the job, though: Idris understands that there's no way we can get an input of type IntType = BoolType, and allows us to skip writing a right-hand side.

We can finally use this new decEq function in our type checker:

{{< codelines "Idris" "typesafe-interpreter/TypesafeIntrV2.idr" 76 78 >}}

Idris is happy with this! We should also add IfThenElse to our eval function. This part is very easy:

{{< codelines "Idris" "typesafe-interpreter/TypesafeIntrV2.idr" 80 85 >}}

Since the c part of the IfThenElse is indexed with BoolType, we know that evaluating it will give us a boolean. Thus, we can use that directly in the Idris if-then-else expression. Let's try this with a few expressions:

BinOp Add (IfElse (BoolLit True) (IntLit 6) (IntLit 7)) (BinOp Multiply (IntLit 160) (IntLit 2))

This evaluates 326, as it should. What if we make the condition non-boolean?

BinOp Add (IfElse (IntLit 1) (IntLit 6) (IntLit 7)) (BinOp Multiply (IntLit 160) (IntLit 2))

Our typechecker catches this, and we end up with the following output:

Type error: Not a boolean.

Alright, let's make one of the branches of the if-expression be a boolean, while the other remains an integer.

BinOp Add (IfElse (BoolLit True) (BoolLit True) (IntLit 7)) (BinOp Multiply (IntLit 160) (IntLit 2))

Our typechecker catches this, too:

Type error: Incompatible branch types.

Conclusion

I think this is a good approach. Should we want to add more types to our language, such as tuples, lists, and so on, we will be able to extend our decEq approach to construct more complex equality proofs, and keep the typecheck method the same. Had we not used this approach, and instead decided to pattern match on types inside of typecheck, we would've quickly found that this only works for languages with finitely many types. When we add polymorphic tuples and lists, we start being able to construct an arbitrary number of types: [a]. [[a]], and so on. Then, we cease to be able to enumerate all possible pairs of types, and require a recursive solution. I think that this leads us back to decEq.

I also hope that I've now redeemed myself as far as logical arguments go. We used dependent types and made our typechecking function save us from error-checking during evaluation. We did this without having to manually create different types of expressions like ArithExpr and BoolExpr, and without having to duplicate any code.

That's all I have for today, thank you for reading! As always, you can check out the full source code for the typechecker and interpreter on my Git server.

16 KiB Raw Permalink Blame History