From 020417e971b782251797fd2ec0fbdc9a6d5cfe78 Mon Sep 17 00:00:00 2001 From: Danila Fedorin Date: Wed, 12 Aug 2020 01:38:22 -0700 Subject: [PATCH] Add draft of new Idris typechecking post. This one uses line highlights! --- content/blog/typesafe_interpreter_tuples.md | 217 ++++++++++++++++++++ 1 file changed, 217 insertions(+) create mode 100644 content/blog/typesafe_interpreter_tuples.md diff --git a/content/blog/typesafe_interpreter_tuples.md b/content/blog/typesafe_interpreter_tuples.md new file mode 100644 index 0000000..9e9e6ea --- /dev/null +++ b/content/blog/typesafe_interpreter_tuples.md @@ -0,0 +1,217 @@ +--- +title: Meaningfully Typechecking a Language in Idris, With Tuples +date: 2020-08-11T19:57:26-07:00 +tags: ["Idris"] +draft: true +--- + +Some time ago, I wrote a post titled +[Meaningfully Typechecking a Language in Idris]({{< relref "typesafe_interpreter.md" >}}). +I then followed it up with +[Meaningfully Typechecking a Language in Idris, Revisited]({{< relref "typesafe_interpreter_revisited.md" >}}). +In these posts, I described a hypothetical +way of 'typechecking' an expression data type `Expr` into a typesafe form `SafeExpr`. +A `SafeExpr` can be evaluated without any code to handle type errors, +since it's by definition impossible to construct ill-typed expressions using +it. In the first post, we implemented the method only for simple arithmetic +expressions; in my latter post, we extended this to support `if`-expressions. +Near the end of the post, I made the following comment: + +> When we add polymorphic tuples and lists, we start being able to construct an +arbitrary number of types: `[a]`. `[[a]]`, and so on. Then, we cease to be able t +enumerate all possible pairs of types, and require a recursive solution. I think +that this leads us back to [our method]. + +Recently, I thought about this some more, and decided that it's rather simple +to add tuples into our little language. The addition of tuples mean that our +language will have an infinite number of possible types. We would have +`Int`, `(Int, Int)`, `((Int, Int), Int)`, and so on. This would make it +impossible to manually test every possible case in our typechecker, +but our approach of returning `Dec (a = b)` will work just fine. + +### Extending The Syntax +First, let's extend our existing language with expressions fpr +tuples. For simplicity, let's use pairs `(a,b)` instead of general +`n`-element tuples. This would make typechecking less cumbersome while still +having the interesting effect of making the number of types in our language +infinite. We can always represent the 3-element tuple `(a,b,c)` as `((a,b), c)`, +after all. To be able to extract values from our pairs, we'll add the `fst` and +`snd` functions into our language, which accept a tuple and return its +first or second element, respectively. + +Our `Expr` data type, which allows ill-typed expressions, ends up as follows: + +{{< codelines "Idris" "typesafe-interpreter/TypesafeIntrV3.idr" 31 39 "hl_lines=7 8 9" >}} + +I've highlighted the new lines. The additions consist of the `Pair` constructor, which +represents the tuple expression `(a, b)`, and the `Fst` and `Snd` constructors, +which represent the `fst e` and `snd e` expressions, respectively. In +a similar vein, we extend our `SafeExpr` GADT: + +{{< codelines "Idris" "typesafe-interpreter/TypesafeIntrV3.idr" 41 49 "hl_lines=7 8 9" >}} + +Finally, to provide the `PairType` constructor, we extend the `ExprType` and `repr` functions: + +{{< codelines "Idris" "typesafe-interpreter/TypesafeIntrV3.idr" 1 11 "hl_lines=5 11" >}} + +### Implementing Equality +An important part of this change is the extension of the `decEq` function, +which compares two types for equality. The kind folks over at `#idris` previously +recommended the use of the `Dec` data type for this purpose. A value of +type `Dec P` +{{< sidenote "right" "decideable-note" "is either a proof that \(P\) is true, or a proof that \(P\) is false." >}} +It's possible that a proposition \(P\) is not provable, and neither is \(\lnot P\). +It is therefore not possible to construct a value of type Dec P for +any proposition P. Having a value of type Dec P, then, +provides us nontrivial information. +{{< /sidenote >}} Our `decEq` function, given two types `a` and `b`, returns +`Dec (a = b)`. Thus, it will return either a proof that `a = b` (which we can +then use to convince the Idris type system that two `SafeExpr` values are, +in fact, of the same type), or a proof of `a = b -> Void` (which tells +us that `a` and `b` are definitely not equal). If you're not sure what the deal with `(=)` +and `Void` is, check out +[this section]({{< relref "typesafe_interpreter_revisited.md" >}}#curry-howard-correspondence) +of the previous article. + +A lot of the work in implementing `decEq` went into constructing proofs of falsity. +That is, we needed to explicitly list every case like `decEq IntType BoolType`, and create +a proof that `IntType` cannot equal `BoolType`. However, here's how we use `decEq` in +the typechecking function: + +{{< codelines "Idris" "typesafe-interpreter/TypesafeIntrV2.idr" 76 78 >}} + +We always throw away the proof inequality! So, rather than spending the time +constructing useless proofs like this, we can just switch `decEq` to return +a `Maybe (a = b)`. The `Just` case will tell us that the two types are equal +(and, as before, provide a proof); the `Nothing` case will tell us that +the two types are _not_ equal, and provide no further information. Let's +see the implementation of `decEq` now: + +{{< codelines "Idris" "typesafe-interpreter/TypesafeIntrV3.idr" 13 23 >}} + +Lines 14 through 16 are pretty simple; in this case, we can tell at a glance +that the two types are equal, and Idris can infer an equality proof in +the form of `Refl`. We return this proof by writing it in a `Just`. +Line 23 is the catch-all case for any combination of types we didn't handle. +Any combination of types we don't handle is false, and thus, this case +returns `Nothing`. + +What about lines 17 through 22? This is the case for handling the equality +of two pair types, `(lt1, lt2)` and `(rt1, rt2)`. The equality of the two +types depends on the equality of their constituents. That is, if we +know that `lt1 = rt1` and `lt2 = rt2`, we know that the two pair types +are also equal. If one of the two equalities doesn't hold, the two +pairs obviously aren't equal, and thus, we should return `Nothing`. +This should remind us of `Maybe`'s monadic nature: we can first compute +`decEq lt1 rt1`, and then, if it succeeds, compute `decEq lt2 rt2`. +If both succeed, we will have in hand the two proofs, `lt1 = rt1` +and `lt2 = rt2`. We achieve this effect using `do`-notation, +storing the sub-proofs into `subEq1` and `subEq2`. + +What now? Once again, we have to use `replace`. Recall its type: + +```Idris +replace : {a:_} -> {x:_} -> {y:_} -> {P : a -> Type} -> x = y -> P x -> P y +``` + +Given some proposition in terms of `a`, and knowing that `a = b`, `replace` +returns the original proposition, but now in terms of `b`. We know for sure +that: + +```Idris +PairType lt1 lt2 = PairType lt1 lt2 +``` + +We can start from there. Let's handle one thing at a time, and try +to replace the second `lt1` with `rt1`. Then, we can replace the second +`lt2` with `rt2`, and we'll have our equality! + +Easier said than done, though. How do we tell Idris which `lt1` +we want to substitute? After all, of the following are possible: + +```Idris +PairType rt1 lt2 = PairType lt1 lt2 -- First lt1 replaced +PairType lt1 lt2 = PairType rt1 lt2 -- Second lt2 replaced +PairType rt1 lt2 = PairType rt1 lt2 -- Both replaced +``` + +The key is in the signature, specifically the expressions `P x` and `P y`. +We can think of `P` as a function, and of `replace` as creating a value +of `P` applied to another argument. Thus, the substitution will occur +exactly where the argument of `P` is used. Then, to achieve each +of the above substitution, we can write `P` as follows: + +```Idris {linenos=table, hl_lines=[2]} +t1 => PairType t1 lt2 = PairType lt1 lt2 +t1 => PairType lt1 lt2 = PairType t1 lt2 +t1 => PairType t1 lt2 = PairType t1 lt2 +``` + +The highlighted function is the one we'll need to use to attain +the desired result. Since `P` is an implicit argument to `replace`, +we can explicitly provide it with `{P=...}`, leading to the following +line: + +{{< codelines "Idris" "typesafe-interpreter/TypesafeIntrV3.idr" 20 20>}} + +We now have a proof of the following proposition: + +```Idris +PairType lt1 lt2 = PairType rt1 lt2 +``` + +We want to replace the second `lt2` with `rt2`, which means that we +write our `P` as follows: + +```Idris +t2 => PairType lt1 lt2 = PairType rt1 t2 +``` + +Finally, we perform the second replacement, and return the result: + +{{< codelines "Idris" "typesafe-interpreter/TypesafeIntrV3.idr" 21 22 >}} + +Great! We have finished implement `decEq`. + +### Adjusting The Typechecker +It's time to make our typechecker work with tuples. +First, we need to fix the `IfElse` case to accept `Maybe (a=b)` instead +of `Dec (a=b)`: + +{{< codelines "Idris" "typesafe-interpreter/TypesafeIntrV3.idr" 71 78 "hl_lines=7 8" >}} + +Note that the only change is from `Dec` to `Maybe`; we didn't need to add new cases +or even to know what sort of types are available in the language. + +Next, we can write the cases for the new expressions in our language. We can +start with `Pair`, which, given expressions of types `a` and `b`, creates +an expression of type `(a,b)`. As long as the arguments to `Pair` are well-typed, +so is the `Pair` expression itself; thus, there are no errors to handle. + +{{< codelines "Idris" "typesafe-interpreter/TypesafeIntrV3.idr" 79 83 >}} + +The case for `Fst` is more complicated. If the argument to `Fst` is a tuple +of type `(a, b)`, then `Fst` constructs from it an expression +of type `a`. Otherwise, the expression is ill-typed, and we return an error. + +{{< codelines "Idris" "typesafe-interpreter/TypesafeIntrV3.idr" 84 89 >}} + +The case for `Snd` is very similar: + +{{< codelines "Idris" "typesafe-interpreter/TypesafeIntrV3.idr" 90 96 >}} + +### Evaluation Function and Conclusion +We conclude with our final `eval` and `resultStr` functions, +which now look as follows. + +{{< codelines "Idris" "typesafe-interpreter/TypesafeIntrV3.idr" 97 111 "hl_lines=7-9 13-15" >}} + +As you can see, we require no error handling in `eval`; the expressions returned by +`typecheck` are guaranteed to evaluate to valid Idris values. We have achieved our goal, +with very little changes to `typecheck` other than the addition of new language +constructs. In my opinion, this is a win! + +As always, you can see the code on my Git server. Here's +[the latest Idris file,](https://dev.danilafe.com/Web-Projects/blog-static/src/branch/master/code/typesafe-interpreter/TypesafeIntrV3.idr) +if you want to check it out (and maybe verify that it compiles). I hope you found +this interesting!