Add draft of new Idris typechecking post.

This one uses line highlights!
This commit is contained in:
Danila Fedorin 2020-08-12 01:38:22 -07:00
parent eff0de5330
commit 020417e971
1 changed files with 217 additions and 0 deletions

View File

@ -0,0 +1,217 @@
title: Meaningfully Typechecking a Language in Idris, With Tuples
date: 2020-08-11T19:57:26-07:00
tags: ["Idris"]
draft: true
Some time ago, I wrote a post titled
[Meaningfully Typechecking a Language in Idris]({{< relref "" >}}).
I then followed it up with
[Meaningfully Typechecking a Language in Idris, Revisited]({{< relref "" >}}).
In these posts, I described a hypothetical
way of 'typechecking' an expression data type `Expr` into a typesafe form `SafeExpr`.
A `SafeExpr` can be evaluated without any code to handle type errors,
since it's by definition impossible to construct ill-typed expressions using
it. In the first post, we implemented the method only for simple arithmetic
expressions; in my latter post, we extended this to support `if`-expressions.
Near the end of the post, I made the following comment:
> When we add polymorphic tuples and lists, we start being able to construct an
arbitrary number of types: `[a]`. `[[a]]`, and so on. Then, we cease to be able t
enumerate all possible pairs of types, and require a recursive solution. I think
that this leads us back to [our method].
Recently, I thought about this some more, and decided that it's rather simple
to add tuples into our little language. The addition of tuples mean that our
language will have an infinite number of possible types. We would have
`Int`, `(Int, Int)`, `((Int, Int), Int)`, and so on. This would make it
impossible to manually test every possible case in our typechecker,
but our approach of returning `Dec (a = b)` will work just fine.
### Extending The Syntax
First, let's extend our existing language with expressions fpr
tuples. For simplicity, let's use pairs `(a,b)` instead of general
`n`-element tuples. This would make typechecking less cumbersome while still
having the interesting effect of making the number of types in our language
infinite. We can always represent the 3-element tuple `(a,b,c)` as `((a,b), c)`,
after all. To be able to extract values from our pairs, we'll add the `fst` and
`snd` functions into our language, which accept a tuple and return its
first or second element, respectively.
Our `Expr` data type, which allows ill-typed expressions, ends up as follows:
{{< codelines "Idris" "typesafe-interpreter/TypesafeIntrV3.idr" 31 39 "hl_lines=7 8 9" >}}
I've highlighted the new lines. The additions consist of the `Pair` constructor, which
represents the tuple expression `(a, b)`, and the `Fst` and `Snd` constructors,
which represent the `fst e` and `snd e` expressions, respectively. In
a similar vein, we extend our `SafeExpr` GADT:
{{< codelines "Idris" "typesafe-interpreter/TypesafeIntrV3.idr" 41 49 "hl_lines=7 8 9" >}}
Finally, to provide the `PairType` constructor, we extend the `ExprType` and `repr` functions:
{{< codelines "Idris" "typesafe-interpreter/TypesafeIntrV3.idr" 1 11 "hl_lines=5 11" >}}
### Implementing Equality
An important part of this change is the extension of the `decEq` function,
which compares two types for equality. The kind folks over at `#idris` previously
recommended the use of the `Dec` data type for this purpose. A value of
type `Dec P`
{{< sidenote "right" "decideable-note" "is either a proof that \(P\) is true, or a proof that \(P\) is false." >}}
It's possible that a proposition \(P\) is not provable, and neither is \(\lnot P\).
It is therefore not possible to construct a value of type <code>Dec P</code> for
any proposition <code>P</code>. Having a value of type <code>Dec P</code>, then,
provides us nontrivial information.
{{< /sidenote >}} Our `decEq` function, given two types `a` and `b`, returns
`Dec (a = b)`. Thus, it will return either a proof that `a = b` (which we can
then use to convince the Idris type system that two `SafeExpr` values are,
in fact, of the same type), or a proof of `a = b -> Void` (which tells
us that `a` and `b` are definitely not equal). If you're not sure what the deal with `(=)`
and `Void` is, check out
[this section]({{< relref "" >}}#curry-howard-correspondence)
of the previous article.
A lot of the work in implementing `decEq` went into constructing proofs of falsity.
That is, we needed to explicitly list every case like `decEq IntType BoolType`, and create
a proof that `IntType` cannot equal `BoolType`. However, here's how we use `decEq` in
the typechecking function:
{{< codelines "Idris" "typesafe-interpreter/TypesafeIntrV2.idr" 76 78 >}}
We always throw away the proof inequality! So, rather than spending the time
constructing useless proofs like this, we can just switch `decEq` to return
a `Maybe (a = b)`. The `Just` case will tell us that the two types are equal
(and, as before, provide a proof); the `Nothing` case will tell us that
the two types are _not_ equal, and provide no further information. Let's
see the implementation of `decEq` now:
{{< codelines "Idris" "typesafe-interpreter/TypesafeIntrV3.idr" 13 23 >}}
Lines 14 through 16 are pretty simple; in this case, we can tell at a glance
that the two types are equal, and Idris can infer an equality proof in
the form of `Refl`. We return this proof by writing it in a `Just`.
Line 23 is the catch-all case for any combination of types we didn't handle.
Any combination of types we don't handle is false, and thus, this case
returns `Nothing`.
What about lines 17 through 22? This is the case for handling the equality
of two pair types, `(lt1, lt2)` and `(rt1, rt2)`. The equality of the two
types depends on the equality of their constituents. That is, if we
know that `lt1 = rt1` and `lt2 = rt2`, we know that the two pair types
are also equal. If one of the two equalities doesn't hold, the two
pairs obviously aren't equal, and thus, we should return `Nothing`.
This should remind us of `Maybe`'s monadic nature: we can first compute
`decEq lt1 rt1`, and then, if it succeeds, compute `decEq lt2 rt2`.
If both succeed, we will have in hand the two proofs, `lt1 = rt1`
and `lt2 = rt2`. We achieve this effect using `do`-notation,
storing the sub-proofs into `subEq1` and `subEq2`.
What now? Once again, we have to use `replace`. Recall its type:
replace : {a:_} -> {x:_} -> {y:_} -> {P : a -> Type} -> x = y -> P x -> P y
Given some proposition in terms of `a`, and knowing that `a = b`, `replace`
returns the original proposition, but now in terms of `b`. We know for sure
PairType lt1 lt2 = PairType lt1 lt2
We can start from there. Let's handle one thing at a time, and try
to replace the second `lt1` with `rt1`. Then, we can replace the second
`lt2` with `rt2`, and we'll have our equality!
Easier said than done, though. How do we tell Idris which `lt1`
we want to substitute? After all, of the following are possible:
PairType rt1 lt2 = PairType lt1 lt2 -- First lt1 replaced
PairType lt1 lt2 = PairType rt1 lt2 -- Second lt2 replaced
PairType rt1 lt2 = PairType rt1 lt2 -- Both replaced
The key is in the signature, specifically the expressions `P x` and `P y`.
We can think of `P` as a function, and of `replace` as creating a value
of `P` applied to another argument. Thus, the substitution will occur
exactly where the argument of `P` is used. Then, to achieve each
of the above substitution, we can write `P` as follows:
```Idris {linenos=table, hl_lines=[2]}
t1 => PairType t1 lt2 = PairType lt1 lt2
t1 => PairType lt1 lt2 = PairType t1 lt2
t1 => PairType t1 lt2 = PairType t1 lt2
The highlighted function is the one we'll need to use to attain
the desired result. Since `P` is an implicit argument to `replace`,
we can explicitly provide it with `{P=...}`, leading to the following
{{< codelines "Idris" "typesafe-interpreter/TypesafeIntrV3.idr" 20 20>}}
We now have a proof of the following proposition:
PairType lt1 lt2 = PairType rt1 lt2
We want to replace the second `lt2` with `rt2`, which means that we
write our `P` as follows:
t2 => PairType lt1 lt2 = PairType rt1 t2
Finally, we perform the second replacement, and return the result:
{{< codelines "Idris" "typesafe-interpreter/TypesafeIntrV3.idr" 21 22 >}}
Great! We have finished implement `decEq`.
### Adjusting The Typechecker
It's time to make our typechecker work with tuples.
First, we need to fix the `IfElse` case to accept `Maybe (a=b)` instead
of `Dec (a=b)`:
{{< codelines "Idris" "typesafe-interpreter/TypesafeIntrV3.idr" 71 78 "hl_lines=7 8" >}}
Note that the only change is from `Dec` to `Maybe`; we didn't need to add new cases
or even to know what sort of types are available in the language.
Next, we can write the cases for the new expressions in our language. We can
start with `Pair`, which, given expressions of types `a` and `b`, creates
an expression of type `(a,b)`. As long as the arguments to `Pair` are well-typed,
so is the `Pair` expression itself; thus, there are no errors to handle.
{{< codelines "Idris" "typesafe-interpreter/TypesafeIntrV3.idr" 79 83 >}}
The case for `Fst` is more complicated. If the argument to `Fst` is a tuple
of type `(a, b)`, then `Fst` constructs from it an expression
of type `a`. Otherwise, the expression is ill-typed, and we return an error.
{{< codelines "Idris" "typesafe-interpreter/TypesafeIntrV3.idr" 84 89 >}}
The case for `Snd` is very similar:
{{< codelines "Idris" "typesafe-interpreter/TypesafeIntrV3.idr" 90 96 >}}
### Evaluation Function and Conclusion
We conclude with our final `eval` and `resultStr` functions,
which now look as follows.
{{< codelines "Idris" "typesafe-interpreter/TypesafeIntrV3.idr" 97 111 "hl_lines=7-9 13-15" >}}
As you can see, we require no error handling in `eval`; the expressions returned by
`typecheck` are guaranteed to evaluate to valid Idris values. We have achieved our goal,
with very little changes to `typecheck` other than the addition of new language
constructs. In my opinion, this is a win!
As always, you can see the code on my Git server. Here's
[the latest Idris file,](
if you want to check it out (and maybe verify that it compiles). I hope you found
this interesting!