Homework/HW4.md

12 KiB

Part 1: Solution Discussion

Benjamin's Solution

Looks like Ben opted to literally translate the Prog nonterminal into Haskell. This is perfectly valid: if anything, it's a more faithful translation of the syntax. However, as you probably know, the definition is isomorphic to a list:

data Prog = Empty | Stmt Stmt Prog
data List Stmt = Nil | Cons Stmt (List Stmt)

So, you could've defined Prog to be [Stmt]. There are a few small advantages to this, mostly in terms of using Haskell as a metalanguage. Here's a line from my interpreter for this language:

stmt (Loop p) = prog (cycle p) `catchError` (const $ return ())

In order to implement the looping behavior of Loop, I used Haskell's built-in cycle :: [a] -> [a] function. This function infinitely repeats a list:

cycle [1,2,3] = [1,2,3,1,2,3,1,2,3,...]

The more compact list notation is another advantage to this approach.

I appreciated the following comment in the code:

- Starting register A at x, set R to zero, and add to R until A passes y (my approach above)
- Same as above, but use y-x as the boundary instead of y, and adjust accordingly
- Don't use a do loop, and instead hardcode out y-x additions of x+1 each time

What about a closed form approach? Something like (b-a+1)(b+a)/2. If * and / were O(1) operations, this would make the whole summation O(1).

Owen's Solution

Looks like Owen had some trouble with this homework assignment. From what I can tell, the code should not compile; there are a few things here that can be improved.

Allowing Expr to be Int

I'm looking at this line of code:

-- since there is already an int type in Haskell, I think I can consider it already encoded
data Expr = Int

Unfortunately, this isn't quite right. This creates a constructor (value!) Int of type Expr. This value Int has nothing to do with the type Int (integers). Thus, Expr can't contain integers like you intended. The proper syntax would be something like:

data Expr = Lit Int 

Unlike the previous version, this will create a constructor called Lit with type Int -> Expr. This time, the Int refers to the type Int, and so, Lit 3 will represent an expression that contains the number 3.

Defining Reg

Haskell's type is a way to create a type alias. So, if you have a type you use a lot (perhaps Maybe String), you can give it another name:

type MyType = Maybe String

Then, Maybe String and MyType mean the same thing. For defining Reg, you want to use data:

data Reg = A | B | C

Using undefined or _holes

It's hard to get the whole program compiling in one go. Haskell supports placeholders, which can be used anywhere, and will compile (mostly) no matter what. Of course, they'll either crash at runtime (undefined) or not really compile (_holes). However, if you have something like:

while = -- can't quite do this yet

sumFromTo = ... -- but now there's a parse error!

You can use undefined:

while = undefined

This way, there's no parse error, and GHC goes on to check the rest of your program. Holes are useful when you're in the middle of a complicated expression:

foldr _help 1 [1,2,3]

This will almost compile, except that at the very end, Haskell will tell you the type of the expression you need to fill in for _help. It will also suggest what you can put for _help!

Parenthesization

This is a small nitpick. Instead of writing RegStore A (x), you would write RegStore A x. In this case, you probably wanted RegStore A (Lit x) (you do need parentheses here!).

Making Type Errors Impossible

I'm looking at this part of Part 2:

data Stmt = RegStore Reg IntExpr
	| RegStore Reg BoolExpr
	| Contitional IntExpr Prog Prog
	| Contitional BoolExpr Prog Prog
	| Loop Prog
	| BreakLoop

You were on the right track!

I think this should make it so that I have the full functionality of the previous setting but now I have one version for each type of expr?

You're exactly right. You can represent any program from Part 1 using this syntax. However, the point was: you don't need to! Programs in the form Conditional IntExpr Prog Prog are specifcally broken! You can't use a number as a condition for if/then/else, and IntExpr is guaranteed to produce a number! The same is true for RegStore Reg BoolExpr: you aren't allowed to store booleans into registers! See this part of the homework description:

Note that although the expression language may produce both booleans and integers, the registers may only contain integers.

So, it is in fact sufficient to leave out the duplicate constructors you have:

data Stmt = RegStore Reg IntExpr
	| Contitional BoolExpr Prog Prog
	| Loop Prog
	| BreakLoop

Hope all this helps!

Jaspal's Solution

Notably different in Jaspal's solution from the others in this group is that they used strings for registers instead of a data type definition. I think this was allowed by this homework assignment; however, I think that using a data type is better for our purposes.

The issue is, if your register is a string, it can be any string. You can have register "A", register "a", register "Hello World", and so on. But our language only has three registers: A, B, and R. If we define a data type as follows:

data Reg = A | B | R

Then a value of type Reg can only be either A, B, or R. This means that we can't represent programs like

Hello := 5

Which is good!

This solution also uses variables to encode programs. I think that's a nice touch! It makes some expressions less bulky than they otherwise would be.

It doesn't look like the code compiles; here are a few things you can do to bring it closer to a working state.

Type Names and Constructor Names

Type names (Expr, Int, Bool) all have to start with an uppercase letter. Thus, you should change the following line:

data expr

To the following:

data Expr

And so on.

The exception to this rule is type variables for when things are polymorphic. For instance, we have

data Maybe a = Nothing | Just a

Here, a is lowercase, but tht's because a is actually a placeholder for an arbitrary type. You can tell because it also occurs on the left side of the = : data Maybe a.

Constructor names also have to start with an uppercase letter. So, you would have:

data Stmt
  = Eql Reg Expr 
  | IfElse Expr Prog Prog 
  | Do Prog 
  | Break 

String Notation

I have made a case that you shouldn't be using strings for this homework. However, you'll probably use them in the future. So, I want to point out: 'A' does not mean the string "A". It means a single character, like 'A' in C. To create a String in Haskell, you want to use double quotes: "A".

Other than those two things, it looks like this should work!

My Solution

My solution to Part 1 is pretty much identical to everyone else's; I opted to use Prog = [Stmt] instead of defining a data type, but other than that, not much is different.

I had some fun in Part 2, though. In particular, instead of creating two separate data types, BExpr and IExpr, I used a Generalized Algebraic Data Type (GADT). Thus, my code was as follows:

data Expr a where
    Lit :: Int -> Expr Int
    Reg :: Reg -> Expr Int
    Plus :: Expr Int -> Expr Int -> Expr Int
    Leq :: Expr Int -> Expr Int -> Expr Bool
    Not :: Expr Bool -> Expr Bool

This defines an Expr type that's parameterized by a type a. I defined a to be "the type the expression evaluates to". So,

  • Expr Int is an expression that is guaranteed to produce an integer
  • Expr Bool is an expression that is guaranteed to produce a boolean.

Unlike regular ADTs, you can restrict how one can create instances of Expr a. Thus, I can make it so that it's only possible to create Expr Int using Lit and Reg; the constructors Plus, Leq, and Not produce Expr Bool.

Why in the world would you use this? The benefits aren't immediately obvious in this language; however, suppose you had a ternary expression like a ? b : c in C or C++. Using our BExpr and IExpr we'd have to add constructors to both:

data IExpr
    = -- ...
    | ITern BExpr IExpr IExpr
    | -- ...
    
data BExpr
    = -- ...
    | BTern BExpr BExpr BExpr
    | -- ...

However, using the GADT definition, you should just be able to write:

data Expr a where
    -- ...
    Tern :: Expr Bool -> Expr b -> Expr b -> Expr b

Which covers both x ? 1 : 2 and x ? true : false, but doesn't allow x ? true : 3 (which would be a type error). Using a GADT, the "bexpr" part of the abstract syntax maps to Expr Bool, and the "iexpr" part maps to Expr Int.

I also wrote a quick interpreter using the Monad Transformer Library. Monads are one of the classic "what the heck is this" things in Haskell; someone once said there are more monad tutorials than Haskell programmers out there. I don't think I'm qualified to give a good explanation, but in short: I used a state monad (which is basically a Haskell type that helps you keep track of changing variables) and an exception monad (which implements exception handling similar to Java's throw) to implement the language. Every Loop was implicitly a catch clause, and Break was a throw. Thus, when you reached a break, you would immediately return to the innermost loop, and continue from there. Combined with a GADT for expressions, the evaluator turned out really compact: less than 25 lines of actual code!

Part 2: Questions

Q: What happens if you encode those programs using the data types you defined in Part 1?

A: Well, not much. The program is accepted, even though it's not type-correct. Haskell doesn't have any issues with it; however, were we to write an interpreter, we would quickly notice that although Haskell accepts the program, it's aware of the possibility of invalid cases; our interpreter would need to account for the possibility of expressions evaluating both to integers and to booleans.

Q: What happens if you encode them using (a correct version of) the data types you defined in Part 2?

A: Haskell rejects the program. We've "made illegal states unrepresentable" by one way or another teaching the Haskell type systems that integer and boolean expressions can't arbitrarily mix. An interpreter would not require error handling, since type errors were eliminated using the metalanguage's type system.

Q: What does this tell you about the relationship between errors in your object language and errors in the metalanguage of Haskell?

A: The way I see it, a metalanguage is merely a toolbox for working with our object language. By specifying our data types in a different way, we are effectively configuring the toolbox to work differently. In this specific case, we've given this "toolbox" a primitive understanding of our language's semantics, and it is able to automatically reason about basic "correctness" properties of programs. In terms of the two types of errors, we've shifted the errors from the object language into the meta language.

Q: Can type errors always be eliminated from a language by refactoring the syntax in this way? Provide a brief rationale for your answer.

A: Not in general, no. Suppose we allowed registers to include booleans, too. Type errors could still occur at runtime. Perhaps something like the following:

try {
    someFunction();
    A = true;
} catch(error) {
    A = 1
}
if A then B = 1 else B = 2 end

Determining whether or not a function terminates in an error state, is, as far as I know, undecidable. Thus, if our type system could reject something like this, it would be quite an impressive system!

I think the general thought is: if our type system is static, it's likely possible to include all the necessary checks into the abstract syntax. Perhaps not in Haskell, but in a language like Idris or Coq. However, as soon as we make our types dynamic, it becomes impossible to accept precisely all valid programs in a language, and nothing more.

Although.... some type systems are in themselves undecideable. So maybe even for a static type system, things may not be as easy.

TL;DR: No.