Add draft of boolean values post.

2020-08-20 21:19:47 -07:00 · 2020-08-20 21:19:47 -07:00 · 6f0667bb28
commit 6f0667bb28
parent 8368283a3e
1 changed files with 220 additions and 0 deletions
--- a/content/blog/boolean_values.md
+++ b/content/blog/boolean_values.md
@@ -0,0 +1,220 @@
+---
+title: "How Many Values Does a Boolean Have?"
+date: 2020-08-20T18:37:50-07:00
+tags: ["Java", "Haskell"]
+---
+
+A friend of mine recently had an interview for a software
+engineering position. They later recounted to me the content
+of the technical questions that they had been asked. Some had
+been pretty standard:
+
+* __"What's the difference between concurrency
+and parallelism?"__ -- a reasonable question given that Go was
+the company's language of choice.
+* __"What's the difference between a method and a function?"__ --
+a little more strange, in my opinion, since the difference
+is of little _practical_ use.
+
+But then, they recounted a rather interesting question:
+
+> How many values does a bool have?
+
+Innocuous at first, isn't it? Probably a bit simpler, in fact,
+than the questions about methods and functions, concurrency
+and parallelism. It's plausible that a programmer
+has not done much concurrent or parallel programming in their
+life, or that they came from a language in which functions
+were rare and methods were ubiquitous. It's not plausible,
+on the other hand, that a candidate applying to a software
+engineering position has not encountered booleans.
+
+If you're genuinely unsure about the answer to the question,
+I think there's no reason for me to mess with you. The
+simple answer to the question -- as far as I know -- is that a boolean
+has two values. They are `true` and `false` in Java, or `True` and `False`
+in Haskell, and `1` and `0` in C. A boolean value is either true or false.
+
+So, what's there to think about? There are a few things, _ackshually_. 
+Let's explore them, starting from the theoretical perspective.
+
+### What's a Type, Anyway?
+Boolean, or `bool`, is a type. Broadly speaking, a type
+is a property of _something_ that defines what the _something_
+means and what you can do with it. That _something_ can be
+several things; for our purposes, it can either be an
+_expression_ in a programming language (in the form of `fact(n)`)
+or a value in that same programming language (like `5`).
+
+Dealing with values is rather simple. Most languages have finite numbers,
+usually with \\(2^{32}\\) values, which have type `int`,
+`i32`, or something in a similar vein. Most languages also have
+strings, of which there are as many as you have memory to contain,
+and which have the type `string`, `String`, or occasionally
+the more confusing `char*`. Most languages also have booleans,
+as we discussed above.
+
+The deal with expressions is a bit more interesting. Presumably
+expressions evaluate to values, and the type of an expression
+is then the type of values it can yield. Consider the following
+snippet in C++:
+
+```C++
+int square(int x) {
+    return x * x;
+}
+```
+
+Here, the expression `x` is known to have type `int` from
+the type signature provided by the user. Multiplication
+of integers yields an integer, and so `x*x` is also
+of type `int`. Since `square(x)` returns `x*x`, it too is
+of type `int`. So far, so good.
+
+Okay, how about this:
+
+```C++
+int meaningOfLife() {
+    return meaningOfLife();
+}
+```
+
+No, wait, don't say "stack overflow" just yet. That's no fun.
+And anyway, this is technically a tail call, so maybe our
+C++ compiler can avoid growing the stack. And indeed,
+flicking on the `-O2` flag in this [compiler explorer example](https://godbolt.org/z/9cv4nY),
+we can see that no stack growth is necessary: it's just
+an infinite loop. But `meaningOfLife` will never return a value. One could say
+this computation _diverges_.
+
+Well, if it diverges, just throw the expression out of the window! That's
+no `int`! We only want _real_ `int`s!
+
+And here, we can do that. But what about the following:
+
+```C++
+inf_int collatz(inf_int x) {
+    if(x == 1) return 1;
+    if(x % 2 == 0) return collatz(x/2);
+    return collatz(x * 3 + 1);
+}
+```
+
+Notice that I've used the fictitious type
+`inf_int` to represent integers that can hold
+arbitrarily large integer values, not just the 32-bit ones.
+That is important for this example, and I'll explain why shortly.
+
+The code in the example is a simulation of the process described
+in the [Collatz conjecture](https://en.wikipedia.org/wiki/Collatz_conjecture).
+Given an input number `x`, if the number is even, it's divided in half,
+and the process continues with the halved number. If, on the other
+hand, the number is odd, it's multiplied by 3, 1 is added to it,
+and the process continues with _that_ number. The only way for the
+process to terminate is for the computation to reach the value 1.
+
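+To make the process concrete, here is a minimal Java sketch that counts how many
+steps it takes to reach 1. It is an illustration, not part of the original example:
+the class name `Collatz` and the method `steps` are made up for this sketch, and
+`java.math.BigInteger` stands in for the fictitious `inf_int`. It assumes the input
+is at least 1.
+
+```Java
+import java.math.BigInteger;
+
+public class Collatz {
+    private static final BigInteger TWO = BigInteger.valueOf(2);
+    private static final BigInteger THREE = BigInteger.valueOf(3);
+
+    // Counts the steps needed for x to reach 1 (assumes x >= 1).
+    // If the process never reaches 1, this loop never ends.
+    static long steps(BigInteger x) {
+        long count = 0;
+        while (!x.equals(BigInteger.ONE)) {
+            if (x.mod(TWO).equals(BigInteger.ZERO)) {
+                x = x.divide(TWO);                         // even: halve
+            } else {
+                x = x.multiply(THREE).add(BigInteger.ONE); // odd: 3x + 1
+            }
+            count++;
+        }
+        return count;
+    }
+
+    public static void main(String[] args) {
+        System.out.println(steps(BigInteger.valueOf(27)));      // prints 111
+        System.out.println(steps(BigInteger.ONE.shiftLeft(100))); // prints 100
+    }
+}
+```
+
+The second call passes \\(2^{100}\\), an input far outside the 32-bit range, which
+simply halves its way down to 1.
+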
+Why does this matter? Because as of right now, __nobody knows__
+whether or not the process terminates for all possible input numbers.
+We have a strong hunch that it does; we've checked a __lot__
+of numbers and found that the process terminates for them.
+This is why 32-bit integers are not truly sufficient for this example;
+we know empirically that the function will terminate for them.
+
+But why does _this_ matter? Well, it matters because we don't know
+whether or not this function will diverge, and thus, we can't
+'throw it out of the window' like we wanted to with `meaningOfLife`!
+In general, it's _impossible to tell_ whether or not a program will
+terminate; that is the [halting problem](https://en.wikipedia.org/wiki/Halting_problem).
+So, what do we do?
+
+It turns out to be convenient -- formally -- to treat the result of a diverging computation
+as its own value. This value is usually called 'bottom', and written as \\(\\bot\\).
+Since in most programming languages, you can write a nonterminating expression or
+function of any type, this 'bottom' is included in _all_ types. So in fact, the
+set of possible values for `unsigned int` is \\(\\bot, 0, 1, 2, ...\\) and so on.
+As you may have by now guessed, the same is true for a boolean: we have \\(\\bot\\), `true`, and `false`.
+
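+As a small illustration of that claim, the Java sketch below declares two methods that
+never return, each at a different type. The names `Bottom`, `bottomBoolean`,
+`bottomInt`, and `pick` are invented for this sketch. The compiler accepts both methods
+without a `return` statement, because it can see that the loop never completes.
+
+```Java
+public class Bottom {
+    // Type-checks as a boolean, yet never produces true or false.
+    static boolean bottomBoolean() {
+        while (true) { }
+    }
+
+    // The same trick works at any other type, such as int.
+    static int bottomInt() {
+        while (true) { }
+    }
+
+    // A boolean-typed method whose result is false, true, or divergence.
+    static boolean pick(int choice) {
+        if (choice == 0) return false;
+        if (choice == 1) return true;
+        return bottomBoolean(); // never returns
+    }
+
+    public static void main(String[] args) {
+        System.out.println(pick(0)); // prints "false"
+        System.out.println(pick(1)); // prints "true"
+        // pick(2) is deliberately not called: it would hang forever.
+    }
+}
+```
+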
+### Haskell and Bottom
+You may be thinking:
+
+> Now he's done it; he's gone off the deep end with all that programming language
+theory. Tell me, Daniel, where the heck have you ever encountered \\(\\bot\\) in
+code? This question was for a software engineering interview, after all!
+
+You're right; I haven't _specifically_ seen the symbol \\(\\bot\\) in my time
+programming. But I have frequently used an equivalent notation for the same idea:
+`undefined`. In fact, here's a possible definition of `undefined` in Haskell:
+
+```
+undefined = undefined
+```
+
+Just like `meaningOfLife`, this is a divergent computation! What's more is that
+the type of this computation is, in Haskell, `a`. More explicitly -- and retreating
+to more mathematical notation -- we can write this type as: \\(\\forall \\alpha . \\alpha\\).
+That is, for any type \\(\\alpha\\), `undefined` has that type! This means
+`undefined` can take on _any_ type, and so, we can write:
+
+```Haskell
+myTrue :: Bool
+myTrue = True
+
+myFalse :: Bool
+myFalse = False
+
+myBool :: Bool
+myBool = undefined
+```
+
+In Haskell, this is quite useful. For instance, if one's in the middle
+of writing a complicated function, and wants to check their work so far,
+they can put `undefined` for the part of the function they haven't written.
+They can then compile their program; the typechecker will find any mistakes
+they've made so far, but, since the type of `undefined` can be _anything_,
+that part of the program will be accepted without second thought.
+
+The language `Idris` extends this practice with the idea of typed holes,
+where you can leave fragments of your program unwritten, and ask the
+compiler what kind of _thing_ you need to write to fill that hole.
+
+### Java and `null`
+Now you may be thinking:
+
+> This whole deal with Haskell's `undefined` is beside the point; it doesn't
+really count as a value, since it's just a nonterminating
+expression. What you're doing is a kind of academic autofellatio.
+
+Alright, I can accept this criticism. Perhaps just calling a nonterminating
+function a value _is_ far-fetched (even though denotational semantics
+_do_ extend types with \\(\\bot\\)). But denotational semantics is not
+the only place where types are implicitly extended with an extra value;
+let's look at Java.
+
+In Java, we have `null`. At the
+core language level, any function or method that accepts an object of a class type can also take `null`;
+if `null` is not to that function or method's liking, it has to
+explicitly check for it using `if(x == null)`.
+
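+A minimal sketch of such a check follows. The class `NullCheck` and the method
+`lengthOrZero` are hypothetical names chosen for illustration; the point is only
+that the parameter type `String` does nothing to rule out `null`.
+
+```Java
+public class NullCheck {
+    // Accepts any String reference, which includes null.
+    static int lengthOrZero(String s) {
+        if (s == null) {
+            return 0; // handle the extra value explicitly
+        }
+        return s.length();
+    }
+
+    public static void main(String[] args) {
+        System.out.println(lengthOrZero("bool")); // prints 4
+        System.out.println(lengthOrZero(null));   // prints 0
+    }
+}
+```
+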
+Java's booleans are not, at first glance, classes. Unlike classes, which you have
+to allocate using `new`, you can just throw around `true` and `false` as you see
+fit. Also unlike classes, you can't assign `null` to a boolean value.
+The trouble is, the _generics_ part of Java, which allows you to write
+polymorphic functions, can't handle 'primitives' like `boolean`. If you want to have an `ArrayList`
+of something, that something _must_ be a class.
+
+But what if you really _do_ want an `ArrayList` of booleans? Java solves this problem by introducing
+'boxed' booleans: they're primitives wrapped in a class, called `Boolean`. This class
+can then be used for generics.
+
+But see, this is where `null` has snuck in again. By allowing `Boolean` to be a class
+(thereby granting it access to generics), we've also given it the ability to be null.
+This example is made especially compelling because Java supports something
+they call [autoboxing](https://docs.oracle.com/javase/tutorial/java/data/autoboxing.html):
+you can directly assign a primitive to a variable of the corresponding boxed type. 
+Consider the example:
+
+```Java
+Boolean myTrue = true;
+Boolean myFalse = false;
+Boolean myBool = null;
+```
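+
+A slightly longer sketch puts the three values side by side in an `ArrayList` and
+shows what happens when each one is converted back to a primitive. The class
+`BoxedBooleans` and its helper methods are hypothetical names for this illustration;
+`ArrayList`, `Arrays.asList`, and `NullPointerException` are standard Java.
+
+```Java
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.List;
+
+public class BoxedBooleans {
+    // Names whichever of the three possible Boolean references we were given.
+    static String describe(Boolean b) {
+        if (b == null) return "null";
+        return b ? "true" : "false";
+    }
+
+    // Reports whether converting b back to a primitive boolean throws.
+    static boolean unboxingThrows(Boolean b) {
+        try {
+            boolean primitive = b; // auto-unboxing calls b.booleanValue()
+            return false;
+        } catch (NullPointerException e) {
+            return true;
+        }
+    }
+
+    public static void main(String[] args) {
+        // Generics need the boxed type, so null is a legal element.
+        List<Boolean> values = new ArrayList<>(Arrays.asList(true, false, null));
+        for (Boolean b : values) {
+            System.out.println(describe(b) + " -> unboxing throws: " + unboxingThrows(b));
+        }
+    }
+}
+```
+
+Only the `null` element fails to unbox; the type `Boolean` admits it all the same.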