Add my initial slides

Signed-off-by: Danila Fedorin <daniel.fedorin@hpe.com>
2025-10-02 17:52:11 -07:00 · 2025-10-02 17:52:11 -07:00 · 59fea61d2d
commit 59fea61d2d
6 changed files with 1721 additions and 0 deletions
--- a/alloy/abstract.txt
+++ b/alloy/abstract.txt
@ -0,0 +1,22 @@
 Formal methods are a set of techniques that are used to validate the correctness
 of software. A particular category of these methods, model checking, uses the
 mathematical language of temporal logic to construct specifications of software’s
 behavior. A solver can then validate the constraints described in the formal
 language and ensure that undesirable states do not occur.
 This talk will be an experience report of using formal methods, specifically
 the Alloy analyzer, to detect a bug in Chapel’s ‘Dyno’ compiler front-end library.
 The area in which the bug was discovered is currently used in production, as
 well as a part of editor tools such as chplcheck and chpl-language-server.
 Specifically, Alloy was used to construct a formal specification of a part of
 Chapel’s use/import lookup algorithm. Chapel has a number of complicated scoping
 rules and possible edge cases in this area. By running this specification against
 a solver, a sequence of steps was discovered that could cause the algorithm to
 malfunction and produce incorrect results. A program that causes these steps to
 occur was constructed and served as a concrete reproducer for the bug. This
 reproducer was used to adjust the logic and fix the bug.
 This talk will cover the fundamentals of temporal logic required for formal
 specifications, the necessary parts of Chapel’s use/import lookup algorithm,
 and the steps taken to encode and validate the compiler’s behavior.
--- a/alloy/bug.png
+++ b/alloy/bug.png
--- a/alloy/instancefound.png
+++ b/alloy/instancefound.png
--- a/alloy/slides.md
+++ b/alloy/slides.md
@ -0,0 +1,688 @@
 ---
 theme: gaia
 title: Using Formal Methods to Discover a Bug in the Chapel Compiler
 backgroundColor: #fff
 ---
 <!-- _class: lead -->
 <style>
 section {
    font-size: 25px;
 }
 blockquote {
    padding: 15px;
    background-color: #f0f0ff;
 }
 div.side-by-side {
    display: flex;
    justify-content: space-between;
    gap: 20px;
 }
 div.side-by-side > div {
    flex: 1;
 }
 </style>
 # <!--fit--> **Using Formal Methods to Discover a Bug in the Chapel Compiler**
 Daniel Fedorin, HPE
 ---
 # Terms
 * What are **formal methods**?
    * Techniques rooted in computer science and mathematics to specify and verify systems
 * What part of the **Chapel compiler**?
    * The 'Dyno' compiler front-end, particularly its use/import resolution phase.
    * This piece is used by the production Chapel compiler.
 ---
 # The Story
 I have a story in three parts.
 1. I found it very hard to think through a section of compiler code.
    - Specifically, code that performed lookups in `use`s/`import`s
 2. I used the [Alloy Analyzer](https://alloytools.org/) to model the assumptions and behavior of the code.
    - I had little background in Alloy, but some background in (formal) logic
 3. This led me to discover a bug in the compiler that I then fixed.
    - Re-creating the bug required some gymanstics that were unlikely to occur in practice.
 ---
 <!-- _class: lead -->
 # Background: Chapel's Scope Lookups
 ---
 # The Humble `foo`
 Imagine you see the following snippet of Chapel code:
 ```chapel
 foo();
 ```
 Where do you look for `foo`? The answer is quite complicated, and depends
 strongly on the context of the call.
 Moreover, the order of where to look matters: method calls are preferred
 over global functions, "nearer" functions are preferred over "farther" ones.
 ---
 # The Humble `foo` (example 1)
 ```chapel
 module M1 {
  record R {
    proc foo() { writeln("R.foo"); }
  }
  proc foo() { writeln("M1.foo"); }
  foo(); // which 'foo'?
 }
 ```
 Here, things are pretty straightforward: we look in scope `M1`. `R.foo` is not
 in it, but `M1.foo` is. We return it.
 ---
 # The Humble `foo` (example 2)
 ```chapel
 module M1 {
  record R {}
  proc R.foo() { writeln("R.foo"); }
  proc foo() { writeln("M1.foo"); }
  foo(); // which 'foo'?
 }
 ```
 Here, things are bit trickier. Since we are not inside a method, we know
 that `foo()` could not be a call to a method. Thus, we rule out `R.foo`,
 and find `M1.foo` in `M1`.
 ---
 # The Humble `foo` (example 3)
 ```chapel
 module M1 {
  record R {}
  proc R.foo() { writeln("R.foo"); }
  proc foo() { writeln("M1.foo"); }
  proc R.someMethod() {
    foo(); // which 'foo'?
  }
 }
 ```
 * Both `R.foo` and `M1.foo` would be valid candidates.
 * We give priority to methods over global functions. So, the compiler would:
    * Search `R` and its scope (`M1`) for methods named `foo`.
    * If that fails, search `M1` for any symbols `foo`.
 * We've had to look at `M1` twice! (once for methods, once for non-methods)
 ---
 # The Humble `foo` (example 4)
 ```chapel
 module M1 {
  record R {}
  proc foo() { writeln("R.foo"); }
 }
 module M2 {
  use M1;
  proc R.someMethod() {
    foo(); // which 'foo'?
  }
 }
 ```
 Here, we search the scope of `R` and `M1`, but **only for public symbols**.
 ---
 # How Chapel's Compiler Handles This
 We want to:
 - Respect the priority order
  - Including prefering methods over non-methods
  - As a result, we search the scopes multiple times
 - Avoid any extra work
  - This includes redundant re-searches
  - Example redundant search: looked up "all symbols", then later "all public symbols"
 ```C++
 enum { PUBLIC = 1, NOT_PUBLIC = 2, METHOD_FIELD = 4, NOT_METHOD_FIELD = 8, /* ... */ };
 ```
 1. For each scope, save the flags we've already searched with
 2. When searching a scope again, exclude the flags we've already searched with
 This was handled by two bitfields: `filter` and `excludeFilter`.
 ---
 # Populating `filter` and `excludeFilter`
 ```C++
 if (skipPrivateVisibilities) { // depends on context of 'foo()'
  filter |= IdAndFlags::PUBLIC;
 }
 if (onlyMethodsFields) {
  filter |= IdAndFlags::METHOD_FIELD;
 } else if (!includeMethods && receiverScopes.empty()) {
  filter |= IdAndFlags::NOT_METHOD;
 }
 ```
 ```C++
 excludeFilter = previousFilter;
 ```
 ```C++
 // scary!
 previousFilter = filter & previousFilter;
 ```
 Code notes `previousFilter` is an approximation.
 ---
 # Possible Problems
 * `previousFilter` is an approximation.
    * For `previousFilter = PUBLIC` and `filter = METHOD_FIELD`, we get
      `previousFilter = 0`, indicating no more searches should be donne.
    * But we're missing private non-methods!
 * However, no case we knew of hit this combination of searches, or any like it.
    * All of our language tests passed.
    * Code seemed to work.
 * If only there was a way I could get a computer to check whether such a combination
  could occur...
 ---
 <!-- _class: lead -->
 # Formal Methods
 ---
 # Types of Formal Methods
 - Model checking involves formally describing the behavior of a system, then having a solver check whether other desired properties hold.
    - Alloy is an example of a model checker.
    - TLA is another famous example.
 - Theorem proving is a heavier weight approach that involves building a formal proof of correctness.
    - Coq and Isabelle are examples of theorem provers.
 ---
 <style scoped>
 li:nth-child(2) { color: lightgrey; }
 </style>
 # Types of Formal Methods
 - Model checking involves formally describing the behavior of a system, then having a solver check whether other desired properties hold.
    - Alloy is an example of a model checker.
    - TLA is another famous example.
 - Theorem proving is a heavier weight approach that involves building a formal proof of correctness.
    - Coq and Isabelle are examples of theorem provers.
 **Reason**: I was in the middle of developing compiler code. I wanted to sketch
 the assumptions I was making and see if they held up.
 ---
 # A Primer on Logic
 Model checkers like Alloy are rooted in temporal logic, which builds on
 first-order logic. This includes:
 - Variables (e.g. $x$, $y$)
    - These represent any objects in the logical system.
 - Predicates (e.g. $P(x)$, $Q(x,y)$)
    - These represent properties of objects, or relationships between objects.
 - Logical connectives (e.g. $\land$, $\lor$, $\neg$, $\Rightarrow$)
    - These are used to combine predicates into more complex statements.
    - $\land$ is "and", $\lor$ is "or", $\neg$ is "not", $\Rightarrow$ is "implies".
 - Quantifiers (e.g. $\forall$, $\exists$)
    - $\forall x. P(x)$ (in Alloy: `all x { P(x) }`) means "for all $x$, $P(x)$ is true".
    - $\exists x. P(x)$ (in Alloy: `some x { P(x)}`) means "there exists an $x$ such that $P(x)$ is true".
 ---
 # A Primer on Logic (example)
 Example statement: "Bob has a son who likes all compilers".
 $$
 \exists x. (\text{Son}(x, \text{Bob}) \land \forall y. (\text{Compiler}(y) \Rightarrow \text{Likes}(x, y)))
 $$
 In Alloy:
 ```alloy
 some x { Son[x, Bob] and all y { Compiler[y] implies Likes[x, y] } }
 ```
 ---
 # A Primer on Temporal Logic
 Temporal logic provides additional operators to reason about how properties change over time.
 - $\Box p$ (in Alloy: `always p`): A statement that is always true.
 - $\Diamond p$ (in Alloy: `eventually p`) : A statement that will be true at some point in the future.
 In Alloy specifically, we can mention the next state of a variable using `'`.
 ```alloy
 // pseudocode:
 //   the next future value of previousFilter will be the intersection of filter
 //   and the current value
 previousFilter' = filter & previousFilter;
 ```
 ---
 # A Primer on Temporal Logic (example)
 Some examples:
 $$
 \Box(\text{like charges repel})
 $$
 $$
 \Diamond(\text{the sun is in the sky})
 $$
 In Alloy:
 ```alloy
 // likeChargesRepel and theSunIsInTheSky are predicates defined elsewhere
 always likeChargesRepel
 // good thing it's not `always`
 eventually theSunIsInTheSky
 ```
 ---
 # Search Configuration in Alloy
 Instead of duplcating `METHOD` and `NOT_METHOD`, use two sets of flags (the regular and the "not").
 ```alloy
 enum Flag {Method, MethodOrField, Public}
 sig Bitfield {
    , positiveFlags: set Flag
    , negativeFlags: set Flag
 }
 sig FilterState { , curFilter: Bitfield }
 ```
 __Takeaway__: We represent the search flags as a `Bitfield`, which encodes `PUBLIC`, `NOT_PUBLIC`, etc.
 ---
 # Constructing Bitfields
 Alloy doesn't allow us to define functions that somehow combine bitfields. We might want to write:
 ```chapel
 proc addFlag(b: Bitfield, flag: Flag): Bitfield {
    return Bitfield(b.positiveFlags + flag, b.negativeFlags);
 }
 ```
 Instead, we can _relate_ two bitfields using a predicate.
 > This bitfield is like that bitfield, but with this flag added.
 ```chapel
 pred addBitfieldFlag[b1: Bitfield, b2: Bitfield, flag: Flag] {
    b2.positiveFlags = b1.positiveFlags + flag
    b2.negativeFlags = b1.negativeFlags
 }
 ```
 ---
 # Constructing Bitfields
 Alloy doesn't allow us to define functions that somehow combine bitfields. We might want to write:
 ```chapel
 proc addFlag(b: Bitfield, flag: Flag): Bitfield {
    return Bitfield(b.positiveFlags + flag, b.negativeFlags);
 }
 ```
 Instead, we can _relate_ two bitfields using a predicate.
 > This bitfield is exactly like that bitfield.
 ```chapel
 pred bitfieldEqual[b1: Bitfield, b2: Bitfield] {
    b1.positiveFlags = b2.positiveFlags and b1.negativeFlags = b2.negativeFlags
 }
 ```
 ---
 # Modeling Possible Searches
 Alloy isn't an imperative language. We can't mutate variables like we do in C++. Instead, we model how each statement changes the state, by relating the "current" state to the "next" state.
 <div class="side-by-side">
  <div>
 ```C++
  filter |= IdAndFlags::PUBLIC;
 ```
  </div>
  <div>
 ```alloy
  addBitfieldFlag[filterNow, filterNext, Public]
 ```
  </div>
 </div>
 This might remind you of [Hoare Logic](https://en.wikipedia.org/wiki/Hoare_logic), where statements like:
 $$
 \{ P \} \; s \; \{ Q \}
 $$
 Read as:
 > If $P$ is true before executing $s$, then $Q$ will be true after executing $s$.
 $$
 \{ \text{filter} = \text{filterNow} \} \; \texttt{filter |= PUBLIC} \; \{ \text{filter} = \text{filterNext} \}
 $$
 ---
 # Modeling Possible Searches
 To combine several statements, we make it so that the "next" state of one statement is the "current" state of the next statement.
 <div class="side-by-side">
  <div>
 ```C++
  curFilter |= IdAndFlags::PUBLIC;
  curFilter |= IdAndFlags::METHOD_FIELD;
 ```
  </div>
  <div>
 ```alloy
  addBitfieldFlag[filterNow, filterNext1, Public]
  addBitfieldFlag[filterNext1, filterNext2, Method]
 ```
  </div>
 </div>
 This is reminiscent of sequencing hoare triples:
 $$
 \{ P \} \; s_1 \; \{ Q \} \; s_2 \; \{ R \}
 $$
 ---
 # Modeling Possible Searches
 Finally, if C++ code has conditionals, we need to allow for the possibility of either branch being taken. We do this by using "or" on descriptions of the next state.
 <div class="side-by-side">
  <div>
 ```C++
  if (skipPrivateVisibilities) {
    curFilter |= IdAndFlags::PUBLIC;
  }
  if (onlyMethodsFields) {
    curFilter |= IdAndFlags::METHOD_FIELD;
  } else if (!includeMethods && receiverScopes.empty()) {
    curFilter |= IdAndFlags::NOT_METHOD;
  }
 ```
  </div>
  <div>
 ```alloy
  addBitfieldFlag[initialState.curFilter, bitfieldMiddle, Public] or
  bitfieldEqual[initialState.curFilter, bitfieldMiddle]
  // If it's a method receiver, add method or field restriction
  addBitfieldFlag[bitfieldMiddle, filterState.curFilter, MethodOrField] or
  // if it's not a receiver, filter to non-methods (could be overridden)
  addBitfieldFlagNeg[bitfieldMiddle, filterState.curFilter, Method] or
  // Maybe methods are not being curFilterd but it's not a receiver, so no change.
  bitfieldEqual[bitfieldMiddle, filterState.curFilter]
 ```
  </div>
 </div>
 Putting this into a predicate, `possibleState`, we encode what searches the compiler can undertake.
 **Takeaway**: We encoded the logic that configures possible searches in Alloy. This instructs the analyzer about possible cases to consider.
 ---
 # Modeling `previousFilter`
 So far, all we've done is encoded what queries the compiler might make about a scope.
 We still need to encode how we save the flags we've already searched with.
 Model the search state with a "global" (really, unique) variable:
 ```alloy
 /* Initially, no search has happeneed for a scope, so its 'previousFilter' is not set to anything. */
 one sig NotSet {}
 one sig SearchState {
    , var previousFilter: Bitfield + NotSet
 }
 ```
 Above, `+` is used for union. `previousFilter` can either be a `Bitfield` or `NotSet`.
 ---
 # Modeling `previousFilter`
 If no previous search has happened, we set `previousFilter` to the current `filter`.
 Otherwise, we set `previousFilter` to the intersection of `filter` and `previousFilter`, as mentioned before.
 <div class="side-by-side">
  <div>
 ```C++
  if (hasPrevious) {
    previousFilter = filter & previousFilter;
  } else {
    previousFilter = filter;
  }
 ```
  </div>
  <div>
 ```alloy
  pred update[toSet: Bitfield + NotSet, setTo: FilterState] {
 	toSet' != NotSet and bitfieldIntersection[toSet, setTo.include, toSet']
  }
  pred updateOrSet[toSet: Bitfield + NotSet, setTo: FilterState] {
      (toSet = NotSet and toSet' = setTo.include) or
      (toSet != NotSet and update[toSet, setTo])
  }
 ```
  </div>
 </div>
 ---
 # Putting it Together
 We now have a model of what our C++ program is doing: it computes some set of filter flags, then runs a search, excluding the previous flags. It then updates the previous flags with the current search. We can encode
 this as follows:
 ```
 fact step {
    always {
        // Model that a new doLookupInScope could've occurred, with any combination of flags.
        all searchState: SearchState {
            some fs: FilterState {
                // This is a possible combination of lookup flags
                possibleState[fs]
                // If a search has been performed before, take the intersection; otherwise,
                // just insert the current filter flags.
                updateOrSet[searchState.previousFilter, fs]
            }
        }
    }
 }
 ```
 ---
 # Are there any bugs?
 Model checkers ensure that all properties we want to hold, do hold. To find a counter example, we ask it to prove the negation of what we want.
 ```C
 wontFindNeeded: run {
    all searchState: SearchState {
        eventually some props: Props, fs: FilterState, fsBroken: FilterState {
            // Some search (fs) will cause a transition / modification of the search state...
            configureState[fs]
            updateOrSet[searchState.previousFilter, fs]
            // Such that a later, valid search... (fsBroken)
            configureState[fsBroken]
            // Will allow for a set of properties...
            // ... that are left out of the original search...
            not bitfieldMatchesProperties[searchState.previousFilter, props]
            // ... and out of the current search
            not (bitfieldMatchesProperties[fs.include, props] and not bitfieldMatchesProperties[searchState.previousFilter, props])
            // But would be matched by the broken search...
            bitfieldMatchesProperties[fsBroken.include, props]
            // ... to not be matched by a search with the new state:
            not (bitfieldMatchesProperties[fsBroken.include, props] and not bitfieldOrNotSetMatchesProperties[searchState.previousFilter', props])
        }
    }
 }
 ```
 ---
 # Uh-Oh!
 ![](./instancefound.png)
 ---
 # The Bug
 ![bg left width:600px](./bug.png)
 We need some gymnastics to figure out what varibles make this model possible.
 Alloy has a nice visualizer, but it has a lot of information.
 In the interest of time, I found:
 * If the compiler searches a scope firt for `PUBLIC` symbols, ...
 * ...then for `METHOD_OR_FIELD`, ...
 * ...then for any symbols, they will miss things!
 ---
 <style scoped>
 section {
    font-size: 20px;
 }
 </style>
 # The Reproducer
 To trigger this sequence of searches, we needed a lot more gymnastics.
 <div class="side-by-side">
  <div>
 ```chapel
 module TopLevel {
  module XContainerUser {
    public use TopLevel.XContainer;
  }
  module XContainer {
    private var x: int;
    record R {}
    module MethodHaver {
      use TopLevel.XContainerUser;
      use TopLevel.XContainer;
      proc R.foo() {
        var y = x;
      }
    }
  }
 }
 ```
  </div>
  <div>
 * the scope of `R` is searched with for methods
 * The scope of `R`’s parent (`XContainer`) is searched for methods
 * The scope of `XContainerUser` is searched for public symbols (via the `use`)
 * The scope of `XContainer` is searched with public symbols (via the `public use`)
 * The scope of `XContainer` searched for with no filters via the second use; but the stored filter is bad, so the lookup returns early, not finding `x`.
  </div>
 </div>
--- a/type-level/abstract.txt
+++ b/type-level/abstract.txt
@ -0,0 +1,32 @@
 Chapel’s type system can be surprisingly powerful. In addition to “usual”
 features such as generics and polymorphism, Chapel provides the ability to
 manipulate types using functions; this involves both taking types as arguments
 to functions and returning them from these functions. This can enable powerful
 programming techniques that are typically confined to the domain of
 metaprogramming.
 For example, although Chapel’s notion of compile-time values — ‘param’s — is
 limited to primitive types such as integers, booleans, and strings, one can
 encode compile-time lists of these values as types. Such encodings can be used
 to create compile-time specializations of functions that would be otherwise
 tedious to write by hand. One use case for such specializations is the
 implementation of a family of functions for approximating differential
 equations, the Adams-Bashforth methods. Some variants of these methods can be
 encoded as lists of coefficients. Thus, it becomes possible to define a single
 function that accepts a type-level list of coefficients and produces a
 “stamped out” implementation of the corresponding method. This reduces the
 need to implement each method explicitly by hand. Another use case of function
 specialization is a type-safe ‘printf’ function that validates that users’
 format specifiers match the type of arguments to the function.
 More generally, Chapel’s types can be used to encode algebraic sums (disjoint
 unions) and products (Cartesian) of types. This, in turn, makes it possible to
 build arbitrary data structures at the type level. The lists-of-values case
 above is an instance of this general principle. Functions can be defined on
 type-level data structures by relying on overloading and type arguments.
 Programming in this manner starts to resemble programming in a purely
 functional language such as Haskell.
 Though this style of programming has not seen much use thus far, it can be a
 powerful technique for controlling types of arguments or constructing highly
 customized functions with no runtime overhead.
--- a/type-level/slides.md
+++ b/type-level/slides.md
@ -0,0 +1,979 @@
 ---
 theme: gaia
 title: Type-Level Programming in Chapel for Compile-Time Specialization
 backgroundColor: #fff
 ---
 <!-- _class: lead -->
 <style>
 section {
    font-size: 25px;
 }
 blockquote {
    padding: 15px;
    background-color: #f0f0ff;
 }
 div.side-by-side {
    display: flex;
    justify-content: space-between;
    gap: 20px;
 }
 div.side-by-side > div {
    flex: 1;
 }
 </style>
 # **Type-Level Programming in Chapel for Compile-Time Specialization**
 Daniel Fedorin, HPE
 ---
 # Compile-Time Programming in Chapel
 * **Type variables**, as their name suggests, store types instead of values.
  ```Chapel
  type myArgs = (int, real);
  ```
 * **Procedures with `type` return intent** can construct new types.
  ```Chapel
  proc toNilableIfClassType(type arg) type do
    if isNonNilableClassType(arg) then return arg?; else return arg;
  ```
 * **`param` variables** store values that are known at compile-time.
  ```Chapel
  param numberOfElements = 3;
  var threeInts: numberOfElements * int;
  ```
 * **Compile-time conditionals** are inlined at compile-time.
  ```Chapel
  if false then somethingThatWontCompile();
  ```
 ---
 # Restrictions on Compile-Time Programming
 * Compile-time operations do not have mutable state.
  - Cannot change values of `param` or `type` variables.
 * Chapel's compile-time programming does not support loops.
  - `param` loops are kind of an aception, but are simply unrolled.
  - Without mutability, this unrolling doesn't give us much.
 * Without state, our `type` and `param` functions are pure.
 ---
 # Did someone say "pure"?
 I can think of another language that has pure functions...
 * :white_check_mark: Haskell doesn't have mutable state by default.
 * :white_check_mark: Haskell doesn't have imperative loops.
 * :white_check_mark: Haskell functions are pure.
 ---
 # Programming in Haskell
 Without mutability and loops, Haskell programmers use pattern-matching and recursion to express their algorithms.
 * Data structures are defined by enumerating their possible cases. A list is either empty, or a head element followed by a tail list.
  ```Haskell
  data ListOfInts = Nil | Cons Int ListOfInts
  -- [] = Nil
  -- [1] = Cons 1 Nil
  -- [1,2,3] = Cons 1 (Cons 2 (Cons 3 Nil))
  ```
 * Pattern-matching is used to examine the cases of a data structure and act accordingly.
  ```Haskell
  sum :: ListOfInts -> Int
  sum Nil = 0
  sum (Cons i tail) = i + sum tail
  ```
 ---
 # Evaluating Haskell
 Haskell simplifies calls to functions by pickign the case based on the arguments.
 ```Haskell
 sum (Cons 1 (Cons 2 (Cons 3 Nil)))
 -- case: sum (Cons i tail) = i + sum tail
 = 1 + sum (Cons 2 (Cons 3 Nil))
 -- case: sum (Cons i tail) = i + sum tail
 = 1 + (2 + sum (Cons 3 Nil))
 -- case: sum (Cons i tail) = i + sum tail
 = 1 + (2 + (3 + sum Nil))
 -- case: sum Nil = 0
 = 1 + (2 + (3 + 0))
 = 6
 ```
 ---
 # A Familiar Pattern
 Picking a case based on the arguments is very similar to Chapel's function overloading.
 * **A very familiar example**:
  ```Chapel
  proc foo(x: int) { writeln("int"); }
  proc foo(x: real) { writeln("real"); }
  foo(1);   // prints "int"
  ```
 * **A slightly less familiar example**:
  ```Chapel
  proc foo(type x: int) { compilerWarning("int"); }
  proc foo(type x: real) { compilerWarning("real"); }
  foo(int);   // compiler prints "int"
  ```
 ---
 # A Type-Level List
 Hypothesis: we can use Chapel's function overloading and types to write functional-ish programs.
 <div class="side-by-side">
  <div>
 ```Chapel
 record Nil {}
 record Cons { param head: int; type tail; }
 type myList = Cons(1, Cons(2, Cons(3, Nil)));
 proc sum(type x: Nil) param do return 0;
 proc sum(type x: Cons(?i, ?tail)) param do return i + sum(tail);
 compilerWarning(sum(myList) : string); // compiler prints 6
 ```
  </div>
  <div>
 ```Haskell
 data ListOfInts = Nil
                | Cons Int ListOfInts
 myList = Cons 1 (Cons 2 (Cons 3 Nil))
 sum :: ListOfInts -> Int
 sum Nil = 0
 sum (Cons i tail) = i + sum tail
 ```
  </div>
 </div>
 ---
 # Type-Level Programming at Compile-Time
 After resolution, our original program:
 ```Chapel
 record Nil {}
 record Cons { param head: int; type tail; }
 type myList = Cons(1, Cons(2, Cons(3, Nil)));
 proc sum(type x: Nil) param do return 0;
 proc sum(type x: Cons(?i, ?tail)) param do return i + sum(tail);
 writeln(sum(myList) : string); // compiler prints 6
 ```
 Becomes:
 ```Chapel
 writeln("6");
 ```
 There is no runtime overhead!
 ---
 # Type-Level Programming at Compile-Time
 > Why would I want to do this?!
 >
 > \- You, probably
 * Do you want to write parameterized code, without paying runtime overhead for the runtime parameters?
  - **Worked example**: linear multi-step method approximator
 * Do you want to have powerful compile-time checks and constraints on your function types?
  - **Worked example**: type-safe `printf` function
 ---
 <!-- _class: lead -->
 # Linear Multi-Step Method Approximator
 ---
 # Euler's Method
 A first-order differential equation can be written in the following form:
 $$
 y' = f(t, y)
 $$
 In other words, the derivative of of $y$ depends on $t$ and $y$ itself. There is no solution to this equation in general; we have to approximate.
 If we know an initial point $(t_0, y_0)$, we can approximate other points. To get the point at $t_1 = t_0 + h$, we can use the formula:
 $$
 \begin{align*}
 y'(t_0) & = f(t_0, y_0) \\
 y(t_0+h) & \approx y_0 + h \times y'(t_0) \\
         & \approx y_0 + h \times f(t_0, y_0)
 \end{align*}
 $$
 We can name the the first approximated $y$-value $y_1$, and set it:
 $$
 y_1 = y_0 + h \times f(t_0, y_0)
 $$
 ---
 # Euler's Method
 On the previous slide, we got a new point $(t_1, y_1)$. We can repeat the process to get $y_2$:
 $$
 \begin{array}{c}
 y_2 = y_1 + h \times f(t_1, y_1) \\
 y_3 = y_2 + h \times f(t_2, y_2) \\
 y_4 = y_3 + h \times f(t_3, y_3) \\
 \cdots \\
 y_{n+1} = y_n + h \times f(t_n, y_n) \\
 \end{array}
 $$
 ---
 # Euler's Method in Chapel
 This can be captured in a simple Chapel procedure:
 ```Chapel
 proc runEulerMethod(step: real, count: int, t0: real, y0: real) {
    var y = y0;
    var t = t0;
    for i in 1..count {
        y += step*f(t,y);
        t += step;
    }
    return y;
 }
 ```
 ---
 # Other Methods
 * In Euler's method, we look at the slope of a function at a particular point, and use it to extrapolate the next point.
 * Once we've computed a few points, we have more information we can incorporate.
    - When computing $y_2$, we can use both $y_0$ and $y_1$.
    - To get a good approximation, we have to weight the points differently.
      $$
      y_{n+2} = y_{n+1} + h \left(\frac{3}{2}f(t_{n+1}, y_{n+1}) - \frac{1}{2}f(t_{n}, y_{n})\right)
      $$
    - More points means better accuracy, but more computation.
 * There are other methods that use more points and different weights.
    - Another method is as follows:
      $$
      y_{n+3} = y_{n+2} + h \left(\frac{23}{12}f(t_{n+2}, y_{n+2}) - \frac{16}{12}f(t_{n+1}, y_{n+1}) + \frac{5}{12}f(t_{n}, y_{n})\right)
      $$
 ---
 # Generalizing Multi-Step Methods
 Explicit Adams-Bashforth methods in general can be encoded as the coefficients used to weight the previous points.
 | Method                    | Equation                             | Coefficient List
 |---------------------------|--------------------------------------|-----------------
 | Euler's method            | $y_{n+1} = y_n + h \times f(t_n, y_n)$ | $1$
 | Two-step A.B.             | $y_{n+2} = y_{n+1} + h \left(\frac{3}{2}f(t_{n+1}, y_{n+1}) - \frac{1}{2}f(t_{n}, y_{n})\right)$ | $\frac{3}{2},-\frac{1}{2}$
 ---
 # Generalizing Multi-Step Methods
 Explicit Adams-Bashforth methods in general can be encoded as the coefficients used to weight the previous points.
 | Method                    | Equation                             | Chapel Type Expression
 |---------------------------|--------------------------------------|-----------------
 | Euler's method            | $y_{n+1} = y_n + h \times f(t_n, y_n)$ | `Cons(1,Nil)`
 | Two-step A.B.             | $y_{n+2} = y_{n+1} + h \left(\frac{3}{2}f(t_{n+1}, y_{n+1}) - \frac{1}{2}f(t_{n}, y_{n})\right)$ | `Cons(3/2,Cons(-1/2, Nil))`
 ---
 # Supporting Functions for Coefficient Lists
 ```Chapel
 proc length(type x: Cons(?w, ?t)) param do return 1 + length(t);
 proc length(type x: Nil)          param do return 0;
 proc coeff(param x: int, type lst: Cons(?w, ?t)) param where x == 0 do return w;
 proc coeff(param x: int, type lst: Cons(?w, ?t)) param where x  > 0 do return coeff(x-1, t);
 ```
 ---
 # A General Solver
 ```Chapel
 proc runMethod(type method, h: real, count: int, start: real,
               in ys: real ... length(method)): real {
 ```
 * `type method` accepts a type-level list of coefficients.
 * `h` encodes the step size.
 * `start` is $t_0$, the initial time.
 * `count` is the number of steps to take.
 * `in ys` makes the function accept as many `real` values (for $y_0, y_1, \ldots$) as there are weights
 ---
 # A General Solver
 ```Chapel
 param coeffCount = length(method);
 // Repeat the methods as many times as requested
 for i in 1..count {
    // We're computing by adding h*b_j*f(...) to y_n.
    // Set total to y_n.
    var total = ys(coeffCount - 1);
    // 'for param' loops are unrolled at compile-time -- this is just
    // like writing out each iteration by hand.
    for param j in 1..coeffCount do
        // For each coefficient b_j given by coeff(j, method),
        // increment the total by h*bj*f(...)
        total += step * coeff(j, method) *
            f(start + step*(i-1+coeffCount-j), ys(coeffCount-j));
    // Shift each y_i over by one, and set y_{n+s} to the
    // newly computed total.
    for param j in 0..< coeffCount - 1 do
        ys(j) = ys(j+1);
    ys(coeffCount - 1) = total;
 }
 // return final y_{n+s}
 return ys(coeffCount - 1);
 ```
 ---
 # Using the General Solver
 ```Chapel
 type euler = cons(1.0, empty);
 type adamsBashforth = cons(3.0/2.0, cons(-0.5, empty));
 type someThirdMethod = cons(23.0/12.0, cons(-16.0/12.0, cons(5.0/12.0, empty)));
 ```
 Take a simple differential equation $y' = y$. For this, define `f` as follows:
 ```Chapel
 proc f(t: real, y: real) do return y;
 ```
 Now, we can run Euler's method like so:
 ```Chapel
 writeln(runMethod(euler, step=0.5, count=4, start=0, 1)); // 5.0625
 ```
 To run the 2-step Adams-Bashforth method, we need two initial values:
 ```Chapel
 var y0 = 1.0;
 var y1 = runMethod(euler, step=0.5, count=1, start=0, 1);
 writeln(runMethod(adamsBashforth, step=0.5, count=3, start=0.5, y0, y1)); // 6.02344
 ```
 ---
 # The General Solver
 We can now construct solvers for any explicit Adams-Bashforth method, without writing any new code.
 ---
 <!-- _class: lead -->
 # Type-Safe `printf`
 ---
 # The `printf` Function
 The `printf` function accepts a format string, followed by a variable number of arguments that should match:
 ```C
 // totally fine:
 printf("Hello, %s! Your ChapelCon submission is #%d\n", "Daniel", 18);
 // not good:
 printf("Hello, %s! Your ChapelCon submission is #%d\n", 18, "Daniel");
 ```
 Can we define a `printf` function in Chapel that is type-safe?
 ---
 # Yet Another Type-Level List
 - The general idea for type-safe `printf`: take the format string, and extract a list of the expected argument types.
 - To make for nicer error messages, include a human-readable description of each type in the list.
 - I've found it more convenient to re-define lists for various problems when needed, rather than having a single canonical list definition.
 ```chapel
 record _nil {
  proc type length param do return 0;
 }
 record _cons {
  type expectedType;  // type of the argument to printf
  param name: string; // human-readable name of the type
  type rest;
  proc type length param do return 1 + rest.length();
 }
 ```
 ---
 # Extracting Types from Format Strings
 ```Chapel
 proc specifiers(param s: string, param i: int) type {
  if i >= s.size then return _nil;
  if s[i] == "%" {
    if i + 1 >= s.size then
        compilerError("Invalid format string: unterminted %");
    select s[i + 1] {
      when "%" do return specifiers(s, i + 2);
      when "s" do return _cons(string, "a string", specifiers(s, i + 2));
      when "i" do return _cons(int, "a signed integer", specifiers(s, i + 2));
      when "u" do return _cons(uint, "an unsigned integer", specifiers(s, i + 2));
      when "n" do return _cons(numeric, "a numeric value", specifiers(s, i + 2));
      otherwise do compilerError("Invalid format string: unknown format type");
    }
  } else {
    return specifiers(s, i + 1);
  }
 }
 ```
 ---
 # Extracting Types from Format Strings
 Let's give it a quick try:
 ```Chapel
 writeln(specifiers("Hello, %s! Your ChapelCon submission is #%i\n", 0) : string);
 ```
 The above prints:
 ```Chapel
 _cons(string,"a string",_cons(int(64),"a signed integer",_nil))
 ```
 ---
 # Validating Argument Types
 * The Chapel standard library has a nice `isSubtype` function that we can use to check if an argument matches the expected type.
 * Suppose the `.length` of our type specifiers matches the number of arguments to `printf`
 * Chapel doesn't currently support empty tuples, so if the lengths match, we know that `specifiers` is non-empty.
 * Then, we can validate the types as follows:
  ```Chapel
  proc validate(type specifiers: _cons(?t, ?s, ?rest), type argTup, param idx) {
    if !isSubtype(argTup[idx], t) then
      compilerError("Argument " + (idx + 1) : string + " should be " + s + " but got " + argTup[idx]:string, idx+2);
    if idx + 1 < argTup.size then
      validate(rest, argTup, idx + 1);
  }
  ```
 * The `idx+2` argument to `compilerError` avoids printing the recursive `validate` calls in the error message.
 ---
 # The `fprintln` overloads
 * I named it `fprintln` for "formatted print line".
 * To support the empty-specifier case (Chapel varargs don't allow zero arguments):
  ```Chapel
  proc fprintln(param format: string) where specifiers(format, 0).length == 0 {
    writeln(format);
  }
  ```
 * If we do have type specifiers, to ensure our earlier assumption of `size` matching:
  ```Chapel
  proc fprintln(param format: string, args...)
    where specifiers(format, 0).length != args.size {
    compilerError("'fprintln' with this format string expects " +
                  specifiers(format, 0).length : string +
                  " argument(s) but got " + args.size : string);
  }
  ```
 ---
 # The `fprintln` overloads
 * All that's left is the main `fprintln` implementation:
  ```Chapel
  proc fprintln(param format: string, args...) {
    validate(specifiers(format, 0), args.type, 0);
    writef(format + "\n", (...args));
  }
  ```
 ---
 # Using `fprintln`
 ```Chapel
 fprintln("Hello, world!");        // fine, prints "Hello, world!"
 fprintln("The answer is %i", 42); // fine, prints "The answer is 42"
 // compiler error: Argument 3 should be a string but got int(64)
 fprintln("The answer is %i %i %s", 1, 2, 3);
 ```
 More work could be done to support more format specifiers, escapes, etc., but the basic idea is there.
 ---
 <!-- _class: lead -->
 # Beyond Lists
 ---
 # Beyond Lists
 * I made grand claims earlier
    - "Write functional-ish program at the type level!"
 * So far, we've just used lists and some recursion.
 * Is that all there is?
 ---
 # Algebraic Data Types
 * The kinds of data types that Haskell supports are called *algebraic data types*.
 * At a fundamental level, they can be built up from two operations: _cartesian product_ and _disjoint union_.
 * There are other concepts to build recursive data types, but we won't need them in Chapel.
  - To prove to you I know what I'm talking about, some jargon:
    _initial algebras_, _the fixedpoint functor_, _catamorphisms_...
  - Check out _Bananas, Lenses, Envelopes and Barbed Wire_ by Meijer et al. for more.
 * __This matters because, if Chapel has these operations, we can build any data type that Haskell can.__
 ---
 <style scoped>
 li:nth-child(3) { color: lightgrey; }
 </style>
 # Algebraic Data Types
 - The kinds of data types that Haskell supports are called *algebraic data types*.
 - At a fundamental level, they can be built up from two operations: _cartesian product_ and _disjoint union_.
 - There are other concepts to build recursive data types, but we won't need them in Chapel.
  - To prove to you I know what I'm talking about, some jargon:
    _initial algebras_, _the fixedpoint functor_, _catamorphisms_...
  - Check out _Bananas, Lenses, Envelopes and Barbed Wire_ by Meijer et al. for more.
 - __This matters because, if Chapel has these operations, we can build any data type that Haskell can.__
 ---
 # Cartesian Product
 For any two types, the _cartesian product_ of these two types defines all pairs of values from these types.
  - This is like a two-element tuple _at the value level_ in Chapel.
  - We write this as $A \times B$ for two types $A$ and $B$.
  - In (type-level) Chapel and Haskell:
    <div class=side-by-side>
      <div>
      ```Chapel
      record Pair {
          type fst;
          type snd;
      }
      type myPair = Pair(myVal1, myVal2);
      ```
      </div>
      <div>
      ```Haskell
      data Pair = MkPair
          { fst :: A
          , snd :: B
          }
      myPair = MkPair myVal1 myVal2
      ```
      </div>
    </div>
 ---
 # Disjoint Union
 For any two types, the _disjoint union_ of these two types defines values that are either from one type or the other.
  - This is _almost_ like a `union` in Chapel or C...
  - But there's extra information to tell us which of the two types the value is from.
  - We write this as $A + B$ for two types $A$ and $B$.
  - In Chapel and Haskell:
    <div class=side-by-side>
      <div>
      ```Chapel
      record InL { type value; }
      record InR { type value; }
      type myFirstCase = InL(myVal1);
      type mySecondCase = InR(myVal2);
      ```
      </div>
      <div>
      ```Haskell
      data Sum
        = InL A
        | InR B
      myFirstCase = InL myVal1
      mySecondCase = InR myVal2
      ```
      </div>
    </div>
 ---
 # Algebraic Data Types
 * We can build up more compelx types by combining these two operations.
    * Need a triple of types $A$, $B$, and $C$? Use $A \times (B \times C)$.
    * Similarly, "any one of three types" can be expressed as $A + (B + C)$.
    * A `Result<T>` type (in Rust, or `optional<T>` in C++) is $T + \text{Unit}$.
      * `Unit` is a type with a single value (there's only one `None` / `std::nullopt`).
 * Notice that in Chapel, we moved up one level
  | Thing | Chapel           | Haskell           |
  |-------|------------------|-------------------|
  | `Nil` | type             | value             |
  | `Cons`| type constructor | value constructor |
  | List  | **???**          | type              |
 ---
 # Algebraic Data Types
 * Since Chapel has no notion of a type-of-types, we can't enforce that our values are _only_ `InL` or `InR` (in the case of `Sum`).
 * This is why, in Chapel versions, type annotations like `A` and `B` are missing.
  <div class=side-by-side>
    <div>
    ```Chapel
    record Pair {
        type fst; /* : A */
        type snd; /* : B */
    }
    ```
    </div>
    <div>
    ```Haskell
    data Pair = MkPair
        { fst :: A
        , snd :: B
        }
    ```
    </div>
  </div>
 * So, we can't enforce that the user doesn't pass `int` to our `length` function defined on lists.
 * We also can't enforce that `InL` is instantiated with the right type.
 * So, we lose some safety compare to Haskell...
 * ...but we're getting the compiler to do arbitrary computations for us at compile-time.
 ---
 # Worked Example: Binary Search Tree
 In Haskell, binary search trees can be defined as follows:
 ```Haskell
 data BSTree = Empty
            | Node Int BSTree BSTree
 balancedOneTwoThree = Node 2 (Node 1 Empty Empty) (Node 3 Empty Empty)
 ```
 Written using Algebraic Data Types, this is:
 $$
 \text{BSTree} = \text{Unit} + (\text{Int} \times \text{BSTree} \times \text{BSTree})
 $$
 In Haskell (using sums and products):
 ```Haskell
 type BSTree' = Unit `Sum` (Int `Pair` (BSTree' `Pair` BSTree'))
 balancedOneTwoThree' = InR (2 `MkPair` (InR (1 `MkPair` (InL MkUnit `MkPair` InL Unit)) `MkPair`
                                        InR (3 `MkPair` (InL MkUnit `MkPair` InL Unit))))
 ```
 ---
 # Worked Example: Binary Search Tree
 * Recalling the Haskell version:
  ```Haskell
  type BSTree' = Unit `Sum` (Int `Pair` (BSTree' `Pair` BSTree'))
  balancedOneTwoThree' = InR (2 `MkPair` (InR (1 `MkPair` (InL MkUnit `MkPair` InL Unit)) `MkPair`
                                          InR (3 `MkPair` (InL MkUnit `MkPair` InL Unit))))
  ```
 * We can't define `BSTree'` in Chapel (no type-of-types), but we can define `balancedOneTwoThree'`:
  ```Chapel
  type balancedOneTwoThree =
    InR(Pair(2, Pair(InR(Pair(1, Pair(InL(), InL()))),
                     InR(Pair(3, Pair(InL(), InL()))))));
  ```
 * :white_check_mark: We can use algebraic data types to build arbitrarily complex data structures  ◼.
 ---
 # Returning to Pragmatism
 * We could've defined our list type in terms of `InL`, `InR`, and `Pair`.
 * However, it was cleaner to make it look more like the non-ADT Haskell version.
 * Recall that it looked like this:
  <div class="side-by-side">
    <div>
  ```Chapel
  record Nil {}
  record Cons { param head: int; type tail; }
  type myList = Cons(1, Cons(2, Cons(3, Nil)));
  ```
    </div>
    <div>
  ```Haskell
  data ListOfInts = Nil
                  | Cons Int ListOfInts
  myList = Cons 1 (Cons 2 (Cons 3 Nil))
  ```
    </div>
  </div>
 * We can do the same thing for our binary search tree:
  <div class="side-by-side">
    <div>
  ```Chapel
  record Empty {}
  record Node { param value: int; type left; type right; }
  type balancedOneTwoThree = Node(2, Node(1, Empty, Empty),
                                     Node(3, Empty, Empty));
  ```
    </div>
    <div>
  ```Haskell
  data BSTree = Empty
              | Node Int BSTree BSTree
  balancedOneTwoThree = Node 2 (Node 1 Empty Empty)
                               (Node 3 Empty Empty)
  ```
    </div>
  </div>
 ---
 # A General Recipe
 To translate a Haskell data type definition to Chapel:
 * For each constructor, define a `record` with that constructor's name
 * The fields of that record are `type` fields for each argument of the constructor
  - If the argument is a value (like `Int`), you can make it a `param` field instead
 * A visual example, again:
  <div class="side-by-side">
    <div>
  ```Chapel
  record C1 { type arg1; /* ... */ type argi; }
  // ...
  record Cn { type arg1; /* ... */ type argj; }
  ```
    </div>
    <div>
  ```Haskell
  data T = C1 arg1 ... argi
         | ...
         | Cn arg1 ... argj
  ```
    </div>
  </div>
 ---
 # Inserting and Looking Up in a BST
 <div class="side-by-side">
  <div>
 ```Chapel
 proc insert(type t: Empty, param x: int) type do return Node(x, Empty, Empty);
 proc insert(type t: Node(?v, ?left, ?right), param x: int) type do
    select true {
      when x < v do return Node(v, insert(left, x), right);
      otherwise do return Node(v, left, insert(right, x));
    }
 type test = insert(insert(insert(Empty, 2), 1), 3);
 proc lookup(type t: Empty, param x: int) param do return false;
 proc lookup(type t: Node(?v, ?left, ?right), param x: int) param do
    select true {
      when x == v do return true;
      when x < v do return lookup(left, x);
      otherwise do return lookup(right, x);
    }
 ```
  </div>
  <div>
 ```Haskell
 insert :: Int -> BSTree -> BSTree
 insert x Empty = Node x Empty Empty
 insert x (Node v left right)
    | x < v     = Node v (insert x left) right
    | otherwise = Node v left (insert x right)
 test = insert 3 (insert 1 (insert 2 Empty))
 lookup :: Int -> BSTree -> Bool
 lookup x Empty = False
 lookup x (Node v left right)
    | x == v    = True
    | x < v     = lookup x left
    | otherwise = lookup x right
 ```
  </div>
 </div>
 It really works!
 ```Chapel
 writeln(test : string);
 // prints Node(2,Node(1,Empty,Empty),Node(3,Empty,Empty))
 writeln(lookup(test, 1));
 // prints true for this one, but false for '4'
 ```
 ---
 # A Key-Value Map
 ```Chapel
 record Empty {}
 record Node { param key: int; param value; type left; type right; }
 proc insert(type t: Empty, param k: int, param v) type do return Node(k, v, Empty, Empty);
 proc insert(type t: Node(?k, ?v, ?left, ?right), param nk: int, param nv) type do
    select true {
      when nk < k do return Node(k, v, insert(left, nk, nv), right);
      otherwise do return Node(k, v, left, insert(right, nk, nv));
    }
 proc lookup(type t: Empty, param k: int) param do return "not found";
 proc lookup(type t: Node(?k, ?v, ?left, ?right), param x: int) param do
    select true {
      when x == k do return v;
      when x < k do return lookup(left, x);
      otherwise do return lookup(right, x);
    }
 type test = insert(insert(insert(Empty, 2, "two"), 1, "one"), 3, "three");
 writeln(lookup(test, 1)); // prints "one"
 writeln(lookup(test, 3)); // prints "three"
 writeln(lookup(test, 4)); // prints "not found"
 ```
 ---
 # Conclusion
 * Chapel's type-level programming is surprisingly powerful.
 * We can write compile-time programs that are very similar to Haskell programs.
 * This allows us to write highly parameterized code without paying runtime overhead.
 * This also allows us to devise powerful compile-time checks and constraints on our code.
 * This approach allows for general-purpose programming, which can be applied to `your use-case`