Add my initial slides
Signed-off-by: Danila Fedorin <daniel.fedorin@hpe.com>
This commit is contained in:
commit
59fea61d2d
22
alloy/abstract.txt
Normal file
22
alloy/abstract.txt
Normal file
@ -0,0 +1,22 @@
|
||||
Formal methods are a set of techniques that are used to validate the correctness
|
||||
of software. A particular category of these methods, model checking, uses the
|
||||
mathematical language of temporal logic to construct specifications of software’s
|
||||
behavior. A solver can then validate the constraints described in the formal
|
||||
language and ensure that undesirable states do not occur.
|
||||
|
||||
This talk will be an experience report of using formal methods, specifically
|
||||
the Alloy analyzer, to detect a bug in Chapel’s ‘Dyno’ compiler front-end library.
|
||||
The area in which the bug was discovered is currently used in production, as
|
||||
well as a part of editor tools such as chplcheck and chpl-language-server.
|
||||
|
||||
Specifically, Alloy was used to construct a formal specification of a part of
|
||||
Chapel’s use/import lookup algorithm. Chapel has a number of complicated scoping
|
||||
rules and possible edge cases in this area. By running this specification against
|
||||
a solver, a sequence of steps was discovered that could cause the algorithm to
|
||||
malfunction and produce incorrect results. A program that causes these steps to
|
||||
occur was constructed and served as a concrete reproducer for the bug. This
|
||||
reproducer was used to adjust the logic and fix the bug.
|
||||
|
||||
This talk will cover the fundamentals of temporal logic required for formal
|
||||
specifications, the necessary parts of Chapel’s use/import lookup algorithm,
|
||||
and the steps taken to encode and validate the compiler’s behavior.
|
BIN
alloy/bug.png
Normal file
BIN
alloy/bug.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 134 KiB |
BIN
alloy/instancefound.png
Normal file
BIN
alloy/instancefound.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 40 KiB |
688
alloy/slides.md
Normal file
688
alloy/slides.md
Normal file
@ -0,0 +1,688 @@
|
||||
---
|
||||
theme: gaia
|
||||
title: Using Formal Methods to Discover a Bug in the Chapel Compiler
|
||||
backgroundColor: #fff
|
||||
---
|
||||
<!-- _class: lead -->
|
||||
<style>
|
||||
section {
|
||||
font-size: 25px;
|
||||
}
|
||||
blockquote {
|
||||
padding: 15px;
|
||||
background-color: #f0f0ff;
|
||||
}
|
||||
|
||||
div.side-by-side {
|
||||
display: flex;
|
||||
justify-content: space-between;
|
||||
gap: 20px;
|
||||
}
|
||||
|
||||
div.side-by-side > div {
|
||||
flex: 1;
|
||||
}
|
||||
</style>
|
||||
|
||||
# <!--fit--> **Using Formal Methods to Discover a Bug in the Chapel Compiler**
|
||||
Daniel Fedorin, HPE
|
||||
|
||||
---
|
||||
|
||||
# Terms
|
||||
|
||||
* What are **formal methods**?
|
||||
* Techniques rooted in computer science and mathematics to specify and verify systems
|
||||
* What part of the **Chapel compiler**?
|
||||
* The 'Dyno' compiler front-end, particularly its use/import resolution phase.
|
||||
* This piece is used by the production Chapel compiler.
|
||||
|
||||
---
|
||||
|
||||
# The Story
|
||||
|
||||
I have a story in three parts.
|
||||
|
||||
1. I found it very hard to think through a section of compiler code.
|
||||
- Specifically, code that performed lookups in `use`s/`import`s
|
||||
|
||||
2. I used the [Alloy Analyzer](https://alloytools.org/) to model the assumptions and behavior of the code.
|
||||
- I had little background in Alloy, but some background in (formal) logic
|
||||
|
||||
3. This led me to discover a bug in the compiler that I then fixed.
|
||||
- Re-creating the bug required some gymanstics that were unlikely to occur in practice.
|
||||
|
||||
---
|
||||
|
||||
<!-- _class: lead -->
|
||||
# Background: Chapel's Scope Lookups
|
||||
|
||||
---
|
||||
|
||||
# The Humble `foo`
|
||||
Imagine you see the following snippet of Chapel code:
|
||||
|
||||
```chapel
|
||||
foo();
|
||||
```
|
||||
|
||||
Where do you look for `foo`? The answer is quite complicated, and depends
|
||||
strongly on the context of the call.
|
||||
|
||||
Moreover, the order of where to look matters: method calls are preferred
|
||||
over global functions, "nearer" functions are preferred over "farther" ones.
|
||||
|
||||
---
|
||||
|
||||
# The Humble `foo` (example 1)
|
||||
|
||||
```chapel
|
||||
module M1 {
|
||||
record R {
|
||||
proc foo() { writeln("R.foo"); }
|
||||
}
|
||||
proc foo() { writeln("M1.foo"); }
|
||||
|
||||
foo(); // which 'foo'?
|
||||
}
|
||||
```
|
||||
|
||||
Here, things are pretty straightforward: we look in scope `M1`. `R.foo` is not
|
||||
in it, but `M1.foo` is. We return it.
|
||||
|
||||
---
|
||||
|
||||
# The Humble `foo` (example 2)
|
||||
|
||||
```chapel
|
||||
module M1 {
|
||||
record R {}
|
||||
|
||||
proc R.foo() { writeln("R.foo"); }
|
||||
proc foo() { writeln("M1.foo"); }
|
||||
|
||||
foo(); // which 'foo'?
|
||||
}
|
||||
```
|
||||
|
||||
Here, things are bit trickier. Since we are not inside a method, we know
|
||||
that `foo()` could not be a call to a method. Thus, we rule out `R.foo`,
|
||||
and find `M1.foo` in `M1`.
|
||||
|
||||
---
|
||||
|
||||
# The Humble `foo` (example 3)
|
||||
|
||||
```chapel
|
||||
module M1 {
|
||||
record R {}
|
||||
|
||||
proc R.foo() { writeln("R.foo"); }
|
||||
proc foo() { writeln("M1.foo"); }
|
||||
|
||||
proc R.someMethod() {
|
||||
foo(); // which 'foo'?
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
* Both `R.foo` and `M1.foo` would be valid candidates.
|
||||
* We give priority to methods over global functions. So, the compiler would:
|
||||
* Search `R` and its scope (`M1`) for methods named `foo`.
|
||||
* If that fails, search `M1` for any symbols `foo`.
|
||||
* We've had to look at `M1` twice! (once for methods, once for non-methods)
|
||||
|
||||
---
|
||||
|
||||
# The Humble `foo` (example 4)
|
||||
|
||||
```chapel
|
||||
module M1 {
|
||||
record R {}
|
||||
|
||||
proc foo() { writeln("R.foo"); }
|
||||
}
|
||||
module M2 {
|
||||
use M1;
|
||||
|
||||
proc R.someMethod() {
|
||||
foo(); // which 'foo'?
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Here, we search the scope of `R` and `M1`, but **only for public symbols**.
|
||||
|
||||
---
|
||||
|
||||
|
||||
# How Chapel's Compiler Handles This
|
||||
|
||||
We want to:
|
||||
- Respect the priority order
|
||||
- Including prefering methods over non-methods
|
||||
- As a result, we search the scopes multiple times
|
||||
- Avoid any extra work
|
||||
- This includes redundant re-searches
|
||||
- Example redundant search: looked up "all symbols", then later "all public symbols"
|
||||
|
||||
```C++
|
||||
enum { PUBLIC = 1, NOT_PUBLIC = 2, METHOD_FIELD = 4, NOT_METHOD_FIELD = 8, /* ... */ };
|
||||
```
|
||||
|
||||
1. For each scope, save the flags we've already searched with
|
||||
2. When searching a scope again, exclude the flags we've already searched with
|
||||
|
||||
This was handled by two bitfields: `filter` and `excludeFilter`.
|
||||
|
||||
---
|
||||
|
||||
# Populating `filter` and `excludeFilter`
|
||||
|
||||
```C++
|
||||
if (skipPrivateVisibilities) { // depends on context of 'foo()'
|
||||
filter |= IdAndFlags::PUBLIC;
|
||||
}
|
||||
|
||||
if (onlyMethodsFields) {
|
||||
filter |= IdAndFlags::METHOD_FIELD;
|
||||
|
||||
} else if (!includeMethods && receiverScopes.empty()) {
|
||||
filter |= IdAndFlags::NOT_METHOD;
|
||||
}
|
||||
```
|
||||
|
||||
```C++
|
||||
excludeFilter = previousFilter;
|
||||
```
|
||||
|
||||
```C++
|
||||
// scary!
|
||||
previousFilter = filter & previousFilter;
|
||||
```
|
||||
|
||||
Code notes `previousFilter` is an approximation.
|
||||
|
||||
---
|
||||
# Possible Problems
|
||||
|
||||
* `previousFilter` is an approximation.
|
||||
* For `previousFilter = PUBLIC` and `filter = METHOD_FIELD`, we get
|
||||
`previousFilter = 0`, indicating no more searches should be donne.
|
||||
* But we're missing private non-methods!
|
||||
* However, no case we knew of hit this combination of searches, or any like it.
|
||||
* All of our language tests passed.
|
||||
* Code seemed to work.
|
||||
* If only there was a way I could get a computer to check whether such a combination
|
||||
could occur...
|
||||
|
||||
---
|
||||
|
||||
<!-- _class: lead -->
|
||||
# Formal Methods
|
||||
|
||||
---
|
||||
# Types of Formal Methods
|
||||
|
||||
- Model checking involves formally describing the behavior of a system, then having a solver check whether other desired properties hold.
|
||||
- Alloy is an example of a model checker.
|
||||
- TLA is another famous example.
|
||||
- Theorem proving is a heavier weight approach that involves building a formal proof of correctness.
|
||||
- Coq and Isabelle are examples of theorem provers.
|
||||
---
|
||||
<style scoped>
|
||||
li:nth-child(2) { color: lightgrey; }
|
||||
</style>
|
||||
# Types of Formal Methods
|
||||
|
||||
- Model checking involves formally describing the behavior of a system, then having a solver check whether other desired properties hold.
|
||||
- Alloy is an example of a model checker.
|
||||
- TLA is another famous example.
|
||||
- Theorem proving is a heavier weight approach that involves building a formal proof of correctness.
|
||||
- Coq and Isabelle are examples of theorem provers.
|
||||
|
||||
|
||||
|
||||
**Reason**: I was in the middle of developing compiler code. I wanted to sketch
|
||||
the assumptions I was making and see if they held up.
|
||||
|
||||
---
|
||||
|
||||
# A Primer on Logic
|
||||
|
||||
Model checkers like Alloy are rooted in temporal logic, which builds on
|
||||
first-order logic. This includes:
|
||||
|
||||
- Variables (e.g. $x$, $y$)
|
||||
- These represent any objects in the logical system.
|
||||
- Predicates (e.g. $P(x)$, $Q(x,y)$)
|
||||
- These represent properties of objects, or relationships between objects.
|
||||
- Logical connectives (e.g. $\land$, $\lor$, $\neg$, $\Rightarrow$)
|
||||
- These are used to combine predicates into more complex statements.
|
||||
- $\land$ is "and", $\lor$ is "or", $\neg$ is "not", $\Rightarrow$ is "implies".
|
||||
- Quantifiers (e.g. $\forall$, $\exists$)
|
||||
- $\forall x. P(x)$ (in Alloy: `all x { P(x) }`) means "for all $x$, $P(x)$ is true".
|
||||
- $\exists x. P(x)$ (in Alloy: `some x { P(x)}`) means "there exists an $x$ such that $P(x)$ is true".
|
||||
|
||||
---
|
||||
|
||||
# A Primer on Logic (example)
|
||||
|
||||
Example statement: "Bob has a son who likes all compilers".
|
||||
|
||||
$$
|
||||
\exists x. (\text{Son}(x, \text{Bob}) \land \forall y. (\text{Compiler}(y) \Rightarrow \text{Likes}(x, y)))
|
||||
$$
|
||||
|
||||
In Alloy:
|
||||
|
||||
```alloy
|
||||
some x { Son[x, Bob] and all y { Compiler[y] implies Likes[x, y] } }
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
# A Primer on Temporal Logic
|
||||
|
||||
Temporal logic provides additional operators to reason about how properties change over time.
|
||||
|
||||
- $\Box p$ (in Alloy: `always p`): A statement that is always true.
|
||||
- $\Diamond p$ (in Alloy: `eventually p`) : A statement that will be true at some point in the future.
|
||||
|
||||
In Alloy specifically, we can mention the next state of a variable using `'`.
|
||||
|
||||
```alloy
|
||||
// pseudocode:
|
||||
// the next future value of previousFilter will be the intersection of filter
|
||||
// and the current value
|
||||
previousFilter' = filter & previousFilter;
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
# A Primer on Temporal Logic (example)
|
||||
|
||||
Some examples:
|
||||
|
||||
$$
|
||||
\Box(\text{like charges repel})
|
||||
$$
|
||||
$$
|
||||
\Diamond(\text{the sun is in the sky})
|
||||
$$
|
||||
|
||||
In Alloy:
|
||||
|
||||
```alloy
|
||||
// likeChargesRepel and theSunIsInTheSky are predicates defined elsewhere
|
||||
|
||||
always likeChargesRepel
|
||||
|
||||
// good thing it's not `always`
|
||||
eventually theSunIsInTheSky
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
# Search Configuration in Alloy
|
||||
|
||||
Instead of duplcating `METHOD` and `NOT_METHOD`, use two sets of flags (the regular and the "not").
|
||||
|
||||
```alloy
|
||||
enum Flag {Method, MethodOrField, Public}
|
||||
|
||||
sig Bitfield {
|
||||
, positiveFlags: set Flag
|
||||
, negativeFlags: set Flag
|
||||
}
|
||||
|
||||
sig FilterState { , curFilter: Bitfield }
|
||||
|
||||
```
|
||||
|
||||
|
||||
__Takeaway__: We represent the search flags as a `Bitfield`, which encodes `PUBLIC`, `NOT_PUBLIC`, etc.
|
||||
|
||||
---
|
||||
|
||||
# Constructing Bitfields
|
||||
|
||||
Alloy doesn't allow us to define functions that somehow combine bitfields. We might want to write:
|
||||
|
||||
```chapel
|
||||
proc addFlag(b: Bitfield, flag: Flag): Bitfield {
|
||||
return Bitfield(b.positiveFlags + flag, b.negativeFlags);
|
||||
}
|
||||
```
|
||||
|
||||
Instead, we can _relate_ two bitfields using a predicate.
|
||||
|
||||
> This bitfield is like that bitfield, but with this flag added.
|
||||
|
||||
```chapel
|
||||
pred addBitfieldFlag[b1: Bitfield, b2: Bitfield, flag: Flag] {
|
||||
b2.positiveFlags = b1.positiveFlags + flag
|
||||
b2.negativeFlags = b1.negativeFlags
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
# Constructing Bitfields
|
||||
|
||||
Alloy doesn't allow us to define functions that somehow combine bitfields. We might want to write:
|
||||
|
||||
```chapel
|
||||
proc addFlag(b: Bitfield, flag: Flag): Bitfield {
|
||||
return Bitfield(b.positiveFlags + flag, b.negativeFlags);
|
||||
}
|
||||
```
|
||||
|
||||
Instead, we can _relate_ two bitfields using a predicate.
|
||||
|
||||
> This bitfield is exactly like that bitfield.
|
||||
|
||||
```chapel
|
||||
pred bitfieldEqual[b1: Bitfield, b2: Bitfield] {
|
||||
b1.positiveFlags = b2.positiveFlags and b1.negativeFlags = b2.negativeFlags
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
# Modeling Possible Searches
|
||||
|
||||
Alloy isn't an imperative language. We can't mutate variables like we do in C++. Instead, we model how each statement changes the state, by relating the "current" state to the "next" state.
|
||||
|
||||
<div class="side-by-side">
|
||||
<div>
|
||||
|
||||
```C++
|
||||
filter |= IdAndFlags::PUBLIC;
|
||||
```
|
||||
|
||||
</div>
|
||||
<div>
|
||||
|
||||
```alloy
|
||||
addBitfieldFlag[filterNow, filterNext, Public]
|
||||
```
|
||||
|
||||
</div>
|
||||
</div>
|
||||
|
||||
This might remind you of [Hoare Logic](https://en.wikipedia.org/wiki/Hoare_logic), where statements like:
|
||||
|
||||
$$
|
||||
\{ P \} \; s \; \{ Q \}
|
||||
$$
|
||||
|
||||
Read as:
|
||||
|
||||
> If $P$ is true before executing $s$, then $Q$ will be true after executing $s$.
|
||||
|
||||
$$
|
||||
\{ \text{filter} = \text{filterNow} \} \; \texttt{filter |= PUBLIC} \; \{ \text{filter} = \text{filterNext} \}
|
||||
$$
|
||||
|
||||
---
|
||||
|
||||
# Modeling Possible Searches
|
||||
|
||||
To combine several statements, we make it so that the "next" state of one statement is the "current" state of the next statement.
|
||||
|
||||
<div class="side-by-side">
|
||||
<div>
|
||||
|
||||
```C++
|
||||
curFilter |= IdAndFlags::PUBLIC;
|
||||
curFilter |= IdAndFlags::METHOD_FIELD;
|
||||
```
|
||||
|
||||
</div>
|
||||
<div>
|
||||
|
||||
```alloy
|
||||
addBitfieldFlag[filterNow, filterNext1, Public]
|
||||
addBitfieldFlag[filterNext1, filterNext2, Method]
|
||||
```
|
||||
|
||||
</div>
|
||||
</div>
|
||||
|
||||
This is reminiscent of sequencing hoare triples:
|
||||
|
||||
$$
|
||||
\{ P \} \; s_1 \; \{ Q \} \; s_2 \; \{ R \}
|
||||
$$
|
||||
|
||||
---
|
||||
|
||||
# Modeling Possible Searches
|
||||
|
||||
Finally, if C++ code has conditionals, we need to allow for the possibility of either branch being taken. We do this by using "or" on descriptions of the next state.
|
||||
|
||||
<div class="side-by-side">
|
||||
<div>
|
||||
|
||||
```C++
|
||||
if (skipPrivateVisibilities) {
|
||||
curFilter |= IdAndFlags::PUBLIC;
|
||||
}
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
if (onlyMethodsFields) {
|
||||
curFilter |= IdAndFlags::METHOD_FIELD;
|
||||
} else if (!includeMethods && receiverScopes.empty()) {
|
||||
curFilter |= IdAndFlags::NOT_METHOD;
|
||||
}
|
||||
|
||||
|
||||
|
||||
```
|
||||
</div>
|
||||
<div>
|
||||
|
||||
```alloy
|
||||
addBitfieldFlag[initialState.curFilter, bitfieldMiddle, Public] or
|
||||
bitfieldEqual[initialState.curFilter, bitfieldMiddle]
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
// If it's a method receiver, add method or field restriction
|
||||
addBitfieldFlag[bitfieldMiddle, filterState.curFilter, MethodOrField] or
|
||||
|
||||
// if it's not a receiver, filter to non-methods (could be overridden)
|
||||
addBitfieldFlagNeg[bitfieldMiddle, filterState.curFilter, Method] or
|
||||
|
||||
// Maybe methods are not being curFilterd but it's not a receiver, so no change.
|
||||
bitfieldEqual[bitfieldMiddle, filterState.curFilter]
|
||||
```
|
||||
</div>
|
||||
</div>
|
||||
|
||||
Putting this into a predicate, `possibleState`, we encode what searches the compiler can undertake.
|
||||
|
||||
**Takeaway**: We encoded the logic that configures possible searches in Alloy. This instructs the analyzer about possible cases to consider.
|
||||
|
||||
---
|
||||
|
||||
# Modeling `previousFilter`
|
||||
|
||||
So far, all we've done is encoded what queries the compiler might make about a scope.
|
||||
|
||||
We still need to encode how we save the flags we've already searched with.
|
||||
|
||||
Model the search state with a "global" (really, unique) variable:
|
||||
|
||||
```alloy
|
||||
/* Initially, no search has happeneed for a scope, so its 'previousFilter' is not set to anything. */
|
||||
one sig NotSet {}
|
||||
|
||||
one sig SearchState {
|
||||
, var previousFilter: Bitfield + NotSet
|
||||
}
|
||||
```
|
||||
|
||||
Above, `+` is used for union. `previousFilter` can either be a `Bitfield` or `NotSet`.
|
||||
|
||||
---
|
||||
|
||||
# Modeling `previousFilter`
|
||||
|
||||
If no previous search has happened, we set `previousFilter` to the current `filter`.
|
||||
|
||||
Otherwise, we set `previousFilter` to the intersection of `filter` and `previousFilter`, as mentioned before.
|
||||
|
||||
<div class="side-by-side">
|
||||
<div>
|
||||
|
||||
```C++
|
||||
if (hasPrevious) {
|
||||
previousFilter = filter & previousFilter;
|
||||
} else {
|
||||
previousFilter = filter;
|
||||
}
|
||||
```
|
||||
|
||||
</div>
|
||||
<div>
|
||||
|
||||
```alloy
|
||||
pred update[toSet: Bitfield + NotSet, setTo: FilterState] {
|
||||
toSet' != NotSet and bitfieldIntersection[toSet, setTo.include, toSet']
|
||||
}
|
||||
|
||||
pred updateOrSet[toSet: Bitfield + NotSet, setTo: FilterState] {
|
||||
(toSet = NotSet and toSet' = setTo.include) or
|
||||
(toSet != NotSet and update[toSet, setTo])
|
||||
}
|
||||
```
|
||||
|
||||
</div>
|
||||
</div>
|
||||
|
||||
---
|
||||
|
||||
# Putting it Together
|
||||
|
||||
We now have a model of what our C++ program is doing: it computes some set of filter flags, then runs a search, excluding the previous flags. It then updates the previous flags with the current search. We can encode
|
||||
this as follows:
|
||||
|
||||
```
|
||||
fact step {
|
||||
always {
|
||||
// Model that a new doLookupInScope could've occurred, with any combination of flags.
|
||||
all searchState: SearchState {
|
||||
some fs: FilterState {
|
||||
// This is a possible combination of lookup flags
|
||||
possibleState[fs]
|
||||
|
||||
// If a search has been performed before, take the intersection; otherwise,
|
||||
// just insert the current filter flags.
|
||||
updateOrSet[searchState.previousFilter, fs]
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
# Are there any bugs?
|
||||
|
||||
Model checkers ensure that all properties we want to hold, do hold. To find a counter example, we ask it to prove the negation of what we want.
|
||||
|
||||
```C
|
||||
wontFindNeeded: run {
|
||||
all searchState: SearchState {
|
||||
eventually some props: Props, fs: FilterState, fsBroken: FilterState {
|
||||
// Some search (fs) will cause a transition / modification of the search state...
|
||||
configureState[fs]
|
||||
updateOrSet[searchState.previousFilter, fs]
|
||||
// Such that a later, valid search... (fsBroken)
|
||||
configureState[fsBroken]
|
||||
|
||||
// Will allow for a set of properties...
|
||||
// ... that are left out of the original search...
|
||||
not bitfieldMatchesProperties[searchState.previousFilter, props]
|
||||
// ... and out of the current search
|
||||
not (bitfieldMatchesProperties[fs.include, props] and not bitfieldMatchesProperties[searchState.previousFilter, props])
|
||||
// But would be matched by the broken search...
|
||||
bitfieldMatchesProperties[fsBroken.include, props]
|
||||
// ... to not be matched by a search with the new state:
|
||||
not (bitfieldMatchesProperties[fsBroken.include, props] and not bitfieldOrNotSetMatchesProperties[searchState.previousFilter', props])
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
# Uh-Oh!
|
||||
|
||||

|
||||
|
||||
---
|
||||
# The Bug
|
||||
|
||||

|
||||
|
||||
We need some gymnastics to figure out what varibles make this model possible.
|
||||
|
||||
Alloy has a nice visualizer, but it has a lot of information.
|
||||
|
||||
In the interest of time, I found:
|
||||
|
||||
* If the compiler searches a scope firt for `PUBLIC` symbols, ...
|
||||
* ...then for `METHOD_OR_FIELD`, ...
|
||||
* ...then for any symbols, they will miss things!
|
||||
|
||||
---
|
||||
<style scoped>
|
||||
section {
|
||||
font-size: 20px;
|
||||
}
|
||||
</style>
|
||||
|
||||
# The Reproducer
|
||||
To trigger this sequence of searches, we needed a lot more gymnastics.
|
||||
|
||||
<div class="side-by-side">
|
||||
<div>
|
||||
|
||||
```chapel
|
||||
module TopLevel {
|
||||
module XContainerUser {
|
||||
public use TopLevel.XContainer;
|
||||
}
|
||||
module XContainer {
|
||||
private var x: int;
|
||||
record R {}
|
||||
module MethodHaver {
|
||||
use TopLevel.XContainerUser;
|
||||
use TopLevel.XContainer;
|
||||
proc R.foo() {
|
||||
var y = x;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
</div>
|
||||
<div>
|
||||
|
||||
* the scope of `R` is searched with for methods
|
||||
* The scope of `R`’s parent (`XContainer`) is searched for methods
|
||||
* The scope of `XContainerUser` is searched for public symbols (via the `use`)
|
||||
* The scope of `XContainer` is searched with public symbols (via the `public use`)
|
||||
* The scope of `XContainer` searched for with no filters via the second use; but the stored filter is bad, so the lookup returns early, not finding `x`.
|
||||
|
||||
</div>
|
||||
</div>
|
32
type-level/abstract.txt
Normal file
32
type-level/abstract.txt
Normal file
@ -0,0 +1,32 @@
|
||||
Chapel’s type system can be surprisingly powerful. In addition to “usual”
|
||||
features such as generics and polymorphism, Chapel provides the ability to
|
||||
manipulate types using functions; this involves both taking types as arguments
|
||||
to functions and returning them from these functions. This can enable powerful
|
||||
programming techniques that are typically confined to the domain of
|
||||
metaprogramming.
|
||||
|
||||
For example, although Chapel’s notion of compile-time values — ‘param’s — is
|
||||
limited to primitive types such as integers, booleans, and strings, one can
|
||||
encode compile-time lists of these values as types. Such encodings can be used
|
||||
to create compile-time specializations of functions that would be otherwise
|
||||
tedious to write by hand. One use case for such specializations is the
|
||||
implementation of a family of functions for approximating differential
|
||||
equations, the Adams-Bashforth methods. Some variants of these methods can be
|
||||
encoded as lists of coefficients. Thus, it becomes possible to define a single
|
||||
function that accepts a type-level list of coefficients and produces a
|
||||
“stamped out” implementation of the corresponding method. This reduces the
|
||||
need to implement each method explicitly by hand. Another use case of function
|
||||
specialization is a type-safe ‘printf’ function that validates that users’
|
||||
format specifiers match the type of arguments to the function.
|
||||
|
||||
More generally, Chapel’s types can be used to encode algebraic sums (disjoint
|
||||
unions) and products (Cartesian) of types. This, in turn, makes it possible to
|
||||
build arbitrary data structures at the type level. The lists-of-values case
|
||||
above is an instance of this general principle. Functions can be defined on
|
||||
type-level data structures by relying on overloading and type arguments.
|
||||
Programming in this manner starts to resemble programming in a purely
|
||||
functional language such as Haskell.
|
||||
|
||||
Though this style of programming has not seen much use thus far, it can be a
|
||||
powerful technique for controlling types of arguments or constructing highly
|
||||
customized functions with no runtime overhead.
|
979
type-level/slides.md
Normal file
979
type-level/slides.md
Normal file
@ -0,0 +1,979 @@
|
||||
---
|
||||
theme: gaia
|
||||
title: Type-Level Programming in Chapel for Compile-Time Specialization
|
||||
backgroundColor: #fff
|
||||
---
|
||||
<!-- _class: lead -->
|
||||
<style>
|
||||
section {
|
||||
font-size: 25px;
|
||||
}
|
||||
blockquote {
|
||||
padding: 15px;
|
||||
background-color: #f0f0ff;
|
||||
}
|
||||
|
||||
div.side-by-side {
|
||||
display: flex;
|
||||
justify-content: space-between;
|
||||
gap: 20px;
|
||||
}
|
||||
|
||||
div.side-by-side > div {
|
||||
flex: 1;
|
||||
}
|
||||
</style>
|
||||
|
||||
# **Type-Level Programming in Chapel for Compile-Time Specialization**
|
||||
Daniel Fedorin, HPE
|
||||
|
||||
---
|
||||
|
||||
# Compile-Time Programming in Chapel
|
||||
|
||||
* **Type variables**, as their name suggests, store types instead of values.
|
||||
|
||||
```Chapel
|
||||
type myArgs = (int, real);
|
||||
```
|
||||
|
||||
* **Procedures with `type` return intent** can construct new types.
|
||||
|
||||
```Chapel
|
||||
proc toNilableIfClassType(type arg) type do
|
||||
if isNonNilableClassType(arg) then return arg?; else return arg;
|
||||
```
|
||||
* **`param` variables** store values that are known at compile-time.
|
||||
|
||||
```Chapel
|
||||
param numberOfElements = 3;
|
||||
var threeInts: numberOfElements * int;
|
||||
```
|
||||
* **Compile-time conditionals** are inlined at compile-time.
|
||||
|
||||
```Chapel
|
||||
if false then somethingThatWontCompile();
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
# Restrictions on Compile-Time Programming
|
||||
|
||||
* Compile-time operations do not have mutable state.
|
||||
- Cannot change values of `param` or `type` variables.
|
||||
* Chapel's compile-time programming does not support loops.
|
||||
- `param` loops are kind of an aception, but are simply unrolled.
|
||||
- Without mutability, this unrolling doesn't give us much.
|
||||
* Without state, our `type` and `param` functions are pure.
|
||||
|
||||
---
|
||||
|
||||
# Did someone say "pure"?
|
||||
|
||||
I can think of another language that has pure functions...
|
||||
|
||||
* :white_check_mark: Haskell doesn't have mutable state by default.
|
||||
* :white_check_mark: Haskell doesn't have imperative loops.
|
||||
* :white_check_mark: Haskell functions are pure.
|
||||
|
||||
---
|
||||
|
||||
# Programming in Haskell
|
||||
|
||||
Without mutability and loops, Haskell programmers use pattern-matching and recursion to express their algorithms.
|
||||
|
||||
* Data structures are defined by enumerating their possible cases. A list is either empty, or a head element followed by a tail list.
|
||||
|
||||
```Haskell
|
||||
data ListOfInts = Nil | Cons Int ListOfInts
|
||||
|
||||
-- [] = Nil
|
||||
-- [1] = Cons 1 Nil
|
||||
-- [1,2,3] = Cons 1 (Cons 2 (Cons 3 Nil))
|
||||
```
|
||||
* Pattern-matching is used to examine the cases of a data structure and act accordingly.
|
||||
|
||||
```Haskell
|
||||
sum :: ListOfInts -> Int
|
||||
sum Nil = 0
|
||||
sum (Cons i tail) = i + sum tail
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
# Evaluating Haskell
|
||||
|
||||
Haskell simplifies calls to functions by pickign the case based on the arguments.
|
||||
|
||||
```Haskell
|
||||
sum (Cons 1 (Cons 2 (Cons 3 Nil)))
|
||||
|
||||
-- case: sum (Cons i tail) = i + sum tail
|
||||
= 1 + sum (Cons 2 (Cons 3 Nil))
|
||||
|
||||
-- case: sum (Cons i tail) = i + sum tail
|
||||
= 1 + (2 + sum (Cons 3 Nil))
|
||||
|
||||
-- case: sum (Cons i tail) = i + sum tail
|
||||
= 1 + (2 + (3 + sum Nil))
|
||||
|
||||
-- case: sum Nil = 0
|
||||
= 1 + (2 + (3 + 0))
|
||||
|
||||
= 6
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
# A Familiar Pattern
|
||||
|
||||
Picking a case based on the arguments is very similar to Chapel's function overloading.
|
||||
|
||||
* **A very familiar example**:
|
||||
```Chapel
|
||||
proc foo(x: int) { writeln("int"); }
|
||||
proc foo(x: real) { writeln("real"); }
|
||||
foo(1); // prints "int"
|
||||
```
|
||||
* **A slightly less familiar example**:
|
||||
```Chapel
|
||||
proc foo(type x: int) { compilerWarning("int"); }
|
||||
proc foo(type x: real) { compilerWarning("real"); }
|
||||
foo(int); // compiler prints "int"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
# A Type-Level List
|
||||
|
||||
Hypothesis: we can use Chapel's function overloading and types to write functional-ish programs.
|
||||
|
||||
<div class="side-by-side">
|
||||
<div>
|
||||
|
||||
```Chapel
|
||||
record Nil {}
|
||||
record Cons { param head: int; type tail; }
|
||||
|
||||
|
||||
type myList = Cons(1, Cons(2, Cons(3, Nil)));
|
||||
|
||||
|
||||
|
||||
proc sum(type x: Nil) param do return 0;
|
||||
proc sum(type x: Cons(?i, ?tail)) param do return i + sum(tail);
|
||||
|
||||
compilerWarning(sum(myList) : string); // compiler prints 6
|
||||
```
|
||||
</div>
|
||||
<div>
|
||||
|
||||
```Haskell
|
||||
data ListOfInts = Nil
|
||||
| Cons Int ListOfInts
|
||||
|
||||
|
||||
myList = Cons 1 (Cons 2 (Cons 3 Nil))
|
||||
|
||||
|
||||
sum :: ListOfInts -> Int
|
||||
sum Nil = 0
|
||||
sum (Cons i tail) = i + sum tail
|
||||
|
||||
|
||||
```
|
||||
</div>
|
||||
</div>
|
||||
|
||||
---
|
||||
|
||||
# Type-Level Programming at Compile-Time
|
||||
|
||||
After resolution, our original program:
|
||||
|
||||
```Chapel
|
||||
record Nil {}
|
||||
record Cons { param head: int; type tail; }
|
||||
|
||||
type myList = Cons(1, Cons(2, Cons(3, Nil)));
|
||||
|
||||
proc sum(type x: Nil) param do return 0;
|
||||
proc sum(type x: Cons(?i, ?tail)) param do return i + sum(tail);
|
||||
|
||||
writeln(sum(myList) : string); // compiler prints 6
|
||||
```
|
||||
|
||||
Becomes:
|
||||
|
||||
```Chapel
|
||||
writeln("6");
|
||||
```
|
||||
|
||||
There is no runtime overhead!
|
||||
|
||||
---
|
||||
|
||||
# Type-Level Programming at Compile-Time
|
||||
|
||||
> Why would I want to do this?!
|
||||
>
|
||||
> \- You, probably
|
||||
|
||||
* Do you want to write parameterized code, without paying runtime overhead for the runtime parameters?
|
||||
- **Worked example**: linear multi-step method approximator
|
||||
* Do you want to have powerful compile-time checks and constraints on your function types?
|
||||
- **Worked example**: type-safe `printf` function
|
||||
|
||||
---
|
||||
|
||||
<!-- _class: lead -->
|
||||
# Linear Multi-Step Method Approximator
|
||||
|
||||
---
|
||||
|
||||
# Euler's Method
|
||||
|
||||
A first-order differential equation can be written in the following form:
|
||||
|
||||
$$
|
||||
y' = f(t, y)
|
||||
$$
|
||||
|
||||
In other words, the derivative of of $y$ depends on $t$ and $y$ itself. There is no solution to this equation in general; we have to approximate.
|
||||
|
||||
If we know an initial point $(t_0, y_0)$, we can approximate other points. To get the point at $t_1 = t_0 + h$, we can use the formula:
|
||||
|
||||
$$
|
||||
\begin{align*}
|
||||
y'(t_0) & = f(t_0, y_0) \\
|
||||
y(t_0+h) & \approx y_0 + h \times y'(t_0) \\
|
||||
& \approx y_0 + h \times f(t_0, y_0)
|
||||
\end{align*}
|
||||
$$
|
||||
|
||||
We can name the the first approximated $y$-value $y_1$, and set it:
|
||||
|
||||
$$
|
||||
y_1 = y_0 + h \times f(t_0, y_0)
|
||||
$$
|
||||
|
||||
---
|
||||
|
||||
# Euler's Method
|
||||
|
||||
On the previous slide, we got a new point $(t_1, y_1)$. We can repeat the process to get $y_2$:
|
||||
|
||||
$$
|
||||
\begin{array}{c}
|
||||
y_2 = y_1 + h \times f(t_1, y_1) \\
|
||||
y_3 = y_2 + h \times f(t_2, y_2) \\
|
||||
y_4 = y_3 + h \times f(t_3, y_3) \\
|
||||
\cdots \\
|
||||
y_{n+1} = y_n + h \times f(t_n, y_n) \\
|
||||
\end{array}
|
||||
$$
|
||||
|
||||
---
|
||||
|
||||
# Euler's Method in Chapel
|
||||
|
||||
This can be captured in a simple Chapel procedure:
|
||||
|
||||
```Chapel
|
||||
proc runEulerMethod(step: real, count: int, t0: real, y0: real) {
|
||||
var y = y0;
|
||||
var t = t0;
|
||||
for i in 1..count {
|
||||
y += step*f(t,y);
|
||||
t += step;
|
||||
}
|
||||
return y;
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
# Other Methods
|
||||
|
||||
* In Euler's method, we look at the slope of a function at a particular point, and use it to extrapolate the next point.
|
||||
* Once we've computed a few points, we have more information we can incorporate.
|
||||
- When computing $y_2$, we can use both $y_0$ and $y_1$.
|
||||
- To get a good approximation, we have to weight the points differently.
|
||||
$$
|
||||
y_{n+2} = y_{n+1} + h \left(\frac{3}{2}f(t_{n+1}, y_{n+1}) - \frac{1}{2}f(t_{n}, y_{n})\right)
|
||||
$$
|
||||
|
||||
- More points means better accuracy, but more computation.
|
||||
|
||||
* There are other methods that use more points and different weights.
|
||||
- Another method is as follows:
|
||||
$$
|
||||
y_{n+3} = y_{n+2} + h \left(\frac{23}{12}f(t_{n+2}, y_{n+2}) - \frac{16}{12}f(t_{n+1}, y_{n+1}) + \frac{5}{12}f(t_{n}, y_{n})\right)
|
||||
$$
|
||||
|
||||
---
|
||||
|
||||
# Generalizing Multi-Step Methods
|
||||
|
||||
Explicit Adams-Bashforth methods in general can be encoded as the coefficients used to weight the previous points.
|
||||
|
||||
| Method | Equation | Coefficient List
|
||||
|---------------------------|--------------------------------------|-----------------
|
||||
| Euler's method | $y_{n+1} = y_n + h \times f(t_n, y_n)$ | $1$
|
||||
| Two-step A.B. | $y_{n+2} = y_{n+1} + h \left(\frac{3}{2}f(t_{n+1}, y_{n+1}) - \frac{1}{2}f(t_{n}, y_{n})\right)$ | $\frac{3}{2},-\frac{1}{2}$
|
||||
|
||||
---
|
||||
|
||||
# Generalizing Multi-Step Methods
|
||||
|
||||
Explicit Adams-Bashforth methods in general can be encoded as the coefficients used to weight the previous points.
|
||||
|
||||
| Method | Equation | Chapel Type Expression
|
||||
|---------------------------|--------------------------------------|-----------------
|
||||
| Euler's method | $y_{n+1} = y_n + h \times f(t_n, y_n)$ | `Cons(1,Nil)`
|
||||
| Two-step A.B. | $y_{n+2} = y_{n+1} + h \left(\frac{3}{2}f(t_{n+1}, y_{n+1}) - \frac{1}{2}f(t_{n}, y_{n})\right)$ | `Cons(3/2,Cons(-1/2, Nil))`
|
||||
|
||||
---
|
||||
|
||||
# Supporting Functions for Coefficient Lists
|
||||
|
||||
```Chapel
|
||||
proc length(type x: Cons(?w, ?t)) param do return 1 + length(t);
|
||||
proc length(type x: Nil) param do return 0;
|
||||
|
||||
proc coeff(param x: int, type lst: Cons(?w, ?t)) param where x == 0 do return w;
|
||||
proc coeff(param x: int, type lst: Cons(?w, ?t)) param where x > 0 do return coeff(x-1, t);
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
# A General Solver
|
||||
|
||||
```Chapel
|
||||
proc runMethod(type method, h: real, count: int, start: real,
|
||||
in ys: real ... length(method)): real {
|
||||
```
|
||||
|
||||
* `type method` accepts a type-level list of coefficients.
|
||||
* `h` encodes the step size.
|
||||
* `start` is $t_0$, the initial time.
|
||||
* `count` is the number of steps to take.
|
||||
* `in ys` makes the function accept as many `real` values (for $y_0, y_1, \ldots$) as there are weights
|
||||
|
||||
---
|
||||
|
||||
# A General Solver
|
||||
|
||||
```Chapel
|
||||
param coeffCount = length(method);
|
||||
// Repeat the methods as many times as requested
|
||||
for i in 1..count {
|
||||
// We're computing by adding h*b_j*f(...) to y_n.
|
||||
// Set total to y_n.
|
||||
var total = ys(coeffCount - 1);
|
||||
// 'for param' loops are unrolled at compile-time -- this is just
|
||||
// like writing out each iteration by hand.
|
||||
for param j in 1..coeffCount do
|
||||
// For each coefficient b_j given by coeff(j, method),
|
||||
// increment the total by h*bj*f(...)
|
||||
total += step * coeff(j, method) *
|
||||
f(start + step*(i-1+coeffCount-j), ys(coeffCount-j));
|
||||
// Shift each y_i over by one, and set y_{n+s} to the
|
||||
// newly computed total.
|
||||
for param j in 0..< coeffCount - 1 do
|
||||
ys(j) = ys(j+1);
|
||||
ys(coeffCount - 1) = total;
|
||||
}
|
||||
// return final y_{n+s}
|
||||
return ys(coeffCount - 1);
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
# Using the General Solver
|
||||
|
||||
|
||||
```Chapel
|
||||
type euler = cons(1.0, empty);
|
||||
type adamsBashforth = cons(3.0/2.0, cons(-0.5, empty));
|
||||
type someThirdMethod = cons(23.0/12.0, cons(-16.0/12.0, cons(5.0/12.0, empty)));
|
||||
```
|
||||
|
||||
Take a simple differential equation $y' = y$. For this, define `f` as follows:
|
||||
|
||||
```Chapel
|
||||
proc f(t: real, y: real) do return y;
|
||||
```
|
||||
|
||||
Now, we can run Euler's method like so:
|
||||
|
||||
```Chapel
|
||||
writeln(runMethod(euler, step=0.5, count=4, start=0, 1)); // 5.0625
|
||||
```
|
||||
|
||||
To run the 2-step Adams-Bashforth method, we need two initial values:
|
||||
|
||||
```Chapel
|
||||
var y0 = 1.0;
|
||||
var y1 = runMethod(euler, step=0.5, count=1, start=0, 1);
|
||||
writeln(runMethod(adamsBashforth, step=0.5, count=3, start=0.5, y0, y1)); // 6.02344
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
# The General Solver
|
||||
|
||||
We can now construct solvers for any explicit Adams-Bashforth method, without writing any new code.
|
||||
|
||||
---
|
||||
|
||||
<!-- _class: lead -->
|
||||
|
||||
# Type-Safe `printf`
|
||||
|
||||
---
|
||||
|
||||
# The `printf` Function
|
||||
|
||||
The `printf` function accepts a format string, followed by a variable number of arguments that should match:
|
||||
|
||||
```C
|
||||
// totally fine:
|
||||
printf("Hello, %s! Your ChapelCon submission is #%d\n", "Daniel", 18);
|
||||
|
||||
// not good:
|
||||
printf("Hello, %s! Your ChapelCon submission is #%d\n", 18, "Daniel");
|
||||
```
|
||||
|
||||
Can we define a `printf` function in Chapel that is type-safe?
|
||||
|
||||
---
|
||||
|
||||
# Yet Another Type-Level List
|
||||
|
||||
- The general idea for type-safe `printf`: take the format string, and extract a list of the expected argument types.
|
||||
|
||||
- To make for nicer error messages, include a human-readable description of each type in the list.
|
||||
|
||||
- I've found it more convenient to re-define lists for various problems when needed, rather than having a single canonical list definition.
|
||||
|
||||
```chapel
|
||||
record _nil {
|
||||
proc type length param do return 0;
|
||||
}
|
||||
record _cons {
|
||||
type expectedType; // type of the argument to printf
|
||||
param name: string; // human-readable name of the type
|
||||
type rest;
|
||||
|
||||
proc type length param do return 1 + rest.length();
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
# Extracting Types from Format Strings
|
||||
|
||||
```Chapel
|
||||
proc specifiers(param s: string, param i: int) type {
|
||||
if i >= s.size then return _nil;
|
||||
|
||||
if s[i] == "%" {
|
||||
if i + 1 >= s.size then
|
||||
compilerError("Invalid format string: unterminted %");
|
||||
|
||||
select s[i + 1] {
|
||||
when "%" do return specifiers(s, i + 2);
|
||||
when "s" do return _cons(string, "a string", specifiers(s, i + 2));
|
||||
when "i" do return _cons(int, "a signed integer", specifiers(s, i + 2));
|
||||
when "u" do return _cons(uint, "an unsigned integer", specifiers(s, i + 2));
|
||||
when "n" do return _cons(numeric, "a numeric value", specifiers(s, i + 2));
|
||||
otherwise do compilerError("Invalid format string: unknown format type");
|
||||
}
|
||||
} else {
|
||||
return specifiers(s, i + 1);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
# Extracting Types from Format Strings
|
||||
|
||||
Let's give it a quick try:
|
||||
|
||||
```Chapel
|
||||
writeln(specifiers("Hello, %s! Your ChapelCon submission is #%i\n", 0) : string);
|
||||
```
|
||||
|
||||
The above prints:
|
||||
|
||||
```Chapel
|
||||
_cons(string,"a string",_cons(int(64),"a signed integer",_nil))
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
# Validating Argument Types
|
||||
|
||||
* The Chapel standard library has a nice `isSubtype` function that we can use to check if an argument matches the expected type.
|
||||
|
||||
* Suppose the `.length` of our type specifiers matches the number of arguments to `printf`
|
||||
|
||||
* Chapel doesn't currently support empty tuples, so if the lengths match, we know that `specifiers` is non-empty.
|
||||
|
||||
* Then, we can validate the types as follows:
|
||||
```Chapel
|
||||
proc validate(type specifiers: _cons(?t, ?s, ?rest), type argTup, param idx) {
|
||||
if !isSubtype(argTup[idx], t) then
|
||||
compilerError("Argument " + (idx + 1) : string + " should be " + s + " but got " + argTup[idx]:string, idx+2);
|
||||
|
||||
if idx + 1 < argTup.size then
|
||||
validate(rest, argTup, idx + 1);
|
||||
}
|
||||
```
|
||||
|
||||
* The `idx+2` argument to `compilerError` avoids printing the recursive `validate` calls in the error message.
|
||||
|
||||
---
|
||||
|
||||
# The `fprintln` overloads
|
||||
|
||||
* I named it `fprintln` for "formatted print line".
|
||||
|
||||
* To support the empty-specifier case (Chapel varargs don't allow zero arguments):
|
||||
|
||||
```Chapel
|
||||
proc fprintln(param format: string) where specifiers(format, 0).length == 0 {
|
||||
writeln(format);
|
||||
}
|
||||
```
|
||||
* If we do have type specifiers, to ensure our earlier assumption of `size` matching:
|
||||
```Chapel
|
||||
proc fprintln(param format: string, args...)
|
||||
where specifiers(format, 0).length != args.size {
|
||||
compilerError("'fprintln' with this format string expects " +
|
||||
specifiers(format, 0).length : string +
|
||||
" argument(s) but got " + args.size : string);
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
# The `fprintln` overloads
|
||||
|
||||
* All that's left is the main `fprintln` implementation:
|
||||
|
||||
```Chapel
|
||||
proc fprintln(param format: string, args...) {
|
||||
validate(specifiers(format, 0), args.type, 0);
|
||||
|
||||
writef(format + "\n", (...args));
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
# Using `fprintln`
|
||||
|
||||
```Chapel
|
||||
fprintln("Hello, world!"); // fine, prints "Hello, world!"
|
||||
fprintln("The answer is %i", 42); // fine, prints "The answer is 42"
|
||||
|
||||
// compiler error: Argument 3 should be a string but got int(64)
|
||||
fprintln("The answer is %i %i %s", 1, 2, 3);
|
||||
```
|
||||
|
||||
More work could be done to support more format specifiers, escapes, etc., but the basic idea is there.
|
||||
|
||||
---
|
||||
|
||||
<!-- _class: lead -->
|
||||
|
||||
# Beyond Lists
|
||||
|
||||
---
|
||||
|
||||
# Beyond Lists
|
||||
|
||||
* I made grand claims earlier
|
||||
- "Write functional-ish program at the type level!"
|
||||
* So far, we've just used lists and some recursion.
|
||||
* Is that all there is?
|
||||
|
||||
---
|
||||
|
||||
# Algebraic Data Types
|
||||
|
||||
* The kinds of data types that Haskell supports are called *algebraic data types*.
|
||||
* At a fundamental level, they can be built up from two operations: _cartesian product_ and _disjoint union_.
|
||||
* There are other concepts to build recursive data types, but we won't need them in Chapel.
|
||||
- To prove to you I know what I'm talking about, some jargon:
|
||||
_initial algebras_, _the fixedpoint functor_, _catamorphisms_...
|
||||
- Check out _Bananas, Lenses, Envelopes and Barbed Wire_ by Meijer et al. for more.
|
||||
* __This matters because, if Chapel has these operations, we can build any data type that Haskell can.__
|
||||
|
||||
---
|
||||
|
||||
<style scoped>
|
||||
li:nth-child(3) { color: lightgrey; }
|
||||
</style>
|
||||
|
||||
# Algebraic Data Types
|
||||
|
||||
- The kinds of data types that Haskell supports are called *algebraic data types*.
|
||||
- At a fundamental level, they can be built up from two operations: _cartesian product_ and _disjoint union_.
|
||||
- There are other concepts to build recursive data types, but we won't need them in Chapel.
|
||||
- To prove to you I know what I'm talking about, some jargon:
|
||||
_initial algebras_, _the fixedpoint functor_, _catamorphisms_...
|
||||
- Check out _Bananas, Lenses, Envelopes and Barbed Wire_ by Meijer et al. for more.
|
||||
- __This matters because, if Chapel has these operations, we can build any data type that Haskell can.__
|
||||
|
||||
---
|
||||
|
||||
# Cartesian Product
|
||||
For any two types, the _cartesian product_ of these two types defines all pairs of values from these types.
|
||||
- This is like a two-element tuple _at the value level_ in Chapel.
|
||||
- We write this as $A \times B$ for two types $A$ and $B$.
|
||||
- In (type-level) Chapel and Haskell:
|
||||
<div class=side-by-side>
|
||||
<div>
|
||||
|
||||
```Chapel
|
||||
record Pair {
|
||||
type fst;
|
||||
type snd;
|
||||
}
|
||||
|
||||
type myPair = Pair(myVal1, myVal2);
|
||||
```
|
||||
</div>
|
||||
<div>
|
||||
|
||||
```Haskell
|
||||
data Pair = MkPair
|
||||
{ fst :: A
|
||||
, snd :: B
|
||||
}
|
||||
|
||||
myPair = MkPair myVal1 myVal2
|
||||
```
|
||||
</div>
|
||||
</div>
|
||||
|
||||
---
|
||||
|
||||
# Disjoint Union
|
||||
For any two types, the _disjoint union_ of these two types defines values that are either from one type or the other.
|
||||
- This is _almost_ like a `union` in Chapel or C...
|
||||
- But there's extra information to tell us which of the two types the value is from.
|
||||
- We write this as $A + B$ for two types $A$ and $B$.
|
||||
- In Chapel and Haskell:
|
||||
<div class=side-by-side>
|
||||
<div>
|
||||
|
||||
```Chapel
|
||||
|
||||
record InL { type value; }
|
||||
record InR { type value; }
|
||||
|
||||
type myFirstCase = InL(myVal1);
|
||||
type mySecondCase = InR(myVal2);
|
||||
```
|
||||
</div>
|
||||
<div>
|
||||
|
||||
```Haskell
|
||||
data Sum
|
||||
= InL A
|
||||
| InR B
|
||||
|
||||
myFirstCase = InL myVal1
|
||||
mySecondCase = InR myVal2
|
||||
```
|
||||
</div>
|
||||
</div>
|
||||
|
||||
---
|
||||
|
||||
# Algebraic Data Types
|
||||
|
||||
* We can build up more compelx types by combining these two operations.
|
||||
* Need a triple of types $A$, $B$, and $C$? Use $A \times (B \times C)$.
|
||||
* Similarly, "any one of three types" can be expressed as $A + (B + C)$.
|
||||
* A `Result<T>` type (in Rust, or `optional<T>` in C++) is $T + \text{Unit}$.
|
||||
* `Unit` is a type with a single value (there's only one `None` / `std::nullopt`).
|
||||
|
||||
* Notice that in Chapel, we moved up one level
|
||||
| Thing | Chapel | Haskell |
|
||||
|-------|------------------|-------------------|
|
||||
| `Nil` | type | value |
|
||||
| `Cons`| type constructor | value constructor |
|
||||
| List | **???** | type |
|
||||
|
||||
---
|
||||
|
||||
# Algebraic Data Types
|
||||
|
||||
* Since Chapel has no notion of a type-of-types, we can't enforce that our values are _only_ `InL` or `InR` (in the case of `Sum`).
|
||||
* This is why, in Chapel versions, type annotations like `A` and `B` are missing.
|
||||
<div class=side-by-side>
|
||||
<div>
|
||||
|
||||
```Chapel
|
||||
record Pair {
|
||||
type fst; /* : A */
|
||||
type snd; /* : B */
|
||||
}
|
||||
```
|
||||
</div>
|
||||
<div>
|
||||
|
||||
```Haskell
|
||||
data Pair = MkPair
|
||||
{ fst :: A
|
||||
, snd :: B
|
||||
}
|
||||
```
|
||||
</div>
|
||||
</div>
|
||||
* So, we can't enforce that the user doesn't pass `int` to our `length` function defined on lists.
|
||||
* We also can't enforce that `InL` is instantiated with the right type.
|
||||
* So, we lose some safety compare to Haskell...
|
||||
* ...but we're getting the compiler to do arbitrary computations for us at compile-time.
|
||||
|
||||
---
|
||||
|
||||
# Worked Example: Binary Search Tree
|
||||
|
||||
In Haskell, binary search trees can be defined as follows:
|
||||
|
||||
```Haskell
|
||||
data BSTree = Empty
|
||||
| Node Int BSTree BSTree
|
||||
|
||||
balancedOneTwoThree = Node 2 (Node 1 Empty Empty) (Node 3 Empty Empty)
|
||||
```
|
||||
|
||||
Written using Algebraic Data Types, this is:
|
||||
|
||||
$$
|
||||
\text{BSTree} = \text{Unit} + (\text{Int} \times \text{BSTree} \times \text{BSTree})
|
||||
$$
|
||||
|
||||
In Haskell (using sums and products):
|
||||
|
||||
```Haskell
|
||||
type BSTree' = Unit `Sum` (Int `Pair` (BSTree' `Pair` BSTree'))
|
||||
|
||||
balancedOneTwoThree' = InR (2 `MkPair` (InR (1 `MkPair` (InL MkUnit `MkPair` InL Unit)) `MkPair`
|
||||
InR (3 `MkPair` (InL MkUnit `MkPair` InL Unit))))
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
# Worked Example: Binary Search Tree
|
||||
|
||||
* Recalling the Haskell version:
|
||||
|
||||
```Haskell
|
||||
type BSTree' = Unit `Sum` (Int `Pair` (BSTree' `Pair` BSTree'))
|
||||
|
||||
balancedOneTwoThree' = InR (2 `MkPair` (InR (1 `MkPair` (InL MkUnit `MkPair` InL Unit)) `MkPair`
|
||||
InR (3 `MkPair` (InL MkUnit `MkPair` InL Unit))))
|
||||
```
|
||||
|
||||
* We can't define `BSTree'` in Chapel (no type-of-types), but we can define `balancedOneTwoThree'`:
|
||||
|
||||
```Chapel
|
||||
type balancedOneTwoThree =
|
||||
InR(Pair(2, Pair(InR(Pair(1, Pair(InL(), InL()))),
|
||||
InR(Pair(3, Pair(InL(), InL()))))));
|
||||
```
|
||||
|
||||
* :white_check_mark: We can use algebraic data types to build arbitrarily complex data structures ◼.
|
||||
|
||||
---
|
||||
|
||||
# Returning to Pragmatism
|
||||
|
||||
* We could've defined our list type in terms of `InL`, `InR`, and `Pair`.
|
||||
* However, it was cleaner to make it look more like the non-ADT Haskell version.
|
||||
* Recall that it looked like this:
|
||||
<div class="side-by-side">
|
||||
<div>
|
||||
|
||||
```Chapel
|
||||
record Nil {}
|
||||
record Cons { param head: int; type tail; }
|
||||
|
||||
type myList = Cons(1, Cons(2, Cons(3, Nil)));
|
||||
```
|
||||
</div>
|
||||
<div>
|
||||
|
||||
```Haskell
|
||||
data ListOfInts = Nil
|
||||
| Cons Int ListOfInts
|
||||
|
||||
myList = Cons 1 (Cons 2 (Cons 3 Nil))
|
||||
```
|
||||
</div>
|
||||
</div>
|
||||
* We can do the same thing for our binary search tree:
|
||||
|
||||
<div class="side-by-side">
|
||||
<div>
|
||||
|
||||
```Chapel
|
||||
record Empty {}
|
||||
record Node { param value: int; type left; type right; }
|
||||
|
||||
type balancedOneTwoThree = Node(2, Node(1, Empty, Empty),
|
||||
Node(3, Empty, Empty));
|
||||
```
|
||||
</div>
|
||||
<div>
|
||||
|
||||
```Haskell
|
||||
data BSTree = Empty
|
||||
| Node Int BSTree BSTree
|
||||
|
||||
balancedOneTwoThree = Node 2 (Node 1 Empty Empty)
|
||||
(Node 3 Empty Empty)
|
||||
```
|
||||
</div>
|
||||
</div>
|
||||
|
||||
---
|
||||
|
||||
# A General Recipe
|
||||
|
||||
To translate a Haskell data type definition to Chapel:
|
||||
|
||||
* For each constructor, define a `record` with that constructor's name
|
||||
* The fields of that record are `type` fields for each argument of the constructor
|
||||
- If the argument is a value (like `Int`), you can make it a `param` field instead
|
||||
* A visual example, again:
|
||||
|
||||
<div class="side-by-side">
|
||||
<div>
|
||||
|
||||
```Chapel
|
||||
record C1 { type arg1; /* ... */ type argi; }
|
||||
// ...
|
||||
record Cn { type arg1; /* ... */ type argj; }
|
||||
```
|
||||
</div>
|
||||
<div>
|
||||
|
||||
```Haskell
|
||||
data T = C1 arg1 ... argi
|
||||
| ...
|
||||
| Cn arg1 ... argj
|
||||
```
|
||||
</div>
|
||||
</div>
|
||||
|
||||
---
|
||||
|
||||
# Inserting and Looking Up in a BST
|
||||
|
||||
<div class="side-by-side">
|
||||
<div>
|
||||
|
||||
```Chapel
|
||||
|
||||
proc insert(type t: Empty, param x: int) type do return Node(x, Empty, Empty);
|
||||
proc insert(type t: Node(?v, ?left, ?right), param x: int) type do
|
||||
select true {
|
||||
when x < v do return Node(v, insert(left, x), right);
|
||||
otherwise do return Node(v, left, insert(right, x));
|
||||
}
|
||||
|
||||
type test = insert(insert(insert(Empty, 2), 1), 3);
|
||||
|
||||
|
||||
|
||||
proc lookup(type t: Empty, param x: int) param do return false;
|
||||
proc lookup(type t: Node(?v, ?left, ?right), param x: int) param do
|
||||
select true {
|
||||
when x == v do return true;
|
||||
when x < v do return lookup(left, x);
|
||||
otherwise do return lookup(right, x);
|
||||
}
|
||||
```
|
||||
</div>
|
||||
<div>
|
||||
|
||||
```Haskell
|
||||
insert :: Int -> BSTree -> BSTree
|
||||
insert x Empty = Node x Empty Empty
|
||||
insert x (Node v left right)
|
||||
|
||||
| x < v = Node v (insert x left) right
|
||||
| otherwise = Node v left (insert x right)
|
||||
|
||||
|
||||
test = insert 3 (insert 1 (insert 2 Empty))
|
||||
|
||||
|
||||
lookup :: Int -> BSTree -> Bool
|
||||
lookup x Empty = False
|
||||
lookup x (Node v left right)
|
||||
|
||||
| x == v = True
|
||||
| x < v = lookup x left
|
||||
| otherwise = lookup x right
|
||||
|
||||
```
|
||||
</div>
|
||||
</div>
|
||||
|
||||
It really works!
|
||||
|
||||
```Chapel
|
||||
writeln(test : string);
|
||||
// prints Node(2,Node(1,Empty,Empty),Node(3,Empty,Empty))
|
||||
|
||||
writeln(lookup(test, 1));
|
||||
// prints true for this one, but false for '4'
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
# A Key-Value Map
|
||||
```Chapel
|
||||
record Empty {}
|
||||
record Node { param key: int; param value; type left; type right; }
|
||||
|
||||
proc insert(type t: Empty, param k: int, param v) type do return Node(k, v, Empty, Empty);
|
||||
proc insert(type t: Node(?k, ?v, ?left, ?right), param nk: int, param nv) type do
|
||||
select true {
|
||||
when nk < k do return Node(k, v, insert(left, nk, nv), right);
|
||||
otherwise do return Node(k, v, left, insert(right, nk, nv));
|
||||
}
|
||||
|
||||
proc lookup(type t: Empty, param k: int) param do return "not found";
|
||||
proc lookup(type t: Node(?k, ?v, ?left, ?right), param x: int) param do
|
||||
select true {
|
||||
when x == k do return v;
|
||||
when x < k do return lookup(left, x);
|
||||
otherwise do return lookup(right, x);
|
||||
}
|
||||
|
||||
type test = insert(insert(insert(Empty, 2, "two"), 1, "one"), 3, "three");
|
||||
writeln(lookup(test, 1)); // prints "one"
|
||||
writeln(lookup(test, 3)); // prints "three"
|
||||
writeln(lookup(test, 4)); // prints "not found"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
# Conclusion
|
||||
|
||||
* Chapel's type-level programming is surprisingly powerful.
|
||||
* We can write compile-time programs that are very similar to Haskell programs.
|
||||
* This allows us to write highly parameterized code without paying runtime overhead.
|
||||
* This also allows us to devise powerful compile-time checks and constraints on our code.
|
||||
* This approach allows for general-purpose programming, which can be applied to `your use-case`
|
Loading…
Reference in New Issue
Block a user