33 Commits

Author SHA1 Message Date
3816599d7a Add once-useful template
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2026-03-22 01:47:49 -05:00
64b2fec8c0 Update .gitignore
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2026-03-22 01:47:25 -05:00
e4e513faf0 Add once-useful lua script
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2026-03-22 01:46:54 -05:00
a8f69a71f0 Add missing HTML file
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2026-03-22 01:32:59 -05:00
9660c0d665 [WIP] Add ability to play sound
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2025-12-06 17:42:59 -08:00
1e5ed34f0f Start working on a computational workbook for music theory
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2025-09-21 14:05:15 -07:00
7fbd4ea9f8 Update theme with syntax highlighting fix
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2025-04-12 17:56:44 -07:00
6fd1e1962b Update theme
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2025-03-30 23:15:54 -07:00
62c338e382 Add a post on Chapel's runtime types
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2025-03-02 22:56:10 -08:00
40ea9ec637 Update theme for newer Hugo
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2025-03-02 19:00:05 -08:00
787e194d41 Update theme with new Hugo support
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2025-03-02 13:42:42 -08:00
71162e2db9 Update nokogiri more
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2025-03-02 13:36:08 -08:00
43debc65e4 Update nokogiri
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2025-03-02 13:27:09 -08:00
b07ea85b70 Update ruby scripts to use 'File.exist?'
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2025-03-02 11:32:55 -08:00
06e8b8e022 Keep KaTeX css files
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2025-02-23 17:01:49 -08:00
647f47a5f3 Remove KaTeX CSS includes if we don't need them.
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2025-02-23 14:43:49 -08:00
36e4feb668 Update theme
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2025-02-23 14:14:15 -08:00
11be991946 Print number of files processed 2025-02-23 13:25:18 -08:00
a8c2b1d05a Fix bug in subsetting script
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2025-02-23 13:16:33 -08:00
fb46142e9d Remove incorrect print
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2025-02-23 12:45:24 -08:00
0b33d03b73 Pass the font folder as part of argv
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2025-02-23 12:35:45 -08:00
804147caef Read SVG path from the command line
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2025-02-23 12:34:44 -08:00
d847d20666 Adjust Python script to also just accept HTML files as args
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2025-02-23 12:28:19 -08:00
07408d01a9 Use ARGV like other ruby post-processing scripts
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2025-02-23 12:22:06 -08:00
816a473913 Add a (ChatGPT-provided) script to perform subsetting
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2025-02-23 12:05:02 -08:00
ce8f8fb872 Add a (ChatGPT-provided) script to subset feather icons
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2025-02-23 11:58:41 -08:00
2f60004241 Update theme with instanced fonts
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2025-02-23 11:43:38 -08:00
7130c6bd11 Fix cross-linking in whitespace-trimmed files
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2024-12-26 12:35:03 -08:00
c5aacc060a Update theme with font improvements
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2024-12-26 11:53:29 -08:00
6048dc0b9c Mark SPA series as completed
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2024-12-25 19:07:26 -08:00
1f01c3caff Publish the last two posts in the SPA series
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2024-12-25 19:05:28 -08:00
bca44343eb Update theme with required shortcodes
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2024-12-25 19:04:10 -08:00
3b9c2edcdd Write up the "verified" portion of the forward analysis
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2024-12-25 19:03:51 -08:00
28 changed files with 1844 additions and 21 deletions

.gitignore vendored

@@ -1 +1,4 @@
 **/build/*
+/.quarto/
+**/*.quarto_ipynb


@@ -3,11 +3,11 @@ GEM
   specs:
     duktape (2.7.0.0)
     execjs (2.9.1)
-    mini_portile2 (2.8.6)
-    nokogiri (1.15.6)
+    mini_portile2 (2.8.8)
+    nokogiri (1.18.3)
       mini_portile2 (~> 2.8.2)
       racc (~> 1.4)
-    racc (1.8.0)
+    racc (1.8.1)
 PLATFORMS
   ruby

_quarto.yml Normal file

@@ -0,0 +1,3 @@
format:
  gfm:
    variant: +yaml_metadata_block

agda.rb

@@ -23,7 +23,7 @@ class AgdaContext
     return @file_infos[file] if @file_infos.include? file
     @file_infos[file] = line_infos = {}
-    unless File.exists?(file)
+    unless File.exist?(file)
       return line_infos
     end
@@ -160,6 +160,14 @@ class FileGroup
         line_range = 1..
       end
+      # Sometimes, code is deeply nested in the source file, but we don't
+      # want to show the leading space. In that case, the generator sets
+      # data-source-offset with how much leading space was stripped off.
+      initial_offset = 0
+      if source_offset_attr = t.attribute("data-source-offset")
+        initial_offset = source_offset_attr.to_s.to_i
+      end
       full_path = t.attribute("data-file-path").to_s
       full_path_dirs = Pathname(full_path).each_filename.to_a
@@ -195,7 +203,7 @@ class FileGroup
         line_info = agda_info[line_no]
         next unless line_info
-        offset = 0
+        offset = initial_offset
         line.traverse do |lt|
           if lt.text?
             content = lt.content


@@ -43,7 +43,7 @@ files.each do |file|
   tags = []
   group = 1
   draft = false
-  next unless File.exists?(file)
+  next unless File.exist?(file)
   value = File.size(file)
   url = file.gsub(/^content/, "https://danilafe.com").delete_suffix("/index.md").delete_suffix(".md")
   File.readlines(file).each do |l|


@@ -26,7 +26,7 @@ files = ARGV
 code_paths = Dir.entries(root_path).select do |f|
   File.directory?(File.join(root_path, f)) and f != '.' and f != '..'
 end.to_set
-code_paths += JSON.parse(File.read(data_file)).keys if File.exists? data_file
+code_paths += JSON.parse(File.read(data_file)).keys if File.exist? data_file
 # Extending code_paths from submodules.json means that nested Agda modules
 # have their root dir correctly set.


@@ -0,0 +1,49 @@
#!/usr/bin/env ruby
# frozen_string_literal: true

require 'nokogiri'
require 'set'

# 1) Process all files passed in from the command line
svgpath = ARGV[0]
files = ARGV[1..]

# 2) Extract used Feather icons
used_icons = Set.new

files.each do |file|
  # Parse each HTML file
  doc = File.open(file, "r:UTF-8") { |f| Nokogiri::HTML(f) }

  # Look for <use xlink:href="/feather-sprite.svg#iconName">
  doc.css("use").each do |use_tag|
    href = use_tag["xlink:href"] || use_tag["href"]
    if href && href.start_with?("/feather-sprite.svg#")
      icon_name = href.split("#").last
      used_icons << icon_name
    end
  end
end

puts "Found #{used_icons.size} unique icons: #{used_icons.to_a.join(', ')}"

# 3) Load the full feather-sprite.svg as XML
sprite_doc = File.open(svgpath, "r:UTF-8") { |f| Nokogiri::XML(f) }

# 4) Create a new SVG with only the required symbols
new_svg = Nokogiri::XML::Document.new
svg_tag = Nokogiri::XML::Node.new("svg", new_svg)
svg_tag["xmlns"] = "http://www.w3.org/2000/svg"
new_svg.add_child(svg_tag)

sprite_doc.css("symbol").each do |symbol_node|
  if used_icons.include?(symbol_node["id"])
    # Duplicate the symbol node (so it can be inserted in the new document)
    svg_tag.add_child(symbol_node.dup)
  end
end

# 5) Save the subset sprite
File.open(svgpath, "w:UTF-8") do |f|
  f.write(new_svg.to_xml)
end

chatgpt-subset-one-go.py Normal file

@@ -0,0 +1,69 @@
import os
import sys
from bs4 import BeautifulSoup
from fontTools.subset import Subsetter, Options
from fontTools.ttLib import TTFont

FONT_EXTENSIONS = (".ttf", ".woff", ".woff2", ".otf")  # Font file types

def extract_text_from_html(file_path):
    """Extract text content from a single HTML file."""
    with open(file_path, "r", encoding="utf-8") as f:
        soup = BeautifulSoup(f.read(), "html.parser")
    return soup.get_text()

def get_used_characters(files):
    """Collect unique characters from all .html files in the given directory."""
    char_set = set()
    for file in files:
        text = extract_text_from_html(file)
        char_set.update(text)
    return "".join(sorted(char_set))

def find_font_files(directory):
    """Find all font files in the given directory, recursively."""
    font_files = []
    for root, _, files in os.walk(directory):
        for file in files:
            if file.endswith(FONT_EXTENSIONS):
                font_files.append(os.path.join(root, file))
    return font_files

def subset_font_in_place(font_path, characters):
    """Subsets the given font file to include only the specified characters."""
    # Convert characters to their integer code points
    unicode_set = {ord(c) for c in characters}

    font = TTFont(font_path)
    options = Options()
    options.drop_tables += ["DSIG"]
    options.drop_tables += ["LTSH", "VDMX", "hdmx", "gasp"]
    options.unicodes = unicode_set
    options.variations = False
    options.drop_variations = True
    options.layout_features = ["*"]  # keep all OT features
    options.hinting = False

    # Preserve original format if it was WOFF/WOFF2
    if font_path.endswith(".woff2"):
        options.flavor = "woff2"
    elif font_path.endswith(".woff"):
        options.flavor = "woff"

    subsetter = Subsetter(options)
    subsetter.populate(unicodes=unicode_set)
    subsetter.subset(font)

    # Overwrite the original font file
    font.save(font_path)
    print(f"Subsetted font in place: {font_path}")

if __name__ == "__main__":
    used_chars = get_used_characters(sys.argv[2:])
    print(f"Extracted {len(used_chars)} unique characters from {len(sys.argv[2:])} HTML files.")

    font_files = find_font_files(sys.argv[1])
    print(f"Found {len(font_files)} font files to subset.")

    for font_file in font_files:
        subset_font_in_place(font_file, used_chars)


@@ -339,9 +339,8 @@ this means the rule applies to (object) variables declared to have type
 our system. A single rule takes care of figuring the types of _all_
 variables.
-{{< todo >}}
-The rest of this, but mostly statements.
-{{< /todo >}}
+> [!TODO]
+> The rest of this, but mostly statements.
 ### This Page at a Glance
 #### Metavariables


@@ -55,6 +55,7 @@ Now, let's start with the least element of our lattice, denoted \(\bot\).
A lattice of finite height is guaranteed to have such an element. If it didn't,
we could always extend chains by tacking on a smaller element to their bottom,
and then the lattice wouldn't have a finite height anymore.
{#start-least}
Now, apply \(f\) to \(\bot\) to get \(f(\bot)\). Since \(\bot\) is the least
element, it must be true that \(\bot \le f(\bot)\). Now, if it's "less than or equal",


@@ -262,6 +262,7 @@ expressions, and the letter \(v\) to stand for values. Finally, we'll write
\(\rho, e \Downarrow v\) to say that "in an environment \(\rho\), expression \(e\)
evaluates to value \(v\)". Our two previous examples of evaluating `x+1` can
thus be written as follows:
{#notation-for-environments}
{{< latex >}}
\{ \texttt{x} \mapsto 42 \}, \texttt{x}+1 \Downarrow 43 \\


@@ -4,13 +4,26 @@ series: "Static Program Analysis in Agda"
 description: "In this post, I use the monotone lattice framework and verified CFGs to define a sign analysis"
 date: 2024-12-01T15:09:07-08:00
 tags: ["Agda", "Programming Languages"]
-draft: true
 ---
 In the previous post, I showed that the Control Flow graphs we built of our
 programs match how they are really executed. This means that we can rely
 on these graphs to compute program information. In this post, we finally
-get to compute that information. Let's jump right into it!
+get to compute that information. Here's a quick paraphrase from last time
+that summarizes our approach:
+1. We will construct a finite-height lattice. Every single element of this
+   lattice will contain information about each variable at each node in the
+   Control Flow Graph.
+2. We will then define a monotone function that updates this information using
+   the structure encoded in the CFG's edges and nodes.
+3. Then, using the fixed-point algorithm, we will find the least element of the
+   lattice, which will give us a precise description of all program variables at
+   all points in the program.
+4. Because we have just validated our CFGs to be faithful to the language's
+   semantics, we'll be able to prove that our algorithm produces accurate results.
+Let's jump right into it!
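To make the plan above concrete, here is a small, illustrative Python sketch of steps 1 through 3: a toy one-variable sign lattice, a monotone update, and naive fixed-point iteration. All the names here are mine, not the post's Agda definitions.

```python
# An illustrative sketch of the approach: a toy sign lattice with least
# element "bot" and greatest element "top", plus naive fixed-point iteration.

BOT, TOP = "bot", "top"

def join(a, b):
    """Least upper bound on the flat lattice {bot, "+", "0", "-", top}."""
    if a == b:
        return a
    if a == BOT:
        return b
    if b == BOT:
        return a
    return TOP

def fixed_point(f, x=BOT):
    """Iterate f starting from the least element until nothing changes.
    This terminates because the lattice has finite height and f is monotone."""
    while True:
        nxt = f(x)
        if nxt == x:
            return x
        x = nxt

# Example: a monotone function that forces the sign up to at least "+".
print(fixed_point(lambda s: join(s, "+")))  # prints: +
```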
### Choosing a Lattice
A lot of this time, we have been [talking about lattices]({{< relref "01_spa_agda_lattices" >}}),
@@ -59,6 +72,7 @@ split our programs into sequences of statements that are guaranteed to execute
together --- basic blocks. For our analysis, we'll keep per-variable information for
each basic block in the program. Since basic blocks are nodes in the Control Flow
Graph of our program, our whole lattice will be as follows:
{#whole-lattice}
{{< latex >}}
\text{Info} \triangleq \text{NodeId} \to (\text{Variable} \to \text{Sign})
@@ -112,6 +126,7 @@ exact values of `x`, `y`, and `z`, leaving only their signs). The exact details
of how this partial evaluation is done are analysis-specific; in general, we
simply require an analysis to provide an evaluator. We will define
[an evaluator for the sign lattice below](#instantiating-with-the-sign-lattice).
{#general-evaluator}
{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 166 167 >}}
@@ -136,6 +151,7 @@ current block. Early on, we saw that [the \((\sqcup)\) operator models disjuncti
\((\sqcup)\) to the variable-sign maps of all predecessors. The
[reference _Static Program Analysis_ text](https://cs.au.dk/~amoeller/spa/)
calls this operation \(\text{JOIN}\):
{#join-preds}
{{< latex >}}
\textit{JOIN}(v) = \bigsqcup_{w \in \textit{pred}(v)} \llbracket w \rrbracket
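As a hedged illustration of the formula above (the real definition is the Agda code in the post; these names and the toy join are mine), \(\textit{JOIN}\) can be sketched as a pointwise fold over the predecessors' variable-sign maps:

```python
# Illustrative sketch of JOIN: combine the variable-sign maps of all
# predecessor nodes with the join operator, pointwise per variable.

BOT, TOP = "bot", "top"

def sign_join(a, b):
    """Toy least upper bound on the flat sign lattice."""
    if a == b:
        return a
    if a == BOT:
        return b
    if b == BOT:
        return a
    return TOP

def join_maps(m1, m2):
    """Pointwise join of two variable-sign maps."""
    out = dict(m1)
    for var, sign in m2.items():
        out[var] = sign_join(out[var], sign) if var in out else sign
    return out

def JOIN(node, preds, result):
    """result[w] is the variable-sign map computed for node w."""
    acc = {}
    for w in preds(node):
        acc = join_maps(acc, result[w])
    return acc

# A node whose two predecessors carry different information about x:
result = {1: {"x": "+"}, 2: {"x": BOT, "y": "0"}}
print(JOIN(3, lambda n: [1, 2], result))  # {'x': '+', 'y': '0'}
```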
@@ -272,6 +288,7 @@ Actually, we haven't yet seen that `updateVariablesFromStmt`. This is
a function that we can define using the user-provided abstract interpretation
`eval`. Specifically, it handles the job of updating the sign of a variable
once it has been assigned to (or doing nothing if the statement is a no-op).
{#define-updateVariablesFromStmt}
{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 191 193 >}}


@@ -0,0 +1,547 @@
---
title: "Implementing and Verifying \"Static Program Analysis\" in Agda, Part 9: Verifying the Forward Analysis"
series: "Static Program Analysis in Agda"
description: "In this post, I prove that the sign analysis from the previous post is correct"
date: 2024-12-25T19:00:00-08:00
tags: ["Agda", "Programming Languages"]
left_align_code: true
---
In the previous post, we put together a number of powerful pieces of machinery
to construct a sign analysis. However, we still haven't verified that this
analysis produces correct results. For the most part, we already have the
tools required to demonstrate correctness; the most important one
is the [validity of our CFGs]({{< relref "07_spa_agda_semantics_and_cfg" >}})
relative to [the semantics of the little language]({{< relref "05_spa_agda_semantics" >}}).
### High-Level Algorithm
We'll keep working with the sign lattice as an example, keeping in mind
how what we do generalizes to any lattice \(L\) describing a variable's
state. The general shape of our argument will be as follows, where I've underlined and
numbered assumptions or aspects that we have yet to provide.
1. Our fixed-point analysis from the previous section gave us a result \(r\)
that satisfies the following equation:
{{< latex >}}
r = \text{update}(\text{join}(r))
{{< /latex >}}
Above \(\text{join}\) applies the [predecessor-combining function]({{< relref "08_spa_agda_forward#join-preds" >}})
from the previous post to each state (corresponding to `joinAll` in Agda)
and \(\text{update}\) performs one round of abstract interpretation.
2. Because of the [correspondence of our semantics and CFGs]({{< relref "07_spa_agda_semantics_and_cfg" >}}),
each program evaluation in the form \(\rho, s \Rightarrow \rho'\)
corresponds to a path through the Control Flow Graph. Along the path,
each node contains simple statements, which correspond to intermediate steps
in evaluating the program. These will also be in the form
\(\rho_1, b \Rightarrow \rho_2\).
3. We will proceed iteratively, stepping through the trace one basic block at
a time. At each node in the graph:
* We will assume that the beginning state (the variables in \(\rho_1\)) are
{{< internal "correctly-described" >}}
correctly described
{{< /internal >}}
by one of the predecessors of the current node. Since
{{< internal "disjunction" >}}
joining represents "or"
{{< /internal >}},
that is the same
as saying that \(\text{join}(r)\)
contains an accurate description of \(\rho_1\).
* Because
{{< internal "abstract-interpretation" >}}
the abstract interpretation function preserves accurate descriptions
{{< /internal >}},
if \(\text{join}(r)\) contains an accurate description of \(\rho_1\), then applying our
abstract interpretation function via \(\text{update}\) should result in
a map that contains an accurately described \(\rho_2\). In other words, \(\text{update}(\text{join}(r))\)
describes \(\rho_2\) at the current block.
{{< internal "equivalence" >}}
By the equation above
{{< /internal >}}, that's the same as saying
\(r\) describes \(\rho_2\) at the current block.
* Since the trace is a path through a graph, there must be an edge from
the current basic block to the next. This means that the current basic
block is a predecessor of the next one. From the previous point, we know
that \(\rho_2\) is accurately described by this predecessor, fulfilling
our earlier assumption and allowing us to continue iteration.
So, what are the missing pieces?
1. We need to define what it means for a lattice (like our sign lattice)
to "correctly describe" what happens when evaluating a program for real.
For example, the \(+\) in sign analysis describes values that are bigger than zero,
and a map like `{x:+}` states that `x` can only take on positive values.
2. We've seen before [the \((\sqcup)\) operator models disjunction
("A or B")]({{< relref "01_spa_agda_lattices#lub-glub-or-and" >}}), but
that was only an informal observation; we'll need to specify it precisely.
3. Each analysis [provides an abstract interpretation `eval` function]({{< relref "08_spa_agda_forward#general-evaluator" >}}).
However, until now, nothing has formally constrained this function; we could
return \(+\) in every case, even though that would not be accurate. We will
need, for each analysis, a proof that its `eval` preserves accurate descriptions.
4. The equalities between our lattice elements [are actually equivalences]({{< relref "01_spa_agda_lattices#definitional-equality" >}}),
which helps us use simpler representations of data structures. Thus, even
in statements of the fixed point algorithm, our final result is a value \(a\)
such that \(a \approx f(a)\). We need to prove that our notion of equivalent
lattice elements plays nicely with correctness.
Let's start with the first bullet point.
### A Formal Definition of Correctness
When a variable is mapped to a particular sign (like `{ "x": + }`),
what that really says is that the value of `x` is greater than zero. Recalling
from [the post about our language's semantics]({{< relref "05_spa_agda_semantics#notation-for-environments" >}})
that we use the symbol \(\rho\) to represent mappings of variables to
their values, we might write this claim as:
{{< latex >}}
\rho(\texttt{x}) > 0
{{< /latex >}}
This is a good start, but it's a little awkward defining the meaning of "plus"
by referring to the context in which it's used (the `{ "x": ... }` portion
of the expression above). Instead, let's associate with each sign (like \(+\)) a
predicate: a function that takes a value, and makes a claim about that value
("this is positive"):
{{< latex >}}
\llbracket + \rrbracket\ v = v > 0
{{< /latex >}}
The notation above is a little weird unless you, like me, have a background in
Programming Language Theory (❤️). This comes from [denotational semantics](https://en.wikipedia.org/wiki/Denotational_semantics);
generally, one writes:
{{< latex >}}
\llbracket \text{thing} \rrbracket = \text{the meaning of the thing}
{{< /latex >}}
Where \(\llbracket \cdot \rrbracket\) is really a function (we call it
the _semantic function_) that maps things to
their meaning. Then, the above equation is similar to the more familiar
\(f(x) = x+1\): function and arguments on the left, definition on the right. When
the "meaning of the thing" is itself a function, we could write it explicitly
using lambda-notation:
{{< latex >}}
\llbracket \text{thing} \rrbracket = \lambda x.\ \text{body of the function}
{{< /latex >}}
Or, we could use the Haskell style and write the new variable on the left of
the equality:
{{< latex >}}
\llbracket \text{thing} \rrbracket\ x = \text{body of the function}
{{< /latex >}}
That is precisely what I'm doing above with \(\llbracket + \rrbracket\).
With this in mind, we could define the entire semantic function for the
sign lattice as follows:
{{< latex >}}
\llbracket + \rrbracket\ v = v\ \texttt{>}\ 0 \\
\llbracket 0 \rrbracket\ v = v\ \texttt{=}\ 0 \\
\llbracket - \rrbracket\ v = v\ \texttt{<}\ 0 \\
\llbracket \top \rrbracket\ v = \text{true} \\
\llbracket \bot \rrbracket\ v = \text{false}
{{< /latex >}}
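As a quick sanity check, the same table transcribes directly into Python (an illustrative sketch; the post's real definition is the Agda code discussed next):

```python
# The semantic function for signs: each lattice element denotes a
# predicate on integer values. Names here are illustrative only.
def denote(sign):
    """Map a sign to the predicate on integers that it denotes."""
    return {
        "+":   lambda v: v > 0,
        "0":   lambda v: v == 0,
        "-":   lambda v: v < 0,
        "top": lambda v: True,   # "top" describes every value
        "bot": lambda v: False,  # "bot" describes no value
    }[sign]

print(denote("+")(42), denote("bot")(42))  # True False
```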
In Agda, the integer type already distinguishes between "negative natural" or
"positive natural" cases, which made it possible to define the semantic function
{{< sidenote "right" "without-note" "without using inequalities." >}}
Reasoning about inequalities is painful, sometimes requiring a number of
lemmas to arrive at a result that is intuitively obvious. Coq has a powerful
tactic called <a href="https://coq.inria.fr/doc/v8.11/refman/addendum/micromega.html#coq:tacn.lia"><code>lia</code></a>
that automatically solves systems of inequalities, and I use it liberally.
However, lacking such a tactic in Agda, I would like to avoid inequalities
if they are not needed.
{{< /sidenote >}}
{{< codelines "agda" "agda-spa/Analysis/Sign.agda" 114 119 >}}
Notably, \(\llbracket \top \rrbracket\ v\) always holds, and
\(\llbracket \bot \rrbracket\ v\) never does. __In general__, we will always
need to define a semantic function for whatever lattice we are choosing for
our analysis.
It's important to remember from the previous post that the sign lattice
(or, more generally, our lattice \(L\)) is only a component of the
[lattice we use to instantiate the analysis]({{< relref "08_spa_agda_forward#whole-lattice" >}}).
We at least need to define what it means for the \(\text{Variable} \to \text{Sign}\)
portion of that lattice to be correct. This way, we'll have correctness
criteria for each key (CFG node) in the top-level \(\text{Info}\) lattice.
Since a map from variables to their sign characterizes not a single value \(v\)
but a whole environment \(\rho\), something like this is a good start:
{{< latex >}}
\llbracket \texttt{\{} x_1: s_1, ..., x_n: s_n \texttt{\}} \rrbracket\ \rho = \llbracket s_1 \rrbracket\ \rho(x_1)\ \text{and}\ ...\ \text{and}\ \llbracket s_n \rrbracket\ \rho(x_n)
{{< /latex >}}
As a concrete example, we might get:
{{< latex >}}
\llbracket \texttt{\{} \texttt{x}: +, \texttt{y}: - \texttt{\}} \rrbracket\ \rho = \rho(\texttt{x})\ \texttt{>}\ 0\ \text{and}\ \rho(\texttt{y})\ \texttt{<}\ 0
{{< /latex >}}
This is pretty good, but not quite right. For instance, the initial state of
the program --- before running the analysis --- assigns \(\bot\) to each
element. This is true because our fixed-point algorithm [starts with the least
element of the lattice]({{< relref "04_spa_agda_fixedpoint#start-least" >}}).
But even for a single-variable map `{x: ⊥ }`, the semantic function above would
give:
{{< latex >}}
\llbracket \texttt{\{} \texttt{x}: \bot \texttt{\}} \rrbracket\ \rho = \text{false}
{{< /latex >}}
That's clearly not right: our initial state should be possible, lest
the entire proof be just a convoluted [_ex falso_](https://en.wikipedia.org/wiki/Principle_of_explosion)!
There is another tricky aspect of our analysis, which is primarily defined
[using the join (\(\sqcup\)) operator]({{< relref "08_spa_agda_forward#join-preds" >}}).
Observe the following example:
```C
// initial state: { x: ⊥ }
if b {
x = 1; // state: { x: + }
} else {
// state unchanged: { x: ⊥ }
}
// state: { x: + } ⊔ { x: ⊥ } = { x: + }
```
Notice that in the final state, the sign of `x` is `+`, even though when
`b` is `false`, the variable is never set. In a simple language like ours,
without variable declaration points, this is probably the best we could hope
for. The crucial observation, though, is that the oddness only comes into
play with variables that are not set. In the "initial state" case, none
of the variables have been modified; in the `else` case of the conditional,
`x` was never assigned to. We can thus relax our condition to an if-then:
if a variable is in our environment \(\rho\), then the variable-sign lattice's
interpretation accurately describes it.
{{< latex >}}
\begin{array}{ccc}
\llbracket \texttt{\{} x_1: s_1, ..., x_n: s_n \texttt{\}} \rrbracket\ \rho & = & & \textbf{if}\ x_1 \in \rho\ \textbf{then}\ \llbracket s_1 \rrbracket\ \rho(x_1)\ \\ & & \text{and} & ... \\ & & \text{and} & \textbf{if}\ x_n \in \rho\ \textbf{then}\ \llbracket s_n \rrbracket\ \rho(x_n)
\end{array}
{{< /latex >}}
The first "weird" case now results in the following:
{{< latex >}}
\llbracket \texttt{\{} \texttt{x}: \bot \texttt{\}} \rrbracket\ \rho = \textbf{if}\ \texttt{x} \in \rho\ \textbf{then}\ \text{false}
{{< /latex >}}
Which is just another way of saying:
{{< latex >}}
\llbracket \texttt{\{} \texttt{x}: \bot \texttt{\}} \rrbracket\ \rho = \texttt{x} \notin \rho
{{< /latex >}}
In the second case, the interpretation also results in a true statement:
{{< latex >}}
\llbracket \texttt{\{} \texttt{x}: + \texttt{\}} \rrbracket\ \rho = \textbf{if}\ \texttt{x} \in \rho\ \textbf{then}\ \texttt{x} > 0
{{< /latex >}}
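The relaxed, if-then interpretation of a whole variable-sign map can also be sketched in Python (illustrative names, not the post's Agda): a map accurately describes an environment \(\rho\) when every variable actually present in \(\rho\) satisfies its sign's predicate, and absent variables impose no constraint.

```python
# Illustrative sketch of the if-then semantic function for variable maps.
def denote(sign):
    """Each sign denotes a predicate on integer values."""
    return {
        "+":   lambda v: v > 0,
        "0":   lambda v: v == 0,
        "-":   lambda v: v < 0,
        "top": lambda v: True,
        "bot": lambda v: False,
    }[sign]

def denote_map(vs, rho):
    """vs maps variables to signs; rho maps variables to values.
    Only variables present in rho are constrained."""
    return all(denote(s)(rho[x]) for x, s in vs.items() if x in rho)

# The two "weird" cases worked out above:
print(denote_map({"x": "bot"}, {}))        # True: x is not in rho
print(denote_map({"x": "bot"}, {"x": 1}))  # False: "bot" never holds
print(denote_map({"x": "+"}, {"x": 1}))    # True: 1 > 0
```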
In Agda, I encode the fact that a verified analysis needs a semantic function
\(\llbracket\cdot\rrbracket\) for its element lattice \(L\) by taking such
a function as an argument called `⟦_⟧ˡ`:
{{< codelines "Agda" "agda-spa/Analysis/Forward.agda" 246 253 "hl_lines=5" >}}
I then define the semantic function for the variable-sign lattice in the following
way, which eschews the "..." notation in favor of a more Agda-compatible (and
equivalent) form:
{{< codelines "Agda" "agda-spa/Analysis/Forward.agda" 255 256 >}}
The above reads roughly as follows:
> For every variable `k` and sign [or, more generally, lattice element] `l` in
> the variable map lattice, if `k` is in the environment `ρ`, then it satisfies
> the predicate given by the semantic function applied to `l`.
Let's recap: we have defined a semantic function for our sign lattice, and
noted that to define a verified analysis, we always need such a semantic function.
We then showed how to construct a semantic function for a whole variable map
(of type \(\text{Variable} \to \text{Sign}\), or \(\text{Variable}\to L\)
in general). We also wrote some Agda code doing all this. As a result, we
have filled in the missing piece for {{< internalref "correctly-described" >}}property{{< /internalref >}}.
However, the way that we brought in the semantic function in the Agda code
above hints that there's more to be discussed. What's `latticeInterpretationˡ`?
In answering that question, we'll provide evidence for
{{< internalref "disjunction" >}}property{{< /internalref >}}
and
{{< internalref "equivalence" >}}property{{< /internalref >}}.
### Properties of the Semantic Function
As we briefly saw earlier, we loosened the notion of equality to that of equivalences,
which made it possible to ignore things like the ordering of key-value pairs
in maps. That's great and all, but nothing is stopping us from defining semantic functions that violate our equivalence!
Supposing \(a \approx f(a)\), as far
as Agda is concerned, even though \(a\) and \(f(a)\) are "equivalent",
\(\llbracket a \rrbracket\) and \(\llbracket f(a) \rrbracket\) may be
totally different. For a semantic function to be correct, it must produce
the same predicate for equivalent elements of lattice \(L\). That's
{{< internalref "equivalence" >}}missing piece{{< /internalref >}}.
Another property of semantic functions that we will need to formalize
is that \((\sqcup)\) represents disjunction.
This comes into play when we reason about the correctness of predecessors in
a Control Flow Graph. Recall that during the last step of processing a given node,
when we are trying to move on to the next node in the trace, we have knowledge
that the current node's variable map accurately describes the intermediate
environment. In other words, \(\llbracket \textit{vs}_i \rrbracket\ \rho_2\) holds, where
\(\textit{vs}_i\) is the variable map for the current node. We can generalize this
knowledge a little, and get:
{{< latex >}}
\llbracket \textit{vs}_1 \rrbracket\ \rho_2\ \text{or}\ ...\ \text{or}\ \llbracket \textit{vs}_n \rrbracket\ \rho_2
{{< /latex >}}
However, the assumption that we _need_ to hold when moving on to a new node
is in terms of \(\textit{JOIN}\), which combines all the predecessors' maps
\(\textit{vs}_1, ..., \textit{vs}_n\) using \((\sqcup)\). Thus, we will need to be in a world where
the following claim is true:
{{< latex >}}
\llbracket \textit{vs}_1 \sqcup ... \sqcup \textit{vs}_n \rrbracket\ \rho
{{< /latex >}}
To get from one to the other, we will need to rely explicitly on the fact
that \((\sqcup)\) encodes "or". It's not necessary for the forward analysis,
but a similar property ought to hold for \((\sqcap)\) and "and". This
constraint provides {{< internalref "disjunction" >}}missing piece{{< /internalref >}}.
I defined a new data type that bundles a semantic function with proofs of
the properties in this section; that's precisely what `latticeInterpretationˡ`
is:
{{< codelines "Agda" "agda-spa/Language/Semantics.agda" 66 73 >}}
In short, to leverage the framework for verified analysis, you would need to
provide a semantic function that interacts properly with `≈` and `⊔`.
### Correctness of the Evaluator
All that's left is {{< internalref "abstract-interpretation" >}}the last missing piece{{< /internalref >}},
which requires that `eval` matches the semantics of our language. Recall
the signature of `eval`:
{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 166 166 >}}
It operates on expressions and variable maps, which themselves associate a
sign (or, generally, an element of lattice \(L\)), with each variable. The
"real" evaluation judgement, on the other hand, is in the form
\(\rho, e \Downarrow v\), and reads "expression \(e\) in environment \(\rho\)
evaluates to value \(v\)". In Agda:
{{< codelines "agda" "agda-spa/Language/Semantics.agda" 27 27 >}}
Let's line up the types of `eval` and the judgement. I'll swap the order of arguments
for `eval` to make the correspondence easier to see:
{{< latex >}}
\begin{array}{ccccccc}
\text{eval} & : & (\text{Variable} \to \text{Sign}) & \to & \text{Expr} & \to & \text{Sign} \\
\cdot,\cdot\Downarrow\cdot & : & (\text{Variable} \to \text{Value}) & \to & \text{Expr} & \to & \text{Value} & \to & \text{Set} \\
& & \underbrace{\phantom{(\text{Variable} \to \text{Value})}}_{\text{environment-like inputs}} & & & & \underbrace{\phantom{Value}}_{\text{value-like outputs}}
\end{array}
{{< /latex >}}
Squinting a little, it's almost like the signature of `eval` is the signature
for the evaluation judgement, but it forgets a few details (the exact values
of the variables) in favor of abstractions (their signs). To show that `eval`
behaves correctly, we'll want to prove that this forgetful correspondence holds.
Concretely, for any expression \(e\), take some environment \(\rho\), and "forget"
the exact values, getting a sign map \(\textit{vs}\). Now, evaluate the expression
to some value \(v\) using the semantics, and also, compute the expression's
expected sign \(s\) using `eval`. The computed sign should be exactly what we
get by forgetting \(v\)'s exact value. Mathematically,
{{< latex >}}
\forall e, \rho, v, \textit{vs}.\ \textbf{if}\ \llbracket\textit{vs}\rrbracket \rho\ \text{and}\ \rho, e \Downarrow v\ \textbf{then}\ \llbracket \text{eval}\ \textit{vs}\ e\rrbracket v
{{< /latex >}}
In Agda:
{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 286 287 >}}
For a concrete analysis, we need to prove the above claim. In the case of
sign analysis, this boils down to a rather cumbersome proof by cases. I will collapse
the proofs to save some space and avoid overwhelming the reader.
{{< codelines "agda" "agda-spa/Analysis/Sign.agda" 237 258 "" "**(Click here to expand the proof of correctness for plus)**" >}}
{{< codelines "agda" "agda-spa/Analysis/Sign.agda" 261 282 "" "**(Click here to expand the proof of correctness for minus)**" >}}
{{< codelines "agda" "agda-spa/Analysis/Sign.agda" 284 294 "" >}}
This completes {{< internalref "abstract-interpretation" >}}our last missing piece{{< /internalref >}}.
All that's left is to put everything together.
### Proving The Analysis Correct
#### Lifting Expression Evaluation Correctness to Statements
The individual analyses (like the sign analysis) provide only an evaluation
function for _expressions_, and thus only have to prove correctness of
that function. However, our language is made up of statements, with judgements
of the form \(\rho, s \Rightarrow \rho'\). Now that we've shown (or assumed)
that `eval` behaves correctly when evaluating expressions, we should show
that this correctness extends to evaluating statements, which in the
forward analysis implementation is handled by the
[`updateVariablesFromStmt` function]({{< relref "08_spa_agda_forward#define-updateVariablesFromStmt" >}}).
The property we need to show looks very similar to the property for `eval`:
{{< latex >}}
\forall b, \rho, \rho', \textit{vs}.\ \textbf{if}\ \llbracket\textit{vs}\rrbracket \rho\ \text{and}\ \rho, b \Rightarrow \rho'\ \textbf{then}\ \llbracket \text{updateVariablesFromStmt}\ \textit{vs}\ b\rrbracket \rho'
{{< /latex >}}
In Agda:
{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 291 291 >}}
The proof is straightforward, and relies on the semantics of the [map update]({{< relref "08_spa_agda_forward#generalized-update" >}}).
Specifically, in the case of an assignment statement \(x \leftarrow e\), all we
do is store the new sign computed from \(e\) into the map at \(x\). To
prove the correctness of the entire final environment \(\rho'\), there are
two cases to consider:
* A variable in question is the newly-updated \(x\). In this case, since
`eval` produces correct signs, the variable clearly has the correct sign.
This is the first highlighted chunk in the below code.
* A variable in question is different from \(x\). In this case, its value
in the environment \(\rho'\) should be the same as it was prior, and
its sign in the updated variable map is the same as it was in the original.
Since the original map correctly described the original environment, we know
the sign is correct. This is the second highlighted chunk in the below
code.
The corresponding Agda proof is as follows:
{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 291 305 "hl_lines=5-7 10-15" >}}
From this, it follows with relative ease that each basic block in the CFG,
when evaluated, produces an environment that matches the prediction of our
forward analysis.
{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 318 318 >}}
#### Walking the Trace
Finally, we get to the meat of the proof, which follows the [outline](#high-level-algorithm). First,
let's take a look at `stepTrace`, which implements the second bullet in
our iterative procedure. I'll show the code, then we can discuss it
in detail.
{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 324 342 >}}
The first `let`-bound variable, `⟦joinAll-result⟧ρ₁`, is an intermediate
result, which I was forced to introduce because `rewrite` caused Agda to
allocate ~100GB of memory. It simply makes use of the fact that `joinAll`, the
function that performs predecessor joining for each node in the CFG, sets
every key of the map accordingly.
The second `let`-bound variable, `⟦analyze-result⟧`, steps through a given
node's basic block and leverages our proof of statement-correctness to validate
that the final environment `ρ₂` matches the prediction of the analyzer.
The last two `let`-bound variables apply the equation we wrote above:
{{< latex >}}
r = \text{update}(\text{join}(r))
{{< /latex >}}
Recall that `analyze` is the combination of `update` and `join`:
{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 226 227 >}}
Finally, the `in` portion of the code uses `⟦⟧ᵛ-respects-≈ᵛ`, a proof
of {{< internalref "equivalence" >}}property{{< /internalref >}}, to produce
the final claim in terms of the `result` map.
Knowing how to step, we can finally walk the entire trace, implementing
the iterative process:
{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 344 357 >}}
The first step --- assuming that one of the predecessors of the
current node satisfies the initial environment `ρ₁` --- is captured by
the presence of the argument `⟦joinForKey-s₁⟧ρ₁`. We expect the calling code
to provide a proof of that.
The second step, in both cases, is implemented using `stepTrace`,
as we saw above. That results in a proof that at the end of the current basic
block, the final environment `ρ₂` is accurately described.
From there, we move on to the third iterative step, if necessary. The
sub-expression `edge⇒incoming s₁→s₂` validates that, since we have an edge
from the current node to the next, we are listed as a predecessor. This,
in turn, means that we are included in the list of states-to-join for the
\(\textit{JOIN}\) function. That fact is stored in `s₁∈incomingStates`.
Finally, relying on
{{< internalref "disjunction" >}}property{{< /internalref >}},
we construct an assumption fit for a recursive invocation of `walkTrace`,
and move on to the next CFG node. The `foldr` here is motivated by the fact
that "summation" using \((\sqcup)\) is a fold.
When the function terminates, what we have is a proof that the final program
state is accurately described by the results of our program analysis. All
that's left is to kick off the walk. To do that, observe that the initial state
has no predecessors (how could it, if it's at the beginning of the program?).
That, in turn, means that this state maps every variable to the bottom element.
Such a variable configuration only permits the empty environment \(\rho = \varnothing\).
If the program evaluation starts in an empty environment, we have the assumption
needed to kick off the iteration.
{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 359 366 "hl_lines=7" >}}
Take a look at the highlighted line in the above code block in particular.
It states precisely what we were hoping to see: that, when evaluating
a program, the final state when it terminates is accurately described by
the `result` of our static program analysis at the `finalState` in the CFG.
We have done it!
### Future Work
It took a lot of machinery to get where we are, but there is still plenty
left to do.
1. __Correctness beyond the final state__: the statement we've arrived at
only shows that the final state of the program matches the results of
the analysis. In fact, the property holds for all intermediate states, too.
The only snag is that it's more difficult to _state_ such a claim.
To do something like that, we probably need a notion of "incomplete evaluations"
of our language, which run our program but stop at some point before the end.
A full execution would be a special case of such an "incomplete evaluation"
that stops in the final state. Then, we could restate `analyze-correct`
in terms of partial evaluations, which would strengthen it.
2. __A more robust language and evaluation process__: we noted above that
our join-based analysis is a little bit weird, particularly in the
cases of uninitialized variables. There are ways to adjust our language
(e.g., introducing variable declaration points) and analysis functions
(e.g., only allowing assignment for declared variables) to reduce
the weirdness somewhat. They just lead to a more complicated language.
3. __A more general correctness condition__: converting lattice elements into
predicates on values gets us far. However, some types of analyses make claims
about more than the _current_ values of variables. For instance, _live variable
analysis_ checks if a variable's current value is going to be used in the
future. Such an analysis can help guide register (re)allocation. To
talk about future uses of a variable, the predicate will need to be formulated
in terms of the entire evaluation proof tree. This opens a whole can
of worms that I haven't begun to examine.
Now that I'm done writing up my code so far, I will start exploring these
various avenues of work. In the meantime, though, thanks for reading!

---
title: "Chapel's Runtime Types as an Interesting Alternative to Dependent Types"
date: 2025-03-02T22:52:01-08:00
tags: ["Chapel", "C++", "Idris", "Programming Languages"]
description: "In this post, I discuss Chapel's runtime types as a limited alternative to dependent types."
---
One day, when I was in graduate school, the Programming Languages research
group was in a pub for a little gathering. Amidst beers, fries, and overpriced
sandwiches, the professor and I were talking about [dependent types](https://en.wikipedia.org/wiki/Dependent_type). Speaking
loosely and imprecisely, these are types that are somehow constructed from
_values_ in a language, like numbers.
For example, in C++, [`std::array`](https://en.cppreference.com/w/cpp/container/array)
is a dependent type. An instantiation of the _type_ `array`, like `array<string, 3>`
is constructed from the type of its elements (here, `string`) and a value
representing the number of elements (here, `3`). This is in contrast with types
like `std::vector`, which only depends on a type (e.g., `vector<string>` would
be a dynamically-sized collection of strings).
I was extolling the virtues of general dependent types, like you might find
in [Idris](https://www.idris-lang.org/) or [Agda](https://agda.readthedocs.io/en/latest/getting-started/what-is-agda.html):
more precise function signatures! The
{{< sidenote "right" "curry-howard-note" "Curry-Howard isomorphism!" >}}
The Curry-Howard isomorphism is a common theme on this blog. I've
<a href="{{< relref "typesafe_interpreter_revisited#curry-howard-correspondence" >}}">
written about it myself</a>, but you can also take a look at the
<a href="https://en.wikipedia.org/wiki/Curry%E2%80%93Howard_correspondence">
Wikipedia page</a>.
{{< /sidenote >}} The professor was skeptical. He had been excited about
dependent types in the past, but nowadays he felt over them. They were cool, he
said, but there are few practical uses. In fact, he posed a challenge:
> Give me one good reason to use dependent types in practice that doesn't
> involve keeping track of bounds for lists and matrices!
{#bounds-quote}
This challenge alludes to fixed-length lists -- [vectors](https://agda.github.io/agda-stdlib/master/Data.Vec.html)
-- which are one of the first dependently-typed data structures one learns about.
Matrices are effectively vectors-of-vectors. In fact, even in giving my introductory
example above, I demonstrated the C++ equivalent of a fixed-length list, retroactively
supporting the professor's point.
It's not particularly important to write down how I addressed the challenge;
suffice it to say that the notion resonated with some of the other
students present in the pub. In the midst of practical development, how much
of dependent types' power can you leverage, and how much power do you pay
for but never use?
A second round of beers arrived. The argument was left largely unresolved,
and conversation flowed to other topics. Eventually, I graduated, and started
working on the [Chapel language](https://chapel-lang.org/) team (I also
[write on the team's blog](https://chapel-lang.org/blog/authors/daniel-fedorin/)).
When I started looking at Chapel programs, I could not believe my eyes...
### A Taste of Chapel's Array Types
Here's a simple Chapel program that creates an array of 10 integers.
```Chapel
var A: [0..9] int;
```
Do you see the similarity to the `std::array` example above? Of course, the
syntax is quite different, but in _essence_ I think the resemblance is
uncanny. Let's mangle the type a bit --- producing invalid Chapel programs ---
just for the sake of demonstration.
```Chapel
var B: array(0..9, int); // first, strip the syntax sugar
var C: array(int, 0..9); // swap the order of the arguments to match C++
```
Only one difference remains: in C++, arrays are always indexed from zero. Thus,
writing `array<int, 10>` would implicitly create an array whose indices start
with `0` and end in `9`. In Chapel, array indices can start at values other
than zero (it happens to be useful for elegantly writing numerical programs),
so the type explicitly specifies a lower and an upper bound. Other than that,
though, the two types look very similar.
In general, Chapel arrays have a _domain_, typically stored in variables like `D`.
The domain of `A` above is `{0..9}`. This domain is part of the array's type.
Before I move on, I'd like to pause and state a premise that is crucial
for the rest of this post: __I think knowing the size of a data structure,
like `std::array` or Chapel's `[0..9] int`, is valuable__. If this premise
were not true, there'd be no reason to prefer `std::array` to `std::vector`, or
care that Chapel has indexed arrays. However, having this information
can help in numerous ways, such as:
* __Enforcing compatible array shapes.__ For instance, the following Chapel
code would require the two arrays passed to `doSomething` to have the same size.
```Chapel
proc doSomething(people: [?D] person, data: [D] personInfo) {}
```
Similarly, we can enforce the fact that an input to a function has the same shape
as the output:
```Chapel
proc transform(input: [?D] int): [D] string;
```
* __Consistency in generics__. Suppose you have a generic function that declares
a new variable of a given type, and just returns it:
```Chapel
proc defaultValue(type argType) {
  var x: argType;
  return x;
}
```
Code like this exists in "real" Chapel software, by the way --- the example
is not contrived. By including the bounds in the array type, we can
ensure that `x` is appropriately allocated. Then, `defaultValue([1,2,3].type)`
would return an array of three default-initialized integers.
* __Eliding bounds checking__. Bounds checking is useful for safety,
since it ensures that programs don't read or write past the end of allocated
memory. However, it is also slow. Consider the following function that
sums two arrays:
```Chapel
proc sumElementwise(A: [?D] int, B: [D] int) {
  var C: [D] int;
  for idx in D do
    C[idx] = A[idx] + B[idx];
}
```
Since arrays `A`, `B`, and `C` have the same domain `D`, we don't need
to do bounds checking when accessing any of their elements. I don't believe
this is currently an optimization in Chapel, but it's certainly on the
table.
* __Documentation__. Including the size of the array as part of type
signature clarifies the intent of the code being written. For instance,
in the following function:
```Chapel
proc sendEmails(numEmails: int, destinationAddrs: [1..numEmails] address) { /* ... */ }
```
It's clear from the type of `destinationAddrs` that there ought to
be exactly as many destination addresses as the number of emails that should
be sent.
Okay, recap: C++ has `std::array`, which is a dependently-typed container
that represents an array with a fixed number of elements. Chapel has something
similar. I think these types are valuable.
At this point, it sort of looks like I'm impressed with Chapel for copying a C++
feature from 2011. Not so! As I played with Chapel programs more and more,
arrays miraculously supported patterns that I knew I couldn't write in C++.
The underlying foundation of Chapel's array types is quite unlike any other.
Before we get to that, though, let's take a look at how dependent types
are normally used (by us mere mortal software engineers).
### Difficulties with Dependent Types
Let's start by looking at a simple operation on fixed-length lists: reversing them.
One might write a reverse function for "regular" lists, ignoring details
like ownership and copying, that looks like this:
```C++
std::vector<int> reverse(std::vector<int>);
```
This function is not general: it won't help us reverse lists of
strings, for instance. The "easy fix" is to replace `int` with some kind
of placeholder that can be replaced with any type.
```C++
std::vector<T> reverse(std::vector<T>);
```
You can try compiling this code, but you will immediately run into an error.
What the heck is `T`? Normally,
when we name a variable, function, or type (e.g., by writing `vector`, `reverse`),
we are referring to its declaration somewhere else. At this time, `T` is not
declared anywhere. It just "appears" in our function's type. To fix this,
we add a declaration for `T` by turning `reverse` into a template:
```C++
template <typename T>
std::vector<T> reverse(std::vector<T>);
```
The new `reverse` above takes two arguments: a type and a list of values of
that type. So, to _really_ call this `reverse`, we need to feed the type
of our list's elements into it. This is normally done automatically
(in C++ and otherwise) but under the hood, invocations might look like this:
```C++
reverse<int>({1,2,3}); // produces 3, 2, 1
reverse<string>({"world", "hello"}); // produces "hello", "world"
```
This is basically what we have to do to write `reverse` on `std::array`, which
includes an additional parameter that encodes its length. We might start with
the following (using `n` as a placeholder for length, and observing that
reversing an array doesn't change its length):
```C++
std::array<T, n> reverse(std::array<T, n>);
```
Once again, to make this compile, we need to add template parameters for `T` and `n`.
```C++
template <typename T, size_t n>
std::array<T, n> reverse(std::array<T, n>);
```
Now, you might be asking...
{{< dialog >}}
{{< message "question" "reader" >}}
This section is titled "Difficulties with Dependent Types". What's the difficulty?
{{< /message >}}
{{< /dialog >}}
Well, here's the kicker. C++ templates are a __compile-time mechanism__. As
a result, arguments to `template` (like `T` and `n`) must be known when the
program is being compiled. This, in turn, means
{{< sidenote "right" "deptype-note" "the following program doesn't work:" >}}
The observant reader might have noticed that one of the Chapel programs we
saw above, <code>sendEmails</code>, does something similar. The
<code>numEmails</code> argument is used in the type of the
<code>destinationAddrs</code> parameter. That program is valid Chapel.
{{< /sidenote >}}
```C++
void buildArray(size_t len) {
std::array<int, len> myArray;
// do something with myArray
}
```
You can't use these known-length types like `std::array` with any length
that is not known at compile-time. But that's a lot of things! If you're reading
from an input file, chances are, you don't know how big that file is. If you're
writing a web server, you likely don't know the length of the HTTP requests.
With every setting a user can tweak when running your code, you sacrifice the
ability to use templated types.
Also, how do you _return_ a `std::array`? If the size of the returned array is
known in advance, you just list that size:
```C++
std::array<int, 10> createArray();
```
If the size is not known at compile-time, you might want to do something like
the following --- using an argument `n` in the type of the returned array ---
but it would not compile:
```C++
auto computeNNumbers(size_t n) -> std::array<int, n>; // not valid C++
```
Moreover, you can't have `createArray` first figure out the required
array size and _then_ return an array that big, even if the body of
`createArray` performed only compile-time computations.
What you would need is to provide a "bundle" of a value and a type that is somehow
built from that value.
```C++
// magic_pair is invented syntax, will not even remotely work
auto createArray() -> magic_pair<size_t size, std::array<int, size>>;
```
This pair contains a `size` (suppose it's known at compilation time for
the purposes of appeasing C++) as well as an array that uses that `size`
as its template argument. This is not real C++ -- not even close -- but
such pairs are a well-known concept. They are known as
[dependent pairs](https://unimath.github.io/agda-unimath/foundation.dependent-pair-types.html),
or, if you're trying to impress people, \(\Sigma\)-types. In Idris, you
could write `createArray` like this:
```Idris
createArray : () -> (n : Nat ** Vect n Int)
```
There are languages out there -- that are not C++, alas -- that support
dependent pairs, and as a result make it more convenient to use types that
depend on values. Not only that, but a lot of these languages do not force
dependent types to be determined at compile-time. You could write that
coveted `readArrayFromFile` function:
```Idris
readArrayFromFile : String -> IO (n : Nat ** Vect n String)
```
Don't mind `IO`; in pure languages like Idris, this type is a necessity when
interacting with the outside world, reading data in and sending it out. The key is that
`readArrayFromFile` produces, at runtime, a pair of `n`, which is the size
of the resulting array, and a `Vec` of that many `String`s (e.g., one string
per line of the file).
Dependent pairs are cool and very general. However, when a type's bounds
are not determined at compile-time, the end result is that you're
_required_ to use dependent pairs. Thus, you must always carry the array's length
together with the array itself.
The bottom line is this:
* In true dependently typed languages, a type that depends on a value (like `Vect`
in Idris) lists that value in its type. When this value is listed by
referring to an identifier --- like `n` in `Vect n String` above --- this
identifier has to be defined somewhere, too. This necessitates dependent pairs,
in which the first element is used syntactically as the "definition point"
of a type-level value. For example, in the following piece of code:
```Idris
(n : Nat ** Vect n String)
```
The `n : Nat` part of the pair serves both to say that the first element
is a natural number, and to introduce a variable `n` that refers to
this number, so that the second type (`Vect n String`) can refer to it.
A lot of the time, you end up carrying this extra value (bound to `n` above)
with your type.
* In more mainstream languages, things are even more restricted: dependently
typed values are a compile-time property, and thus, cannot be used with
runtime values like data read from a file, arguments passed in to a function,
etc.
### Hiding Runtime Values from the Type
Let's try to think of ways to make things more convenient. First of all, as
we saw, in Idris, it's possible to use runtime values in types. Not only that,
but Idris is a compiled language, so presumably we can compile dependently typed programs
with runtime-enabled dependent types. The trick is to forget some information:
turn a vector `Vect n String` into two values (the size of the vector and the
vector itself), and forget -- for the purposes of generating code -- that they're
related. Whenever you pass in a `Vect n String`, you can compile that similarly
to how you'd compile passing in a `Nat` and `List String`. Since the program has
already been type checked, you can be assured that you don't encounter cases
when the size and the actual vector are mismatched, or anything else of that
nature.
Additionally, you don't always need the length of the vector at all. In a
good chunk of Idris code, the size arguments are only used to ensure type
correctness and rule out impossible cases; they are never accessed at runtime.
As a result, you can _erase_ the size of the vector altogether. In fact,
[Idris 2](https://github.com/idris-lang/Idris2/) leans on [Quantitative Type Theory](https://bentnib.org/quantitative-type-theory.html)
to make erasure easier.
At this point, one way or another, we've "entangled" the vector with a value
representing its size:
* When a vector of some (unknown, but fixed) length needs to be produced from
a function, we use dependent pairs.
* Even in other cases, when compiling, we end up treating a vector as a
length value and the vector itself.
Generally speaking, a good language design practice is to hide extraneous
complexity, and to remove as much boilerplate as possible. If the size
value of a vector is always joined at the hip with the vector, can we
avoid having to explicitly write it?
This is pretty much exactly what Chapel does. It _allows_ explicitly writing
the domain of an array as part of its type, but doesn't _require_ it. When
you do write it (re-using my original snippet above):
```Chapel
var A: [0..9] int;
```
What you are really doing is creating a value (the [range](https://chapel-lang.org/docs/primers/ranges.html) `0..9`),
and entangling it with the type of `A`. This is very similar to what a language
like Idris would do under the hood to compile a `Vec`, though it's not quite
the same.
At the same time, you can write code that omits the bounds altogether:
```Chapel
proc processArray(A: [] int): int;
proc createArray(): [] int;
```
In all of these examples, there is an implicit runtime value (the bounds)
that is associated with the array's type. However, we are never forced to
explicitly thread through or include a size. Where reasoning about them is not
necessary, Chapel's domains are hidden away. Chapel refers to the implicitly
present value associated with an array type as its _runtime type_.
I hinted earlier that things are not quite the same in this representation
as they are in my simplified model of Idris. In Idris, as I mentioned earlier,
the values corresponding to vectors' indices can be erased if they are not used.
In Chapel, this is not the case --- a domain always exists at runtime. At the
surface level, this means that you may pay for more than what you use. However,
domains enable a number of interesting patterns of array code. We'll get
to that in a moment; first, I want to address a question that may be on
your mind:
{{< dialog >}}
{{< message "question" "reader" >}}
At this point, this looks just like keeping a <code>.length</code> field as
part of the array value. Most languages do this. What's the difference
between this and Chapel's approach?
{{< /message >}}
{{< /dialog >}}
This is a fair question. The key difference is that the domain exists even if an array
does not. The following is valid Chapel code (re-using the `defaultValue`
snippet above):
```Chapel
proc defaultValue(type argType) {
  var x: argType;
  return x;
}
proc doSomething() {
  type MyArray = [1..10] int;
  var A = defaultValue(MyArray);
}
```
Here, we created an array `A` with the right size (10 integer elements)
without having another existing array as a reference. This might seem like
a contrived example (I could've just as well written `var A: [1..10] int`),
but the distinction is incredibly helpful for generic programming. Here's
a piece of code from the Chapel standard library, which implements
a part of Chapel's [reduction](https://chapel-lang.org/docs/primers/reductions.html) support:
{{< githubsnippet "chapel-lang/chapel" "e8ff8ee9a67950408cc6d4c3220ac647817ddae3" "modules/internal/ChapelReduce.chpl" "Chapel" 146 >}}
inline proc identity {
var x: chpl__sumType(eltType); return x;
}
{{< /githubsnippet >}}
Identity elements are important when performing operations like sums and products,
for many reasons. For one, they tell you what the sum (e.g.) should be when there
are no elements at all. For another, they can be used as an initial value for
an accumulator. In Chapel, when you are performing a reduction, there is a
good chance you will need several accumulators --- one for each thread performing
a part of the reduction.
That `identity` function looks almost like `defaultValue`! Since it builds the
identity element from the type, and since the type includes the array's dimensions,
summing an array-of-arrays, even if it's empty, will produce the correct output.
```Chapel
type Coordinate = [1..3] real;
var Empty: [0..<0] Coordinate;
writeln(+ reduce Empty); // sum up an empty list of coordinates
```
As I mentioned before, having the domain be part of the type can also enable
indexing optimizations --- without any need for [interprocedural analysis](https://en.wikipedia.org/wiki/Interprocedural_optimization) ---
in functions like `sumElementwise`:
```Chapel
proc sumElementwise(A: [?D] int, B: [D] int) {
  var C: [D] int;
  for idx in D do
    C[idx] = A[idx] + B[idx];
}
```
The C++ equivalent of this function --- using `std::vector` to enable arbitrary-size
lists of numbers read from user input, and `.at` to enable bounds checks ---
does not include enough information for this optimization to be possible.
```C++
void sumElementwise(std::vector<int> A, std::vector<int> B) {
  std::vector<int> C(A.size());
  for (size_t i = 0; i < A.size(); i++) {
    C.at(i) = A.at(i) + B.at(i);
  }
}
```
All in all, this makes for a very interesting mix of features:
* __Chapel arrays have their bounds as part of types__, like `std::array` in C++
and `Vect` in Idris. This enables all the benefits I've described above.
* __The bounds don't have to be known at compile-time__, like all dependent
types in Idris. This means you can read arrays from files (e.g.) and still
reason about their bounds as part of the type system.
* __Domain information can be hidden when it's not used__, and does not require
explicit additional work like template parameters or dependent pairs.
Most curiously, runtime types only extend to arrays and domains. In that sense,
they are not a general purpose replacement for dependent types. Rather,
they make arrays and domains special, and single out the exact case my
professor was [talking about in the introduction](#bounds-quote). Although
at times I've [twisted Chapel's type system in unconventional ways](https://chapel-lang.org/blog/posts/linear-multistep/)
to simulate dependent types, rarely have I felt a need for them while
programming in Chapel. In that sense --- and in the "practical software engineering"
domain --- I may have been proven wrong.
### Pitfalls of Runtime Types
Should all languages do things the way Chapel does? I don't think so. Like
most features, runtime types like Chapel's are a language design
tradeoff. Though I've covered their motivation and semantics, perhaps
I should mention the downsides.
The greatest downside is that, generally speaking, _types are not always a
compile-time property_. We saw this earlier with `MyArray`:
```Chapel
type MyArray = [1..10] int;
```
Here, the domain of `MyArray` (one-dimensional with bounds `1..10`) is a runtime
value. It has an
{{< sidenote "right" "dce-note" "execution-time cost." >}}
The execution-time cost is, of course, modulo <a href="https://en.wikipedia.org/wiki/Dead-code_elimination">dead code elimination</a> etc. If
my snippet made up the entire program being compiled, the end result would
likely do nothing, since <code>MyArray</code> isn't used anywhere.
{{< /sidenote >}}
Moreover, types that serve as arguments to functions (like `argType` for
`defaultValue`), or as their return values (like the result of `chpl__sumType`)
also have an execution-time backing. This is quite different from most
compiled languages. For instance, in C++, templates are "stamped out" when
the program is compiled. A function with a `typename T` template parameter
called with type `int`, in terms of generated code, is always the same as
a function where you search-and-replaced `T` with `int`. This is called
[monomorphization](https://en.wikipedia.org/wiki/Monomorphization), by the
way. In Chapel, however, if the function is instantiated with an array type,
it will have an additional parameter, which represents the runtime component
of the array's type.
The fact that types are runtime entities means that compile-time type checking
is insufficient. Take, for instance, the above `sendEmails` function:
```Chapel
proc sendEmails(numEmails: int, destinationAddrs: [1..numEmails] address) { /* ... */ }
```
Since `numEmails` is a runtime value (it's a regular argument!), we can't ensure
at compile-time that a value of some array matches the `[1..numEmails] address`
type. As a result, Chapel defers bounds checking to when the `sendEmails`
function is invoked.
This leads to some interesting performance considerations. Take two Chapel records
(similar to `struct`s in C++) that simply wrap a value. In one of them,
we provide an explicit type for the field, and in the other, we leave the field
type generic.
```Chapel
record R1 { var field: [1..10] int; }
record R2 { var field; }
var A = [1,2,3,4,5,6,7,8,9,10];
var r1 = new R1(A);
var r2 = new R2(A);
```
In a conversation with a coworker, I learned that these two records do not
behave the same. That's because the record `R1` explicitly specifies a type
for `field`. Since the type has a runtime component, the constructor
of `R1` will actually perform a runtime check to ensure that the argument
has 10 elements. `R2` will not do this, since there isn't any other type
to check against.
Of course, the mere existence of an additional runtime component is a performance
consideration. To ensure that Chapel programs perform as well as possible,
the Chapel standard library attempts to avoid using runtime components
wherever possible. This leads to a distinction between a "static type"
(known at compile-time) and a "dynamic type" (requiring a runtime value).
The `chpl__sumType` function mentioned above uses only the static components of
types, because we don't want each call to `+ reduce` to run a number
of extraneous runtime queries.
### Conclusion
Though runtime types are not a silver bullet, I find them to be an elegant
middle-ground solution to the problem of tracking array bounds. They enable
optimizations, generic programming, and more, without the complexity of
a fully dependently-typed language. They are also quite unlike anything I've
seen in any other language.
What's more, this post only scratches the surface of what's possible using
arrays and domains. Besides encoding array bounds, domains include information
about how an array is distributed across several nodes (see the
[distributions primer](https://chapel-lang.org/docs/primers/distributions.html)),
and how it's stored in memory (see the [sparse computations](https://chapel-lang.org/blog/posts/announcing-chapel-2.3/#sparse-computations)
section of the recent 2.3 release announcement). In general, they are a very
flavorful component of Chapel's "special sauce" as a language for parallel
computing.
You can read more about arrays and domains in the [corresponding primer](https://chapel-lang.org/docs/primers/arrays.html).

View File

@@ -17,7 +17,8 @@ spend time explaining dependent types, nor the syntax for them in Idris,
which is the language I'll use in this article. Below are a few resources
that should help you get up to speed.
{{< todo >}}List resources{{< /todo >}}
> [!TODO]
> List resources
We've seen that, given a function `F a -> a`, we can define a function
`B -> a`, if `F` is a base functor of the type `B`. However, what if

View File

@@ -0,0 +1,484 @@
---
title: "Some Music Theory From (Computational) First Principles"
date: 2025-09-20T18:36:28-07:00
draft: true
filters: ["./to-parens.lua"]
custom_js: ["playsound.js"]
---
Sound is a perturbation in air pressure that our ear recognizes and interprets.
A note, which is a fundamental building block of music, is a perturbation
that can be described by a sine wave. All sine waves have a specific and
unique frequency. The frequency of a note determines how it sounds (its
_pitch_). Pitch is a matter of our perception; however, it happens to
correlate with frequency, such that notes with higher frequencies
are perceived as higher pitches.
Let's encode a frequency as a class in Python.
```{python}
#| echo: false
import math
import colorsys
```
```{python}
class Frequency:
    def __init__(self, hz):
        self.hz = hz
```
```{python}
#| echo: false
def points_to_polyline(points, color):
    return """<polyline fill="none" stroke="{color}"
                        stroke-width="4"
                        points="{points_str}" />""".format(
        color=color,
        points_str=" ".join(f"{x},{y}" for x, y in points)
    )

def wrap_svg(inner_svg, width, height):
    return f"""<svg xmlns="http://www.w3.org/2000/svg"
        width="{width}" height="{height}"
        viewBox="0 0 {width} {height}">
{inner_svg}
</svg>"""

OVERLAY_HUE = 0

class Superimpose:
    def __init__(self, *args):
        global OVERLAY_HUE
        self.args = args
        self.color = hex_from_hsv(OVERLAY_HUE, 1, 1)
        OVERLAY_HUE = (OVERLAY_HUE + 1.618033988) % 1

    def points(self, width, height):
        if not self.args:
            # Nothing to superimpose: a flat line through the middle.
            return [(x, height/2) for x in range(width)]
        first_points = [(x, y - height/2) for (x, y) in self.args[0].points(width, height)]
        for thing in self.args[1:]:
            other_points = thing.points(width, height)
            for (i, ((x1, y1), (x2, y2))) in enumerate(zip(first_points, other_points)):
                assert x1 == x2
                first_points[i] = (x1, y1 + (y2 - height/2))
        # normalize
        max_y = max(abs(y) for x, y in first_points)
        if max_y > height / 2:
            first_points = [(x, y * (0.9 * height / 2) / max_y) for x, y in first_points]
        return [(x, height/2 + y) for x, y in first_points]

    def get_color(self):
        return self.color

    def _repr_svg_(self):
        width = 720
        height = 100
        points = self.points(width, height)
        return wrap_svg(
            points_to_polyline(points, self.get_color()),
            width, height
        )

def hugo_shortcode(body):
    return "{{" + "< " + body + " >" + "}}"

class PlayNotes:
    def __init__(self, *hzs):
        self.hzs = hzs

    def _repr_html_(self):
        toplay = ",".join([str(hz) for hz in self.hzs])
        return hugo_shortcode(f"playsound \"{toplay}\"")

class VerticalStack:
    def __init__(self, *args):
        self.args = args

    def _repr_svg_(self):
        width = 720
        height = 100
        buffer = 10
        polylines = []
        for (i, arg) in enumerate(self.args):
            offset = i * (height + buffer)
            points = [(x, y + offset) for (x, y) in arg.points(width, height)]
            polylines.append(points_to_polyline(points, arg.get_color()))
        return wrap_svg(
            "".join(polylines),
            width, len(self.args) * (height + buffer)
        )

def hex_from_hsv(h, s, v):
    r, g, b = colorsys.hsv_to_rgb(h, s, v)
    return f"#{int(r*255):02x}{int(g*255):02x}{int(b*255):02x}"

def color_for_frequency(hz):
    while hz < 261.63:
        hz *= 2
    while hz > 523.25:
        hz /= 2
    hue = math.log2(hz / 261.63) * 360
    return hex_from_hsv(hue / 360, 1, 1)

def Frequency_points(self, width=720, height=100):
    # let 261.63 Hz be 5 periods in the width
    points = []
    period = width / 5
    for x in range(width):
        y = 0.9 * height / 2 * math.sin(x/period * self.hz / 261.63 * 2 * math.pi)
        points.append((x, height/2 - y))
    return points

def Frequency_get_color(self):
    return color_for_frequency(self.hz)

def Frequency_repr_svg_(self):
    # the container on the blog is 720 pixels wide. Use that.
    width = 720
    height = 100
    points = self.points(width, height)
    return wrap_svg(
        points_to_polyline(points, self.get_color()),
        width, height
    )

Frequency.points = Frequency_points
Frequency.get_color = Frequency_get_color
Frequency._repr_svg_ = Frequency_repr_svg_
```
Let's take a look at a particular frequency. For reasons that are historical
and not particularly interesting to me, this frequency is called "middle C".
```{python}
middleC = Frequency(261.63)
middleC
```
```{python}
#| echo: false
PlayNotes(middleC.hz)
```
Great! Now, if you're a composer, you can play this note and make music out
of it. Except, music made with just one note is a bit boring, just like saying
the same word over and over again won't make for an interesting story.
No big deal -- we can construct a whole variety of notes by picking any
other frequency.
```{python}
g4 = Frequency(392.445)
g4
```
```{python}
#| echo: false
PlayNotes(g4.hz)
```
```{python}
fSharp4 = Frequency(370.000694) # we write this F#
fSharp4
```
```{python}
#| echo: false
PlayNotes(fSharp4.hz)
```
This is pretty cool. You can start making melodies with these notes, and sing
some jingles. However, if your friend sings along with you, and happens to
sing F# while you're singing the middle C, it's going to sound pretty awful.
So awful does it sound that somewhere around the 18th century, people started
calling it _diabolus in musica_ (the devil in music).
Why does it sound so bad? Let's take a look at the
{{< sidenote "right" "superposition-note" "superposition" >}}
When waves combine, they follow the principle of superposition. One way
to explain this is that their graphs are added to each other. In practice,
what this means is that two peaks in the same spot combine to a larger
peak, as do two troughs; on the other hand, a peak and a trough "cancel out"
and produce a "flatter" line.
{{< /sidenote >}} of these two
notes, which is what happens when they are played at the same time. For reasons I'm
going to explain later, I will multiply each frequency by 4. These frequencies
still sound bad together, but playing them higher lets me "zoom out" and
show you the bigger picture.
```{python}
Superimpose(Frequency(middleC.hz*4), Frequency(fSharp4.hz*4))
```
```{python}
#| echo: false
PlayNotes(middleC.hz, fSharp4.hz)
```
Looking at this picture, we can see that it's far more disordered than the
pure sine waves we've been looking at so far. There's not much of a pattern
to the peaks. This is interpreted by our brain as unpleasant.
{{< dialog >}}
{{< message "question" "reader" >}}
So there's no fundamental reason why these notes sound bad together?
{{< /message >}}
{{< message "answer" "Daniel" >}}
That's right. We might objectively characterize the combination of these
two notes as having a less clear periodicity, but that doesn't mean
it must fundamentally sound bad. Whether notes sound good or bad together
is a purely human judgement.
{{< /message >}}
{{< /dialog >}}
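We can make the "no nice pattern" intuition a bit more concrete with a quick
sketch. The idea here is mine, not a standard music-theory measure: approximate
each frequency ratio by a fraction with a small denominator. The smaller the
denominator we can get away with, the sooner the combined wave repeats.

```{python}
from fractions import Fraction

# The frequencies from earlier in this post.
fifth = Fraction(392.445 / 261.63).limit_denominator(1000)       # C and G
tritone = Fraction(370.000694 / 261.63).limit_denominator(1000)  # C and F#

print(fifth)    # a simple ratio: 3/2
print(tritone)  # nothing simple comes close
```

The C-and-G pair snaps to the simple ratio $3/2$, while the C-and-F# pair has
no nearby fraction with a small denominator, which matches how disordered its
graph looks.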
If picking two notes whose frequencies don't combine into a nice pattern
makes for a bad sound, then to make a good sound we ought to pick two notes
whose frequencies *do* combine into a nice pattern.
Playing the same frequency twice at the same time certainly will do it,
because both waves will have the exact same peaks and troughs.
```{python}
Superimpose(middleC, middleC)
```
In fact, this is just like playing one note, but louder. The fact of the matter
is that *playing any other frequency will mean that not all extremes of the graph align*.
We'll get graphs that are at least a little wonky. Intuitively, let's say
that a wonky graph has a nice pattern when it repeats quickly. That way,
there's less time for the graph to do other, unpredictable things.
What's the soonest we can get our combined graph to repeat? It can't
repeat any sooner than either one of the individual notes --- how could it?
We can work with that, though. If we make one note have exactly twice
the frequency of the other, then exactly at the moment the less frequent
note completes its first repetition, the more frequent note will complete
its second. That puts us right back where we started. Here's what this looks
like graphically:
```{python}
twiceMiddleC = Frequency(middleC.hz*2)
VerticalStack(
middleC,
twiceMiddleC,
Superimpose(middleC, twiceMiddleC)
)
```
```{python}
#| echo: false
PlayNotes(middleC.hz, twiceMiddleC.hz)
```
You can easily inspect the new graph to verify that it has a repeating pattern,
and that this pattern repeats exactly as frequently as the lower-frequency
note at the top. Indeed, these two notes sound quite good together. It turns
out, our brains consider them the same in some sense. If you have ever tried to
sing a song that was outside of your range (like me singing along to Taylor Swift),
chances are you sang notes that had half the frequency of the original.
We say that these notes are _in the same pitch class_. While only the first
of the two notes I showed above was the _middle_ C, we call both notes C.
To distinguish different-frequency notes of the same pitch class, we sometimes
number them. The ones in this example were C4 and C5.
We can keep applying this trick to get C6, C7, and so on.
```{python}
C = {}
note = middleC
for i in range(4, 10):
    C[i] = note
    note = Frequency(note.hz * 2)
C[4]
```
To get C3 from C4, we do the reverse, and halve the frequency.
```{python}
note = middleC
for i in range(4, 0, -1):
    C[i] = note
    note = Frequency(note.hz / 2)
C[1]
```
You might've already noticed, but I set up this page so that individual
sine waves in the same pitch class have the same color.
All of this puts us almost right back where we started. We might have
different pitches, but we've only got one pitch _class_. Let's try again.
Previously, we made it so the second repeat of one note lined up with the
first repeat of another. What if we pick another note that repeats _three_
times as often instead?
```{python}
thriceMiddleC = Frequency(middleC.hz*3)
VerticalStack(
middleC,
thriceMiddleC,
Superimpose(middleC, thriceMiddleC)
)
```
```{python}
#| echo: false
PlayNotes(middleC.hz, thriceMiddleC.hz)
```
That's not bad! These two sound good together as well, but they are not
in the same pitch class. There's only one problem: these notes are a bit
far apart in terms of pitch. That `thriceMiddleC` note is really high!
Wait a minute --- weren't we just talking about singing notes that were too
high at half their original frequency? We can do that here. The result is a
note we've already seen:
```{python}
print(thriceMiddleC.hz/2)
print(g4.hz)
```
```{python}
#| echo: false
PlayNotes(middleC.hz, g4.hz)
```
In the end, we got G4 by multiplying our original frequency by $3/2$. What if
we keep applying this process to find more notes? Let's not even worry
about the specific frequencies (like `261.63`) for a moment. We'll start
with a frequency of $1$. This makes our next frequency $3/2$. Taking this
new frequency and again multiplying it by $3/2$, we get $9/4$. But that
again puts us a little bit high: $9/4 > 2$. We can apply our earlier trick
and divide the result by two, getting $9/8$.
```{python}
from fractions import Fraction
note = Fraction(1, 1)
seen = {note}
while len(seen) < 6:
    new_note = note * 3 / 2
    if new_note > 2:
        new_note = new_note / 2
    seen.add(new_note)
    note = new_note
```
For an admittedly handwavy reason, let's also throw in one note that we
get from going _backwards_: dividing by $3/2$ instead of multiplying by it.
This division puts us below our original frequency, so let's double the result.
```{python}
# Throw in one more by going *backwards*. More on that in a bit.
seen.add(Fraction(2, 3) * 2)
fractions = sorted(list(seen))
fractions
```
```{python}
frequencies = [middleC.hz * float(frac) for frac in fractions]
frequencies
```
```{python}
steps = [frequencies[i+1]/frequencies[i] for i in range(len(frequencies)-1)]
minstep = min(steps)
print([math.log(step)/math.log(minstep) for step in steps])
```
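To see exactly what those step sizes are, we can redo the computation with
exact fractions rather than floating-point frequencies. (I'm restating the
seven ratios that the construction above produces.)

```{python}
from fractions import Fraction

# The seven frequency ratios built above, as exact fractions.
scale = [Fraction(1), Fraction(9, 8), Fraction(81, 64), Fraction(4, 3),
         Fraction(3, 2), Fraction(27, 16), Fraction(243, 128)]

# The ratios between successive notes come in exactly two sizes:
# a big "whole step" (9/8) and a small "half step" (256/243).
steps = [b / a for a, b in zip(scale, scale[1:])]
print(sorted(set(steps)))
```

The larger step is roughly the square of the smaller one, which is why the
logarithms above come out near one and two.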
Since peaks and troughs
in the final result arise when peaks and troughs in the individual waves align,
we want to pick two frequencies that align with a nice pattern.
But that raises the question: what determines
how quickly the pattern of two notes played together repeats?
We can look at things geometrically, by thinking about the distance
between two successive peaks in a single note. This is called the
*wavelength* of a wave. Take a wave with some wavelength $w$.
If we start at a peak, then travel a distance of $w$ from where we started,
there ought to be another peak. Arriving at a distance of $2w$ (still counting
from where we started), we'll see another peak. Continuing in the same pattern,
we'll see peaks at distances of $3w$, $4w$, and so on. For a second wave
with wavelength $v$, the same will be true: peaks at $v$, $2v$, $3v$, and so on.
If we travel a distance that happens to be a multiple of both $w$ and $v$,
then we'll have a place where both peaks are present. At that point, the
pattern starts again. Mathematically, we can
state this as follows:
$$
nw = mv,\ \text{for some integers}\ n, m
$$
As we decided above, we'll try to find a combination of wavelengths/frequencies
where the repetition happens early.
__TODO:__ Follow the logic from Digit Sum Patterns
The number of iterations of the smaller wave before we reach a cycle is:
$$
k = \frac{w}{\text{gcd}(w, v)}
$$
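Here's a small sketch of that formula, using exact rational wavelengths.
`frac_gcd` is a helper defined just for this illustration; it extends `gcd`
to fractions. Recall that wavelength is inversely proportional to frequency,
so doubling a frequency halves its wavelength.

```{python}
from fractions import Fraction
from math import gcd

def frac_gcd(a, b):
    # gcd extended to positive rationals: the largest rational
    # dividing both a and b a whole number of times
    return Fraction(gcd(a.numerator * b.denominator, b.numerator * a.denominator),
                    a.denominator * b.denominator)

def repeats_until_cycle(w, v):
    # k = w / gcd(w, v): repetitions of the shorter wave (wavelength v)
    # before the combined pattern starts over
    return w / frac_gcd(w, v)

print(repeats_until_cycle(Fraction(1), Fraction(1, 2)))  # octave: 2
print(repeats_until_cycle(Fraction(1), Fraction(2, 3)))  # perfect fifth: 3
```

This agrees with what we saw graphically: the octave's higher note repeats
twice per cycle, and the fifth's higher note three times.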
```{python}
Superimpose(Frequency(261.63), Frequency(261.63*2))
```
```{python}
Superimpose(Frequency(261.63*4), Frequency(392.445*4))
```
```{python}
VerticalStack(
Frequency(440*8),
Frequency(450*8),
Superimpose(Frequency(440*8), Frequency(450*8))
)
```
```{python}
from enum import Enum
class Note(Enum):
    C = 0
    Cs = 1
    D = 2
    Ds = 3
    E = 4
    F = 5
    Fs = 6
    G = 7
    Gs = 8
    A = 9
    As = 10
    B = 11

    def __add__(self, other):
        return Note((self.value + other.value) % 12)

    def __sub__(self, other):
        # `Interval` is not defined yet in this draft; a sketch of what's to come.
        return Interval((self.value - other.value + 12) % 12)
```
```{python}
#| echo: false
class MyClass:
    def _repr_svg_(self):
        return """<svg xmlns="http://www.w3.org/2000/svg"
            width="120" height="120" viewBox="0 0 120 120">
            <circle cx="60" cy="60" r="50" fill="red"/>
        </svg>"""

MyClass()
```

View File

@@ -0,0 +1,15 @@
window.addEventListener("load", (event) => {
  for (const elt of document.getElementsByClassName("mt-sound-play-button")) {
    elt.addEventListener("click", (event) => {
      const audioCtx = new (window.AudioContext || window.webkitAudioContext)();
      for (const freq of event.target.getAttribute("data-sound-info").split(",")) {
        const oscillator = audioCtx.createOscillator();
        oscillator.type = "sine"; // waveform: sine, square, sawtooth, triangle
        oscillator.frequency.value = parseFloat(freq); // Hz; frequencies may be fractional
        oscillator.connect(audioCtx.destination);
        oscillator.start();
        oscillator.stop(audioCtx.currentTime + 2); // stop after 2 seconds
      }
    });
  }
});

View File

@@ -0,0 +1,13 @@
function Image(el)
  local src = el.src
  if src:match("%.svg$") then
    local alt = el.caption and pandoc.utils.stringify(el.caption) or ""
    local obj = string.format(
      '{{< inlinesvg "%s" "%s" >}}',
      src, alt
    )
    return pandoc.RawInline("html", obj)
  else
    return el
  end
end

View File

@@ -0,0 +1,5 @@
function Math(el)
  if el.mathtype == "InlineMath" then
    return pandoc.RawInline("markdown", "\\(" .. el.text .. "\\)")
  end
end

View File

@@ -6,6 +6,6 @@ summary = """
in Agda. The goal is to have a formally verified, yet executable, static
analyzer for a simple language.
"""
status = "ongoing"
status = "complete"
divider = ": "
+++

View File

@@ -25,13 +25,16 @@ class KatexRenderer
end
def substitute(content)
found_any = false
rendered = content.gsub /\\\(((?:[^\\]|\\[^\)])*)\\\)/ do |match|
found_any = true
render(false, $~[1])
end
rendered = rendered.gsub /\$\$((?:[^\$]|$[^\$])*)\$\$/ do |match|
found_any = true
render(true, $~[1])
end
return rendered
return rendered, found_any
end
end
@@ -58,8 +61,20 @@ renderer = KatexRenderer.new(katex)
files.each do |file|
puts "Rendering file: #{file}"
document = Nokogiri::HTML.parse(File.open(file))
found_any = false
document.search('//*[not(ancestor-or-self::code or ancestor-or-self::script)]/text()').each do |t|
t.replace(renderer.substitute(t.content))
rendered, found_any_in_text = renderer.substitute(t.content)
found_any ||= found_any_in_text
t.replace(rendered)
end
# If we didn't find any mathematical equations, no need to include KaTeX CSS.
# Disabled here because Bergamot technically doesn't require math blocks
# on the page but does need the CSS.
#
# unless found_any
# document.css('link[href$="katex.css"], link[href$="katex.min.css"]').each(&:remove)
# end
File.write(file, document.to_html(encoding: 'UTF-8'))
end

View File

@@ -1,2 +1,2 @@
{{ $style := resources.Get "scss/donate.scss" | resources.ToCSS | resources.Minify }}
{{ $style := resources.Get "scss/donate.scss" | css.Sass | resources.Minify }}
<link rel="stylesheet" href="{{ $style.Permalink }}">

View File

@@ -1,2 +1,2 @@
{{ $style := resources.Get "scss/gmachine.scss" | resources.ToCSS | resources.Minify }}
{{ $style := resources.Get "scss/gmachine.scss" | css.Sass | resources.Minify }}
<link rel="stylesheet" href="{{ $style.Permalink }}">

View File

@@ -0,0 +1 @@
<object type="image/svg+xml" data="{{ .Get 0 }}" aria-label="{{ .Get 1 }}"></object>

View File

@@ -0,0 +1 @@
<button class="mt-sound-play-button" data-sound-info="{{ .Get 0 }}">Play</button>

View File

@@ -1,2 +1,2 @@
{{ $style := resources.Get "scss/stack.scss" | resources.ToCSS | resources.Minify }}
{{ $style := resources.Get "scss/stack.scss" | css.Sass | resources.Minify }}
<link rel="stylesheet" href="{{ $style.Permalink }}">

View File

@@ -3,7 +3,7 @@
<html lang="{{ .Site.Language.Lang }}">
{{- partial "head.html" . -}}
<body>
{{ $voidcss := resources.Get "scss/thevoid.scss" | resources.ToCSS | resources.Minify }}
{{ $voidcss := resources.Get "scss/thevoid.scss" | css.Sass | resources.Minify }}
<link rel="stylesheet" href="{{ $voidcss.Permalink }}">
{{- partial "header.html" . -}}
<div class="container"><hr class="header-divider"></div>