---
title: "Some Music Theory From (Computational) First Principles"
date: 2025-09-20T18:36:28-07:00
draft: true
filters: ["./to-parens.lua"]
custom_js: ["playsound.js"]
---
Sound is a perturbation in air pressure that our ear recognizes and interprets.
A note, which is a fundamental building block of music, is a perturbation
that can be described by a sine wave. Every sine wave has a single, specific
frequency. The frequency of a note determines how it sounds (its
_pitch_). Pitch is a matter of our perception; however, it happens to
correlate with frequency, such that notes with higher frequencies
are perceived as higher pitches.
Let's encode a frequency as a class in Python.
```{python}
#| echo: false
import math
import colorsys
```
```{python}
class Frequency:
    def __init__(self, hz):
        self.hz = hz
```
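
To make the "note as a sine wave" idea concrete, here is a minimal sketch that samples the wave for a given frequency. The helper name `sample_wave`, the unit amplitude, and the sample rate of 8000 samples per second are all assumptions picked for illustration; this is not the code that draws the pictures on this page.

```python
import math

def sample_wave(hz, duration_s, sample_rate=8000):
    """Sample sin(2*pi*hz*t) at evenly spaced times (hypothetical helper)."""
    n = round(duration_s * sample_rate)
    return [math.sin(2 * math.pi * hz * i / sample_rate) for i in range(n)]

# One period of a 100 Hz wave lasts 1/100 of a second,
# which is 80 samples at 8000 samples per second.
samples = sample_wave(100, 0.01)
```

A quarter of the way through the period (`samples[20]`) the wave is at its peak, and halfway through (`samples[40]`) it crosses zero again.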
```{python}
#| echo: false
def points_to_polyline(points, color):
    return """<polyline fill="none" stroke="{color}"
    stroke-width="4"
    points="{points_str}" />""".format(
        color=color,
        points_str=" ".join(f"{x},{y}" for x, y in points)
    )

def wrap_svg(inner_svg, width, height):
    return f"""<svg xmlns="http://www.w3.org/2000/svg"
    width="{width}" height="{height}"
    viewBox="0 0 {width} {height}">
    {inner_svg}
    </svg>"""

OVERLAY_HUE = 0

class Superimpose:
    def __init__(self, *args):
        global OVERLAY_HUE
        self.args = args
        self.color = hex_from_hsv(OVERLAY_HUE, 1, 1)
        # step the hue by the golden ratio so successive overlays get distinct colors
        OVERLAY_HUE = (OVERLAY_HUE + 1.618033988) % 1

    def points(self, width, height):
        if not self.args:
            # nothing to superimpose: a flat line through the middle
            return [(x, height/2) for x in range(width)]
        # work with offsets from the center line and add the waves point by point
        first_points = [(x, y - height/2) for (x, y) in self.args[0].points(width, height)]
        for thing in self.args[1:]:
            other_points = thing.points(width, height)
            for (i, ((x1, y1), (x2, y2))) in enumerate(zip(first_points, other_points)):
                assert x1 == x2
                first_points[i] = (x1, y1 + (y2 - height/2))
        # normalize
        max_y = max(abs(y) for x, y in first_points)
        if max_y > height / 2:
            first_points = [(x, y * (0.9 * height / 2) / max_y) for x, y in first_points]
        return [(x, height/2 + y) for x, y in first_points]

    def get_color(self):
        return self.color

    def _repr_svg_(self):
        width = 720
        height = 100
        points = self.points(width, height)
        return wrap_svg(
            points_to_polyline(points, self.get_color()),
            width, height
        )

def hugo_shortcode(body):
    return "{{" + "< " + body + " >" + "}}"

class PlayNotes:
    def __init__(self, *hzs):
        self.hzs = hzs

    def _repr_html_(self):
        toplay = ",".join([str(hz) for hz in self.hzs])
        return hugo_shortcode(f"playsound \"{toplay}\"")

class VerticalStack:
    def __init__(self, *args):
        self.args = args

    def _repr_svg_(self):
        width = 720
        height = 100
        buffer = 10
        polylines = []
        for (i, arg) in enumerate(self.args):
            offset = i * (height + buffer)
            points = [(x, y+offset) for (x,y) in arg.points(width, height)]
            polylines.append(points_to_polyline(points, arg.get_color()))
        return wrap_svg(
            "".join(polylines),
            width, len(self.args) * (height + buffer)
        )

def hex_from_hsv(h, s, v):
    r, g, b = colorsys.hsv_to_rgb(h, s, v)
    return f"#{int(r*255):02x}{int(g*255):02x}{int(b*255):02x}"

def color_for_frequency(hz):
    # fold the frequency into the octave that starts at middle C,
    # so that notes in the same pitch class get the same hue
    while hz < 261.63:
        hz *= 2
    while hz > 523.25:
        hz /= 2
    hue = (math.log2(hz / 261.63) * 360)
    return hex_from_hsv(hue / 360, 1, 1)

def Frequency_points(self, width=720, height=100):
    # let 261.63 Hz be 5 periods in the width
    points = []
    period = width / 5
    for x in range(width):
        y = 0.9 * height / 2 * math.sin(x/period * self.hz / 261.63 * 2 * math.pi)
        points.append((x, height/2 - y))
    return points

def Frequency_get_color(self):
    return color_for_frequency(self.hz)

def Frequency_repr_svg_(self):
    # the container on the blog is 720 pixels wide. Use that.
    width = 720
    height = 100
    points = self.points(width, height)
    return wrap_svg(
        points_to_polyline(points, self.get_color()),
        width, height
    )

# attach the drawing helpers to the Frequency class defined earlier
Frequency.points = Frequency_points
Frequency.get_color = Frequency_get_color
Frequency._repr_svg_ = Frequency_repr_svg_
```
Let's take a look at a particular frequency. For reasons that are historical
and not particularly interesting to me, this frequency is called "middle C".
```{python}
middleC = Frequency(261.63)
middleC
```
```{python}
#| echo: false
PlayNotes(middleC.hz)
```
Great! Now, if you're a composer, you can play this note and make music out
of it. Except, music made with just one note is a bit boring, just like saying
the same word over and over again won't make for an interesting story.
No big deal -- we can construct a whole variety of notes by picking any
other frequency.
```{python}
g4 = Frequency(392.445)
g4
```
```{python}
#| echo: false
PlayNotes(g4.hz)
```
```{python}
fSharp4 = Frequency(370.000694) # we write this F#
fSharp4
```
```{python}
#| echo: false
PlayNotes(fSharp4.hz)
```
This is pretty cool. You can start making melodies with these notes, and sing
some jingles. However, if your friend sings along with you, and happens to
sing F# while you're singing the middle C, it's going to sound pretty awful.
So awful does it sound that somewhere around the 18th century, people started
calling it _diabolus in musica_ (the devil in music).
Why does it sound so bad? Let's take a look at the
{{< sidenote "right" "superposition-note" "superposition" >}}
When waves combine, they follow the principle of superposition. One way
to explain this is that their graphs are added to each other. In practice,
what this means is that two peaks in the same spot combine to a larger
peak, as do two troughs; on the other hand, a peak and a trough "cancel out"
and produce a "flatter" line.
{{< /sidenote >}} of these two
notes, which is what happens when they are played at the same time. For reasons I'm
going to explain later, I will multiply each frequency by 4. These frequencies
still sound bad together, but playing them higher lets me "zoom out" and
show you the bigger picture.
```{python}
Superimpose(Frequency(middleC.hz*4), Frequency(fSharp4.hz*4))
```
```{python}
#| echo: false
PlayNotes(middleC.hz, fSharp4.hz)
```
Looking at this picture, we can see that it's far more disordered than the
pure sine waves we've been looking at so far. There's not much of a pattern
to the peaks. This is interpreted by our brain as unpleasant.
{{< dialog >}}
{{< message "question" "reader" >}}
So there's no fundamental reason why these notes sound bad together?
{{< /message >}}
{{< message "answer" "Daniel" >}}
That's right. We might objectively characterize the combination of these
two notes as having a less clear periodicity, but that doesn't mean
that fundamentally it should sound bad. Whether two notes sound good or bad
together is a purely human judgement.
{{< /message >}}
{{< /dialog >}}
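
The superposition described above is just pointwise addition of the two waves. Here is a minimal sketch of that arithmetic, assuming unit-amplitude sine waves; `superimpose_at` is a hypothetical helper, not the `Superimpose` class used for the pictures on this page.

```python
import math

def superimpose_at(t, *hzs):
    """Sum of unit-amplitude sine waves with the given frequencies at time t (seconds)."""
    return sum(math.sin(2 * math.pi * hz * t) for hz in hzs)

# Two peaks in the same spot combine into a larger peak:
# a quarter period into a 1 Hz wave, two copies of it add up to 2.
both_peaks = superimpose_at(0.25, 1, 1)

# A peak and a trough cancel out: at t = 0.25 a 1 Hz wave is at a peak (+1)
# while a 3 Hz wave is at a trough (-1).
peak_and_trough = superimpose_at(0.25, 1, 3)
```

The frequencies of 1 Hz and 3 Hz are far too low to hear; they are chosen only so that the peaks and troughs land on round numbers.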
If picking two notes whose frequencies don't combine into a nice pattern
makes for a bad sound, then to make a good sound we ought to pick two notes
whose frequencies *do* combine into a nice pattern.
Playing the same frequency twice at the same time certainly will do it,
because both waves will have the exact same peaks and troughs.
```{python}
Superimpose(middleC, middleC)
```
In fact, this is just like playing one note, but louder. The fact of the matter
is that *playing any other frequency will mean that not all extremes of the graph align*.
We'll get graphs that are at least a little bit wonky. Intuitively, let's say
that a wonky-ish graph has a nice pattern when it repeats quickly. That way,
there's less time for the graph to do other, unpredictable things.
What's the soonest we can get our combined graph to repeat? It can't
repeat any sooner than either one of the individual notes --- how could it?
We can work with that, though. If we make one note have exactly twice
the frequency of the other, then exactly at the moment the less frequent
note completes its first repetition, the more frequent note will complete
its second. That puts us right back where we started. Here's what this looks
like graphically:
```{python}
twiceMiddleC = Frequency(middleC.hz*2)
VerticalStack(
    middleC,
    twiceMiddleC,
    Superimpose(middleC, twiceMiddleC)
)
```
```{python}
#| echo: false
PlayNotes(middleC.hz, twiceMiddleC.hz)
```
You can easily inspect the new graph to verify that it has a repeating pattern,
and that this pattern repeats exactly as frequently as the lower-frequency
note at the top. Indeed, these two notes sound quite good together. It turns
out, our brains consider them the same in some sense. If you have ever tried to
sing a song that was outside of your range (like me singing along to Taylor Swift),
chances are you sang notes that had half the frequency of the original.
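
The claim that the combined wave repeats exactly as often as the lower note can also be checked numerically rather than by eye. The sketch below is an illustration under assumptions: `combined` and `repeats_after` are hypothetical helpers, the waves have unit amplitude, and the comparison uses a small tolerance to absorb floating-point rounding.

```python
import math

def combined(t, hz_a, hz_b):
    """Two unit-amplitude sine waves played together, sampled at time t."""
    return math.sin(2 * math.pi * hz_a * t) + math.sin(2 * math.pi * hz_b * t)

def repeats_after(shift, hz_a, hz_b, samples=1000, tol=1e-9):
    """True if shifting the combined wave by `shift` seconds leaves it
    unchanged at every sampled time within one `shift`."""
    return all(
        abs(combined(t, hz_a, hz_b) - combined(t + shift, hz_a, hz_b)) < tol
        for t in (shift * i / samples for i in range(samples))
    )

low = 261.63        # middle C
period = 1 / low    # time for one repetition of the lower note

# The pair repeats after one period of the lower note...
octave_repeats = repeats_after(period, low, 2 * low)
# ...but not after half of it, even though the higher note alone would.
too_soon = repeats_after(period / 2, low, 2 * low)
```

Here `octave_repeats` comes out `True` and `too_soon` comes out `False`, which matches the reasoning that the combination can't repeat sooner than either individual note.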
We say that these notes are _in the same pitch class_. While only the first
of the two notes I showed above was the _middle_ C, we call both notes C.
To distinguish different-frequency notes of the same pitch class, we sometimes
number them. The ones in this example were C4 and C5.
We can keep applying this trick to get C6, C7, and so on.
```{python}
C = {}
note = middleC
for i in range(4, 10):
    C[i] = note
    note = Frequency(note.hz * 2)
C[4]
```
To get C3 from C4, we do the reverse, and halve the frequency.
```{python}
note = middleC
for i in range(4, 0, -1):
    C[i] = note
    note = Frequency(note.hz / 2)
C[1]
```
You might've already noticed, but I set up this page so that individual
sine waves in the same pitch class have the same color.
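
Since notes in the same pitch class differ only by repeated doubling or halving, one way to test for it is to check whether the base-2 logarithm of the frequency ratio is a whole number. This is a sketch under that assumption; `same_pitch_class` and its tolerance are hypothetical, and it is not the coloring code used on this page.

```python
import math

def same_pitch_class(hz_a, hz_b, tol=1e-9):
    """True if hz_b can be reached from hz_a by doubling or halving some number of times."""
    octaves = math.log2(hz_b / hz_a)
    return abs(octaves - round(octaves)) < tol

c4_and_c7 = same_pitch_class(261.63, 261.63 * 8)   # three doublings apart
c4_and_g4 = same_pitch_class(261.63, 392.445)      # ratio of 3/2, not a power of two
```

The first check succeeds and the second fails, because $\log_2(3/2)$ is not a whole number.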
All of this puts us almost right back where we started. We might have
different pitches, but we've only got one pitch _class_. Let's try again.
Previously, we made it so the second repeat of one note lined up with the
first repeat of another. What if we pick another note that repeats _three_
times as often instead?
```{python}
thriceMiddleC = Frequency(middleC.hz*3)
VerticalStack(
    middleC,
    thriceMiddleC,
    Superimpose(middleC, thriceMiddleC)
)
```
```{python}
#| echo: false
PlayNotes(middleC.hz, thriceMiddleC.hz)
```
That's not bad! These two sound good together as well, but they are not
in the same pitch class. There's only one problem: these notes are a bit
far apart in terms of pitch. That `thriceMiddleC` note is really high!
Wait a minute --- weren't we just talking about singing notes that were too
high at half their original frequency? We can do that here. The result is a
note we've already seen:
```{python}
print(thriceMiddleC.hz/2)
print(g4.hz)
```
```{python}
#| echo: false
PlayNotes(middleC.hz, g4.hz)
```
In the end, we got G4 by multiplying our original frequency by $3/2$. What if
we keep applying this process to find more notes? Let's not even worry
about the specific frequencies (like `261.63`) for a moment. We'll start
with a frequency of $1$. This makes our next frequency $3/2$. Taking this
new frequency and again multiplying it by $3/2$, we get $9/4$. But that
again puts us a little bit high: $9/4 > 2$. We can apply our earlier trick
and halve the result, getting $9/8$.
```{python}
from fractions import Fraction

note = Fraction(1, 1)
seen = {note}
while len(seen) < 6:
    # go up by a factor of 3/2, halving whenever we overshoot the octave
    new_note = note * 3 / 2
    if new_note > 2:
        new_note = new_note / 2
    seen.add(new_note)
    note = new_note
```
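
As a sanity check on the loop above, here is the same chain worked by hand. Each arrow multiplies by $3/2$, and any result above $2$ is halved before continuing:

$$
1 \;\to\; \frac{3}{2} \;\to\; \frac{9}{4} \div 2 = \frac{9}{8} \;\to\; \frac{27}{16} \;\to\; \frac{81}{32} \div 2 = \frac{81}{64} \;\to\; \frac{243}{128}
$$

That gives six distinct ratios, which is where the loop stops.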
For an admittedly handwavy reason, let's also throw in one note that we
get from going _backwards_: dividing by $3/2$ instead of multiplying.
This division puts us below our original frequency (at $2/3$), so let's
double it to get $4/3$.
```{python}
# Throw in one more by going *backwards*. More on that in a bit.
# Fraction(2, 3) is exact; Fraction(2/3) would bake in the float's rounding error.
seen.add(Fraction(2, 3) * 2)
fractions = sorted(seen)
fractions
```
```{python}
frequencies = [middleC.hz * float(frac) for frac in fractions]
frequencies
```
```{python}
steps = [frequencies[i+1]/frequencies[i] for i in range(len(frequencies)-1)]
minstep = min(steps)
print([math.log(step)/math.log(minstep) for step in steps])
```
Since peaks and troughs
in the final result arise when peaks and troughs in the individual waves align,
we want to pick two frequencies that align with a nice pattern.
But that raises the question: what determines
how quickly the pattern of two notes played together repeats?
We can look at things geometrically, by thinking about the distance
between two successive peaks in a single note. This is called the
*wavelength* of a wave. Take a wave with some wavelength $w$.
If we start at a peak, then travel a distance of $w$ from where we started,
there ought to be another peak. Arriving at a distance of $2w$ (still counting
from where we started), we'll see another peak. Continuing in the same pattern,
we'll see peaks at distances of $3w$, $4w$, and so on. For a second wave
with wavelength $v$, the same will be true: peaks at $v$, $2v$, $3v$, and so on.
If we travel a distance that happens to be a multiple of both $w$ and $v$,
then we'll have a place where both peaks are present. At that point, the
pattern starts again. Mathematically, we can
state this as follows:
$$
nw = mv,\ \text{for some integers}\ n, m
$$
As we decided above, we'll try to find a combination of wavelengths/frequencies
where the repetition happens early.
__TODO:__ Follow the logic from Digit Sum Patterns
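Treating $w$ and $v$ as whole numbers of some common unit, the number of
repetitions of the wave with wavelength $v$ before the combined pattern
repeats (the smallest possible $m$ above) is: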
$$
k = \frac{w}{\text{gcd}(w, v)}
$$
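
To get a feel for the condition $nw = mv$, here is a small sketch that finds the smallest $n$ and $m$ for a pair of wavelengths. It assumes the wavelengths are whole numbers of some common unit (otherwise the gcd isn't defined), and `first_alignment` is a hypothetical helper name.

```python
from math import gcd

def first_alignment(w, v):
    """Smallest positive integers (n, m) with n * w == m * v."""
    g = gcd(w, v)
    return v // g, w // g

# Doubling the frequency halves the wavelength: w = 2, v = 1.
# One repetition of the longer wave lines up with two of the shorter one.
octave = first_alignment(2, 1)

# Frequencies in a 3:2 ratio have wavelengths in a 2:3 ratio: w = 3, v = 2.
three_to_two = first_alignment(3, 2)
```

These come out as `(1, 2)` and `(2, 3)` respectively. The second component of each pair is $w / \text{gcd}(w, v)$, the quantity in the formula above.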
```{python}
Superimpose(Frequency(261.63), Frequency(261.63*2))
```
```{python}
Superimpose(Frequency(261.63*4), Frequency(392.445*4))
```
```{python}
VerticalStack(
    Frequency(440*8),
    Frequency(450*8),
    Superimpose(Frequency(440*8), Frequency(450*8))
)
```
```{python}
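from dataclasses import dataclass
from enum import Enum

# Assumption: this draft never defines Interval. This minimal stand-in just
# wraps a number of semitones so that the arithmetic below can run.
@dataclass(frozen=True)
class Interval:
    value: int

class Note(Enum):
    C = 0
    Cs = 1
    D = 2
    Ds = 3
    E = 4
    F = 5
    Fs = 6
    G = 7
    Gs = 8
    A = 9
    As = 10
    B = 11

    def __add__(self, other):
        # move up by `other.value` semitones, wrapping around the 12 pitch classes
        return Note((self.value + other.value) % 12)

    def __sub__(self, other):
        # the number of semitones from `other` up to `self`
        return Interval((self.value - other.value + 12) % 12)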
```
```{python, echo=false}
class MyClass:
    def _repr_svg_(self):
        return """<svg xmlns="http://www.w3.org/2000/svg"
        width="120" height="120" viewBox="0 0 120 120">
        <circle cx="60" cy="60" r="50" fill="red"/>
        </svg>"""

MyClass()
```