---
title: "Some Music Theory From (Computational) First Principles"
date: 2025-09-20T18:36:28-07:00
draft: true
filters: ["./to-parens.lua"]
custom_js: ["playsound.js"]
---

Sound is a perturbation in air pressure that our ear recognizes and interprets. A note, which is a fundamental building block of music, is a perturbation that can be described by a sine wave. Every sine wave has a single, specific frequency. The frequency of a note determines how it sounds (its _pitch_). Pitch is a matter of our perception; however, it happens to correlate with frequency, such that notes with higher frequencies are perceived as higher pitches.

Let's encode a frequency as a class in Python.

```{python}
#| echo: false
import math
import colorsys
```

```{python}
class Frequency:
    def __init__(self, hz):
        self.hz = hz
```

```{python}
#| echo: false
def points_to_polyline(points, color):
    return """""".format(
        color=color,
        points_str=" ".join(f"{x},{y}" for x, y in points)
    )

def wrap_svg(inner_svg, width, height):
    return f""" {inner_svg} """

OVERLAY_HUE = 0

class Superimpose:
    def __init__(self, *args):
        global OVERLAY_HUE
        self.args = args
        self.color = hex_from_hsv(OVERLAY_HUE, 1, 1)
        OVERLAY_HUE = (OVERLAY_HUE + 1.618033988) % 1

    def points(self, width, height):
        points = []
        if not self.args:
            return [0 for _ in range(width)]

        first_points = [(x, y - height/2) for (x, y) in self.args[0].points(width, height)]
        for thing in self.args[1:]:
            other_points = thing.points(width, height)
            for (i, ((x1, y1), (x2, y2))) in enumerate(zip(first_points, other_points)):
                assert(x1 == x2)
                first_points[i] = (x1, y1 + (y2 - height/2))

        # normalize
        max_y = max(abs(y) for x, y in first_points)
        if max_y > height / 2:
            first_points = [(x, y * (0.9 * height / 2) / max_y) for x, y in first_points]

        return [(x, height/2 + y) for x, y in first_points]

    def get_color(self):
        return self.color

    def _repr_svg_(self):
        width = 720
        height = 100
        points = self.points(width, height)
        return wrap_svg(
            points_to_polyline(points,
            self.get_color()), width, height
        )

def hugo_shortcode(body):
    return "{{" + "< " + body + " >" + "}}"

class PlayNotes:
    def __init__(self, *hzs):
        self.hzs = hzs

    def _repr_html_(self):
        toplay = ",".join([str(hz) for hz in self.hzs])
        return hugo_shortcode(f"playsound \"{toplay}\"")

class VerticalStack:
    def __init__(self, *args):
        self.args = args

    def _repr_svg_(self):
        width = 720
        height = 100
        buffer = 10
        polylines = []
        for (i, arg) in enumerate(self.args):
            offset = i * (height + buffer)
            points = [(x, y+offset) for (x,y) in arg.points(width, height)]
            polylines.append(points_to_polyline(points, arg.get_color()))
        return wrap_svg(
            "".join(polylines),
            width, len(self.args) * (height + buffer)
        )

def hex_from_hsv(h, s, v):
    r, g, b = colorsys.hsv_to_rgb(h, s, v)
    return f"#{int(r*255):02x}{int(g*255):02x}{int(b*255):02x}"

def color_for_frequency(hz):
    # fold the frequency into the octave starting at middle C
    while hz < 261.63:
        hz *= 2
    while hz > 523.25:
        hz /= 2

    hue = (math.log2(hz / 261.63) * 360)
    return hex_from_hsv(hue / 360, 1, 1)

def Frequency_points(self, width=720, height=100):
    # let 261.63 Hz be 5 periods in the width
    points = []
    period = width / 5
    for x in range(width):
        y = 0.9 * height / 2 * math.sin(x/period * self.hz / 261.63 * 2 * math.pi)
        points.append((x, height/2 - y))
    return points

def Frequency_get_color(self):
    return color_for_frequency(self.hz)

def Frequency_repr_svg_(self):
    # the container on the blog is 720 pixels wide. Use that.
    width = 720
    height = 100
    points = self.points(width, height)
    return wrap_svg(
        points_to_polyline(points, self.get_color()),
        width, height
    )

Frequency.points = Frequency_points
Frequency.get_color = Frequency_get_color
Frequency._repr_svg_ = Frequency_repr_svg_
```

Let's take a look at a particular frequency. For reasons that are historical and not particularly interesting to me, this frequency is called "middle C".

```{python}
middleC = Frequency(261.63)
middleC
```

```{python}
#| echo: false
PlayNotes(middleC.hz)
```

Great!
Now, if you're a composer, you can play this note and make music out of it. Except, music made with just one note is a bit boring, just like saying the same word over and over again won't make for an interesting story. No big deal -- we can construct a whole variety of notes by picking any other frequency.

```{python}
g4 = Frequency(392.445)
g4
```

```{python}
#| echo: false
PlayNotes(g4.hz)
```

```{python}
fSharp4 = Frequency(370.000694) # we write this F#
fSharp4
```

```{python}
#| echo: false
PlayNotes(fSharp4.hz)
```

This is pretty cool. You can start making melodies with these notes, and sing some jingles. However, if your friend sings along with you, and happens to sing F# while you're singing the middle C, it's going to sound pretty awful. So awful does it sound that somewhere around the 18th century, people started calling it _diabolus in musica_ (the devil in music).

Why does it sound so bad? Let's take a look at the
{{< sidenote "right" "superposition-note" "superposition" >}}
When waves combine, they follow the principle of superposition. One way to explain this is that their graphs are added to each other. In practice, what this means is that two peaks in the same spot combine to a larger peak, as do two troughs; on the other hand, a peak and a trough "cancel out" and produce a "flatter" line.
{{< /sidenote >}}
of these two notes, which is what happens when they are played at the same time. For reasons I'm going to explain later, I will multiply each frequency by 4. These frequencies still sound bad together, but playing them higher lets me "zoom out" and show you the bigger picture.

```{python}
Superimpose(Frequency(middleC.hz*4), Frequency(fSharp4.hz*4))
```

```{python}
#| echo: false
PlayNotes(middleC.hz, fSharp4.hz)
```

Looking at this picture, we can see that it's far more disordered than the pure sine waves we've been looking at so far. There's not much of a pattern to the peaks. Our brains interpret this as unpleasant.
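
The "graphs are added to each other" idea from the sidenote can also be sketched numerically, without any drawing. This is a minimal standalone sketch, not the plotting code used on this page: the helper names `sample_sine` and `superimpose` and the sample rate of 44100 are assumptions made up for the illustration.

```python
import math

def sample_sine(hz, n_samples, sample_rate):
    """Sample sin(2*pi*hz*t) at n_samples evenly spaced moments in time."""
    return [math.sin(2 * math.pi * hz * i / sample_rate) for i in range(n_samples)]

def superimpose(*waves):
    """Add the waves point by point (the principle of superposition)."""
    return [sum(values) for values in zip(*waves)]

sample_rate = 44100  # hypothetical; any rate well above both frequencies works
c = sample_sine(261.63, 1000, sample_rate)
f_sharp = sample_sine(370.000694, 1000, sample_rate)
combined = superimpose(c, f_sharp)

# Two peaks can add up to at most 2; a peak and a trough pull toward 0.
print(max(combined), min(combined))
```

Each entry of `combined` is just the sum of the two individual samples at that moment, which is all that superposition means here.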
{{< dialog >}}
{{< message "question" "reader" >}}
So there's no fundamental reason why these notes sound bad together?
{{< /message >}}
{{< message "answer" "Daniel" >}}
That's right. We might objectively characterize the combination of these two notes as having a less clear periodicity, but that doesn't mean that fundamentally it should sound bad. Whether two notes sound good or bad together is a purely human judgement.
{{< /message >}}
{{< /dialog >}}

If picking two notes whose frequencies don't combine into a nice pattern makes for a bad sound, then to make a good sound we ought to pick two notes whose frequencies *do* combine into a nice pattern. Playing the same frequency twice at the same time certainly will do it, because both waves will have the exact same peaks and troughs.

```{python}
Superimpose(middleC, middleC)
```

In fact, this is just like playing one note, but louder. The fact of the matter is that *playing any other frequency will mean that not all extremes of the graph align*. We'll get graphs that are at least a little bit wonky. Intuitively, let's say that a wonky-ish graph has a nice pattern when it repeats quickly. That way, there's less time for the graph to do other, unpredictable things.

What's the soonest we can get our combined graph to repeat? It can't repeat any sooner than either one of the individual notes --- how could it? We can work with that, though. If we make one note have exactly twice the frequency of the other, then exactly at the moment the less frequent note completes its first repetition, the more frequent note will complete its second. That puts us right back where we started.
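
The doubling claim can be checked numerically before drawing anything. The sketch below is a rough check under stated assumptions: `combined` and `repeats_after` are hypothetical helper names, and the tolerance only accounts for floating-point error. It compares the combined wave against itself one period of the lower note later.

```python
import math

def combined(t, f1, f2):
    """Value of two superimposed sine waves at time t (in seconds)."""
    return math.sin(2 * math.pi * f1 * t) + math.sin(2 * math.pi * f2 * t)

def repeats_after(period, f1, f2, n_checks=200, tol=1e-9):
    """True if the combined wave has the same value one `period` later,
    at every one of n_checks sample times within the first period."""
    for i in range(n_checks):
        t = i * period / n_checks
        if abs(combined(t, f1, f2) - combined(t + period, f1, f2)) > tol:
            return False
    return True

base = 261.63
doubled_repeats = repeats_after(1 / base, base, 2 * base)
f_sharp_repeats = repeats_after(1 / base, base, 370.000694)
print(doubled_repeats, f_sharp_repeats)
```

With the doubled frequency the combined wave repeats after a single period of the lower note; with the F# from earlier it does not.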
Here's what this looks like graphically:

```{python}
twiceMiddleC = Frequency(middleC.hz*2)
VerticalStack(
    middleC,
    twiceMiddleC,
    Superimpose(middleC, twiceMiddleC)
)
```

```{python}
#| echo: false
PlayNotes(middleC.hz, twiceMiddleC.hz)
```

You can easily inspect the new graph to verify that it has a repeating pattern, and that this pattern repeats exactly as frequently as the lower-frequency note at the top.

Indeed, these two notes sound quite good together. It turns out, our brains consider them the same in some sense. If you have ever tried to sing a song that was outside of your range (like me singing along to Taylor Swift), chances are you sang notes that had half the frequency of the original. We say that these notes are _in the same pitch class_. While only the first of the two notes I showed above was the _middle_ C, we call both notes C. To distinguish different-frequency notes of the same pitch class, we sometimes number them. The ones in this example were C4 and C5.

We can keep applying this trick to get C6, C7, and so on.

```{python}
C = {}
note = middleC
for i in range(4, 10):
    C[i] = note
    note = Frequency(note.hz * 2)

C[4]
```

To get C3 from C4, we do the reverse, and halve the frequency.

```{python}
note = middleC
for i in range(4, 0, -1):
    C[i] = note
    note = Frequency(note.hz / 2)

C[1]
```

You might've already noticed, but I set up this page so that individual sine waves in the same pitch class have the same color.

All of this puts us almost right back where we started. We might have different pitches, but we've only got one pitch _class_. Let's try again. Previously, we made it so the second repeat of one note lined up with the first repeat of another. What if we pick another note that repeats _three_ times as often instead?

```{python}
thriceMiddleC = Frequency(middleC.hz*3)
VerticalStack(
    middleC,
    thriceMiddleC,
    Superimpose(middleC, thriceMiddleC)
)
```

```{python}
#| echo: false
PlayNotes(middleC.hz, thriceMiddleC.hz)
```

That's not bad!
These two sound good together as well, but they are not in the same pitch class. There's only one problem: these notes are a bit far apart in terms of pitch. That `thriceMiddleC` note is really high! Wait a minute --- weren't we just talking about singing notes that were too high at half their original frequency? We can do that here. The result is a note we've already seen:

```{python}
print(thriceMiddleC.hz/2)
print(g4.hz)
```

```{python}
#| echo: false
PlayNotes(middleC.hz, g4.hz)
```

In the end, we got G4 by multiplying our original frequency by $3/2$. What if we keep applying this process to find more notes? Let's not even worry about the specific frequencies (like `261.63`) for a moment. We'll start with a frequency of $1$. This makes our next frequency $3/2$. Taking this new frequency and again multiplying it by $3/2$, we get $9/4$. But that again puts us a little bit high: $9/4 > 2$. We can apply our earlier trick and halve the result, getting $9/8$.

```{python}
from fractions import Fraction

note = Fraction(1, 1)
seen = {note}
while len(seen) < 6:
    # go up by a factor of 3/2...
    new_note = note * 3 / 2
    # ...and halve the result if it overshoots double the starting frequency
    if new_note > 2:
        new_note = new_note / 2

    seen.add(new_note)
    note = new_note
```

For an admittedly handwavy reason, let's also throw in one note that we get from going _backwards_: dividing by $3/2$ instead of multiplying. This division gives $2/3$, which puts us below our original frequency, so let's double it.

```{python}
# Throw in one more by going *backwards*. More on that in a bit.
# Build the fraction from integers; Fraction(2/3) would convert an inexact float.
seen.add(Fraction(2, 3) * 2)
fractions = sorted(list(seen))
fractions
```

```{python}
frequencies = [middleC.hz * float(frac) for frac in fractions]
frequencies
```

```{python}
steps = [frequencies[i+1]/frequencies[i] for i in range(len(frequencies)-1)]
minstep = min(steps)
print([math.log(step)/math.log(minstep) for step in steps])
```

Since peaks and troughs in the final result arise when peaks and troughs in the individual waves align, we want to pick two frequencies that align with a nice pattern.
But that raises the question: what determines how quickly the pattern of two notes played together repeats?

We can look at things geometrically, by thinking about the distance between two successive peaks in a single note. This is called the *wavelength* of a wave. Take a wave with some wavelength $w$. If we start at a peak, then travel a distance of $w$ from where we started, there ought to be another peak. Arriving at a distance of $2w$ (still counting from where we started), we'll see another peak. Continuing in the same pattern, we'll see peaks at distances of $3w$, $4w$, and so on. For a second wave with wavelength $v$, the same will be true: peaks at $v$, $2v$, $3v$, and so on.

If we travel a distance that happens to be a multiple of both $w$ and $v$, then we'll have a place where both peaks are present. At that point, the pattern starts again. Mathematically, we can state this as follows:

$$
nw = mv,\ \text{for some integers}\ n, m
$$

As we decided above, we'll try to find a combination of wavelengths/frequencies where the repetition happens early.

__TODO:__ Follow the logic from Digit Sum Patterns

Assuming $w$ and $v$ are whole numbers in some common unit, the number of iterations of the second wave (the one with wavelength $v$) before we reach a cycle is:

$$
k = \frac{w}{\text{gcd}(w, v)}
$$

```{python}
Superimpose(Frequency(261.63), Frequency(261.63*2))
```

```{python}
Superimpose(Frequency(261.63*4), Frequency(392.445*4))
```

```{python}
VerticalStack(
    Frequency(440*8),
    Frequency(450*8),
    Superimpose(Frequency(440*8), Frequency(450*8))
)
```

```{python}
from enum import Enum

class Note(Enum):
    C = 0
    Cs = 1
    D = 2
    Ds = 3
    E = 4
    F = 5
    Fs = 6
    G = 7
    Gs = 8
    A = 9
    As = 10
    B = 11

    def __add__(self, other):
        return Note((self.value + other.value) % 12)

    def __sub__(self, other):
        return Interval((self.value - other.value + 12) % 12)
```

```{python, echo=false}
class MyClass:
    def _repr_svg_(self):
        return """ """

MyClass()
```
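
The condition $nw = mv$ can be explored with exact fractions. Since wavelength is inversely proportional to frequency, a frequency ratio of $p/q$ in lowest terms means the smallest solution has the lower note completing $q$ cycles while the higher note completes $p$. The sketch below assumes this reading; `cycles_until_repeat` is a hypothetical helper name, not something defined elsewhere on this page.

```python
from fractions import Fraction

def cycles_until_repeat(ratio):
    """Given the frequency ratio higher/lower of two notes, return
    (cycles of the lower note, cycles of the higher note) that pass
    before the combined pattern repeats."""
    ratio = Fraction(ratio)  # Fraction always stores lowest terms
    return ratio.denominator, ratio.numerator

octave = cycles_until_repeat(Fraction(2, 1))
fifth = cycles_until_repeat(Fraction(3, 2))
third = cycles_until_repeat(Fraction(81, 64))
print(octave, fifth, third)
```

The ratios $2/1$ and $3/2$ repeat after one and two cycles of the lower note respectively, while a ratio like $81/64$ takes 64 cycles, which matches the intuition that simpler ratios make for quicker repetition.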