---
title: Digit Sum Patterns and Modular Arithmetic
date: 2021-12-30T15:42:40-08:00
tags: ["Ruby", "Mathematics"]
description: "In this article, we explore the patterns created by remainders from division."
---

When I was in elementary school, our class was briefly visited by our school's headmaster. He was there for a demonstration, probably intended to get us to practice our multiplication tables. "Pick a number," he said, "and I'll teach you how to draw a pattern from it."

The procedure was rather simple:

  1. Pick a number between 2 and 8 (inclusive).
  2. Start generating positive multiples of this number. If you picked 8, your multiples would be 8, 16, 24, and so on.
  3. If a multiple is more than one digit long, sum its digits. For instance, for 16, write 1+6=7. If the digits add up to a number that's still more than 1 digit long, add up the digits of that number (and so on).
  4. Start drawing on a grid. For each resulting number, draw that many squares in one direction, and then "turn". Using 8 as our example, we could draw 8 up, 7 to the right, 6 down, 5 to the left, and so on.
  5. As soon as you come back to where you started ("And that will always happen", said my headmaster), you're done. You should have drawn a pretty pattern!

Sticking with our example of 8, the pattern you'd end up with would be something like this:

{{< figure src="pattern_8.svg" caption="Pattern generated by the number 8." class="tiny" alt="Pattern generated by the number 8." >}}

Before we go any further, let's observe that it's not too hard to write code to do this. For instance, the "add digits" algorithm can be naively written by turning the number into a string (17 becomes "17"), splitting that string into characters ("17" becomes ["1", "7"]), turning each of these characters back into a number (the array becomes [1, 7]), and then computing the sum of the array, leaving 8.

{{< codelines "Ruby" "patterns/patterns.rb" 3 8 >}}
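For readers who don't have the source file open, here is a minimal sketch of the string-based approach just described. The name `sum_digits_naive` is my own placeholder and the body is an assumption about what such a function might look like, not a copy of `patterns.rb`.

```ruby
# Repeatedly sum the decimal digits of n until a single digit remains.
# Hypothetical helper; the article's real version lives in patterns.rb.
def sum_digits_naive(n)
  # 17 -> "17" -> ["1", "7"] -> [1, 7] -> 8
  n = n.to_s.chars.map(&:to_i).sum while n > 9
  n
end

puts sum_digits_naive(16)  # 1 + 6 = 7
puts sum_digits_naive(88)  # 8 + 8 = 16, then 1 + 6 = 7
```

The loop only runs while the number has more than one digit, which is the "and so on" part of step 3.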

We may now encode the "drawing" logic. At any point, there's a "direction" we're going - which I'll denote by the Ruby symbols :top, :bottom, :left, and :right. Each step, we take the current x,y coordinates (our position on the grid), and shift them by n in a particular direction dir. We also return the new direction alongside the new coordinates.

{{< codelines "Ruby" "patterns/patterns.rb" 10 21 >}}

The top-level algorithm is captured by the following code, which produces a list of coordinates in the order that you'd visit them.

{{< codelines "Ruby" "patterns/patterns.rb" 23 35 >}}
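Putting the three pieces together, the whole procedure might be sketched roughly as follows. Every name here (`DIRECTIONS`, `digit_root`, `step`, `walk`) and the `max_steps` safety limit are assumptions of mine for illustration; the real code is in `patterns.rb`.

```ruby
# Hypothetical sketch of the whole drawing procedure.
DIRECTIONS = [:top, :right, :bottom, :left].freeze  # clockwise turn order

def digit_root(n)
  n = n.to_s.chars.map(&:to_i).sum while n > 9
  n
end

# Move (x, y) by n squares in direction dir; return the new position
# together with the direction we face after turning once.
def step(x, y, n, dir)
  new_dir = DIRECTIONS[(DIRECTIONS.index(dir) + 1) % DIRECTIONS.size]
  case dir
  when :top    then [x, y + n, new_dir]
  when :right  then [x + n, y, new_dir]
  when :bottom then [x, y - n, new_dir]
  when :left   then [x - n, y, new_dir]
  end
end

# Visit multiples of `number` until the walk is back at the origin.
# max_steps guards against inputs that never come back.
def walk(number, max_steps = 1000)
  x, y, dir = 0, 0, :top
  coords = [[0, 0]]
  (1..max_steps).each do |i|
    x, y, dir = step(x, y, digit_root(number * i), dir)
    coords << [x, y]
    return coords if x == 0 && y == 0
  end
  coords
end

p walk(8).last  # ends at [0, 0]
```

The returned list holds the corners of the pattern in visiting order, which is all a renderer would need to draw the lines.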

I will omit the code for generating SVGs from the body of the article -- you can always find the complete source code in this blog's Git repo (or by clicking the link in the code block above). Let's run the code on a few other numbers. Here's one for 4, for instance:

{{< figure src="pattern_4.svg" caption="Pattern generated by the number 4." class="tiny" alt="Pattern generated by the number 4." >}}

And one more for 2, which I don't find as pretty.

{{< figure src="pattern_2.svg" caption="Pattern generated by the number 2." class="tiny" alt="Pattern generated by the number 2." >}}

It really does always work out! Young me was amazed, though I would often run out of space on my grid paper to complete the pattern, or miscount the length of my lines partway in. It was only recently that I started thinking about why it works, and I think I figured it out. Let's take a look!

Is a number divisible by 3?

You might find the whole "add up the digits of a number" thing familiar, and for good reason: it's one way to check if a number is divisible by 3. The quick summary of this result is,

If the sum of the digits of a number is divisible by 3, then so is the whole number.

For example, the sum of the digits of 72 is 9, which is divisible by 3; 72 itself is correspondingly also divisible by 3, since 24*3=72. On the other hand, the sum of the digits of 82 is 10, which is not divisible by 3; 82 isn't divisible by 3 either (it's one more than 81, which is divisible by 3).
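The two examples above are easy to replay in code. This is only a quick check, with `digit_sum` being a helper name I made up; it relies on Ruby's built-in `Integer#digits`.

```ruby
# Sum of the decimal digits of a non-negative integer.
def digit_sum(n)
  n.digits.sum  # Integer#digits is available in Ruby 2.4 and later
end

[72, 82, 198, 1234].each do |n|
  by_digits = (digit_sum(n) % 3).zero?
  directly  = (n % 3).zero?
  puts "#{n}: digit sum #{digit_sum(n)}, rule says #{by_digits}, division says #{directly}"
end
```

The two answers agree on every line, which is what the rule promises.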

Why does this work? Let's talk remainders.

If a number doesn't cleanly divide another (we're sticking to integers here), what's left behind is the remainder. For instance, dividing 7 by 3 leaves us with a remainder 1. On the other hand, if the remainder is zero, then that means that our dividend is divisible by the divisor (what a mouthful). In mathematics, we typically use \(a|b\) to say \(a\) divides \(b\), or, as we have seen above, that the remainder of dividing \(b\) by \(a\) is zero.

Working with remainders actually comes up pretty frequently in discrete math. A well-known example I'm aware of is the RSA algorithm, which works with remainders resulting from dividing by a product of two large prime numbers. But what's a good way to write, in numbers and symbols, the claim that "\(a\) divides \(b\) with remainder \(r\)"? Well, we know that dividing yields a quotient (possibly zero) and a remainder (also possibly zero). Let's call the quotient \(q\). {{< sidenote "right" "r-less-note" "Then, we know that when dividing b by a we have:" >}} It's important to point out that for the equation in question to represent division with quotient q and remainder r, it must be that r is less than a. Otherwise, you could write r = s + a for some s, and end up with {{< latex >}} \begin{aligned} & b = qa + r \\ \Rightarrow\ & b = qa + (s + a) \\ \Rightarrow\ & b = (q+1)a + s \end{aligned} {{< /latex >}}

In plain English, if r is at least as big as a after you've divided, you haven't taken out "as much a from your dividend as you could", and the actual quotient is larger than q. {{< /sidenote >}}

{{< latex >}} \begin{aligned} & b = qa + r \\ \Rightarrow\ & b-r = qa \end{aligned} {{< /latex >}}

We only really care about the remainder here, not the quotient, since it's the remainder that determines if something is divisible or not. From the form of the second equation, we can deduce that \(b-r\) is divisible by \(a\) (it's literally equal to \(a\) times \(q\), so it must be divisible). Thus, we can write:

{{< latex >}} a|(b-r) {{< /latex >}}

There's another notation for this type of statement, though. To say that the difference between two numbers is divisible by a third number, we write:

{{< latex >}} b \equiv r\ (\text{mod}\ a) {{< /latex >}}

Some things that seem like they would work from this "equation-like" notation do, indeed, work. For instance, we can "add two equations" (I'll omit the proof here; jump down to this section to see how it works):

{{< latex >}} \textbf{if}\ a \equiv b\ (\text{mod}\ k)\ \textbf{and}\ c \equiv d\ (\text{mod}\ k),\ \textbf{then}\ a+c \equiv b+d\ (\text{mod}\ k). {{< /latex >}}

Multiplying both sides by the same number (call it \(n\)) also works (once again, you can find the proof in this section below).

{{< latex >}} \textbf{if}\ a \equiv b\ (\text{mod}\ k),\ \textbf{then}\ na \equiv nb\ (\text{mod}\ k). {{< /latex >}}
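Neither rule needs to be taken on faith; a concrete instance is quick to test. The helper `congruent?` and the sample numbers are my own choices, picked only so that the premises hold.

```ruby
# a is congruent to b modulo k exactly when k divides (a - b).
def congruent?(a, b, k)
  ((a - b) % k).zero?
end

k = 7
a, b = 23, 2   # 23 - 2 = 21 = 3 * 7
c, d = 40, 5   # 40 - 5 = 35 = 5 * 7
n = 6

puts congruent?(a, b, k)          # first premise
puts congruent?(c, d, k)          # second premise
puts congruent?(a + c, b + d, k)  # the two congruences added together
puts congruent?(n * a, n * b, k)  # one congruence scaled by n
```

All four lines print `true`. A single example is not a proof, of course, but it shows what the notation is claiming.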

Ok, that's a lot of notation and other stuff. Let's talk specifics. Of particular interest is the number 10, since our number system is base ten (the value of a digit is multiplied by 10 for every place it moves to the left). The remainder of 10 when dividing by 3 is 1. Thus, we have:

{{< latex >}} 10 \equiv 1\ (\text{mod}\ 3) {{< /latex >}}

From this, we can deduce that multiplying by 10, when it comes to remainders from dividing by 3, is the same as multiplying by 1. We can clearly see this by multiplying both sides by \(n\). In our notation:

{{< latex >}} 10n \equiv n\ (\text{mod}\ 3) {{< /latex >}}

But wait, there's more. Take any power of ten, be it a hundred, a thousand, or a million. Multiplying by that number is also equivalent to multiplying by 1!

{{< latex >}} 10^kn = 10\times10\times...\times 10n \equiv n\ (\text{mod}\ 3) {{< /latex >}}

We can put this to good use. Let's take a large number that's divisible by 3. This number will be made of multiple digits, like \(d_2d_1d_0\). Note that I do not mean multiplication here, but specifically that each \(d_i\) is a number between 0 and 9 in a particular place in the number -- it's a digit. Now, we can write:

{{< latex >}} \begin{aligned} 0 &\equiv d_2d_1d_0 \\ & = 100d_2 + 10d_1 + d_0 \\ & \equiv d_2 + d_1 + d_0 \end{aligned} {{< /latex >}}

We have just found that \(d_2+d_1+d_0 \equiv 0\ (\text{mod}\ 3)\), or that the sum of the digits is also divisible by 3. The logic we use works in the other direction, too: if the sum of the digits is divisible, then so is the actual number.

There's only one property of the number 3 we used for this reasoning: that \(10 \equiv 1\ (\text{mod}\ 3)\). But it so happens that there's another number that has this property: 9. This means that to check if a number is divisible by nine, we can also check if the sum of the digits is divisible by 9. Try it on 18, 27, 81, and 198.
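Here's a small sketch that tries the suggested numbers and also confirms the one property the argument relies on. The helper `from_digits` is a name I introduced; it just rebuilds a number from its digits the way the expansion above does.

```ruby
# Value of a digit list such as [1, 9, 8], most significant digit first.
def from_digits(digits)
  digits.reduce(0) { |acc, d| acc * 10 + d }
end

[[1, 8], [2, 7], [8, 1], [1, 9, 8]].each do |digits|
  n = from_digits(digits)
  puts "#{n}: n mod 9 = #{n % 9}, digit sum mod 9 = #{digits.sum % 9}"
end

# Powers of ten all leave remainder 1, for both 9 and 3.
p((1..6).map { |k| (10**k) % 9 })
p((1..6).map { |k| (10**k) % 3 })
```

Because every power of ten behaves like 1, each digit contributes just itself to the remainder, no matter which place it sits in.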

Here's the main takeaway: summing the digits in the way described by my headmaster is the same as figuring out the remainder of the number from dividing by 9. Well, almost. The difference is the case of 9 itself: the remainder here is 0, but we actually use 9 to draw our line. We can actually try just using 0. Here's the updated sum_digits code:

def sum_digits(n)
    n % 9
end

The results are similarly cool:

{{< figure src="pattern_8_mod.svg" caption="Pattern generated by the number 8." class="tiny" alt="Pattern generated by the number 8 by just using remainders." >}} {{< figure src="pattern_4_mod.svg" caption="Pattern generated by the number 4." class="tiny" alt="Pattern generated by the number 4 by just using remainders." >}} {{< figure src="pattern_2_mod.svg" caption="Pattern generated by the number 2." class="tiny" alt="Pattern generated by the number 2 by just using remainders." >}}
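To pin down the "well, almost" from earlier, we can list exactly where the repeated digit sum and the plain remainder disagree. This is a sketch; `digit_root` is my name for the headmaster's procedure.

```ruby
# The headmaster's procedure: keep summing digits until one digit is left.
def digit_root(n)
  n = n.digits.sum while n > 9
  n
end

mismatches = (1..1000).reject { |n| digit_root(n) == n % 9 }
p mismatches.first(5)                       # [9, 18, 27, 36, 45]
puts mismatches.all? { |n| (n % 9).zero? }  # true
```

The only disagreements are the multiples of 9, where the digit sum gives 9 and the remainder gives 0.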

Sequences of Remainders

So now we know what the digit-summing algorithm is really doing. But that algorithm isn't all there is to it! We're repeatedly applying this algorithm over and over to multiples of another number. How does this work, and why does it always loop around? Why don't we ever spiral farther and farther from the center?

First, let's take a closer look at our sequence of multiples. Suppose we're working with multiples of some number \(n\). Let's write \(a_i\) for the \(i\)th multiple. Then, we end up with:

{{< latex >}} \begin{aligned} a_1 &= n \\ a_2 &= 2n \\ a_3 &= 3n \\ a_4 &= 4n \\ ... \\ a_i &= in \end{aligned} {{< /latex >}}

This is actually called an arithmetic sequence; for each multiple, the number increases by \(n\).

Here's a first seemingly trivial point: at some point, the remainder of \(a_i\) will repeat. There are only so many remainders when dividing by nine: specifically, the only possible remainders are the numbers 0 through 8. We can invoke the pigeonhole principle and say that after 9 multiples, we will have to have looped. Another way of seeing this is as follows:

{{< latex >}} \begin{aligned} & 9 \equiv 0\ (\text{mod}\ 9) \\ \Rightarrow\ & 9n \equiv 0\ (\text{mod}\ 9) \\ \Rightarrow\ & 10n \equiv n\ (\text{mod}\ 9) \end{aligned} {{< /latex >}}

The 10th multiple is equivalent to n, and will thus have the same remainder. The looping may happen earlier: the simplest case is if we pick 9 as our \(n\), in which case the remainder will always be 0.
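Printing the first ten remainders for each starting number makes the looping visible. The helper `remainder_sequence` is a name of my own, used only for this illustration.

```ruby
# Remainders of the first `count` multiples of n when dividing by d.
def remainder_sequence(n, d, count)
  (1..count).map { |i| (i * n) % d }
end

(1..9).each do |n|
  puts "n=#{n}: #{remainder_sequence(n, 9, 10).join(' ')}"
end
```

In every row the tenth entry matches the first, and the row for 9 is all zeros.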

Repeating remainders alone do not guarantee that we will return to the center. The repeating sequence 1,2,3,4 will certainly cause a spiral. The reason is that, if we start facing "up", we will always move up 1 and down 3 after four steps, leaving us 2 steps below where we started. Next, the cycle will repeat, and since turning four times leaves us facing "up" again, we'll end up getting further away. Here's a picture that captures this behavior:

{{< figure src="pattern_1_4.svg" caption="Spiral generated by the number 1 with divisor 4." class="tiny" alt="Spiral generated by the number 1 by summing digits." >}}

And here's one more where the cycle repeats after 8 steps instead of 4. You can see that it also leads to a spiral:

{{< figure src="pattern_1_8.svg" caption="Spiral generated by the number 1 with divisor 8." class="tiny" alt="Spiral generated by the number 1 by summing digits." >}}
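The drift in both pictures can be computed directly. In this sketch, `DIRS` and `positions_after_cycles` are names I made up; the function follows a fixed list of line lengths over and over, turning clockwise after each line, and records where each full pass ends.

```ruby
# Unit vectors for up, right, down, left (clockwise turns).
DIRS = [[0, 1], [1, 0], [0, -1], [-1, 0]].freeze

# Follow `lengths` repeatedly, turning after every segment; return the
# position reached at the end of each full pass through the sequence.
def positions_after_cycles(lengths, cycles)
  x = y = 0
  turn = 0
  (1..cycles).map do
    lengths.each do |len|
      dx, dy = DIRS[turn % 4]
      x += dx * len
      y += dy * len
      turn += 1
    end
    [x, y]
  end
end

p positions_after_cycles([1, 2, 3, 4], 3)     # [[-2, -2], [-4, -4], [-6, -6]]
p positions_after_cycles((1..8).to_a, 2)      # [[-4, -4], [-8, -8]]
```

Each pass adds the same offset again because the heading is unchanged after 4 (or 8) turns, so the path never closes.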

From this, we can devise a simple condition to prevent spiraling -- the length of the sequence before it repeats cannot be a multiple of 4. This way, whenever the cycle restarts, it will do so in a different direction: backwards, turned once to the left, or turned once to the right. Clearly repeating the sequence backwards is guaranteed to take us back to the start. The same is true for the left and right-turn sequences, though it's less obvious. If drawing our sequence once left us turned to the right, drawing our sequence twice will leave us turned more to the right. On a grid, two right turns are the same as turning around. The third repetition will then undo the effects of the first one (since we're facing backwards now), and the fourth will undo the effects of the second.

There is an exception to this multiple-of-4 rule: if a sequence makes it back to the origin right before it starts over. In that case, even if it's facing the very same direction it started with, all is well -- things are just like when it first started, and the cycle repeats. I haven't found a sequence that does this, so for our purposes, we'll stick with avoiding multiples of 4.

Okay, so we want to avoid cycles with lengths divisible by four. What does it mean for a cycle to be of length k? It effectively means the following:

{{< latex >}} \begin{aligned} & a_{k+1} \equiv a_1\ (\text{mod}\ 9) \\ \Rightarrow\ & (k+1)n \equiv n\ (\text{mod}\ 9) \\ \Rightarrow\ & kn \equiv 0\ (\text{mod}\ 9) \end{aligned} {{< /latex >}}

If we could divide both sides by \(k\), we could go one more step:

{{< latex >}} n \equiv 0\ (\text{mod}\ 9) {{< /latex >}}

That is, \(n\) would be divisible by 9! This would contradict our choice of \(n\) to be between 2 and 8. What went wrong? Turns out, it's that last step: we can't always divide by \(k\). Some values of \(k\) are special, and it's only those values that can serve as cycle lengths without causing a contradiction. So, what are they?

They're values that have a common factor with 9 (an incomplete explanation is in this section below). There are many numbers that have a common factor with 9; 3, 6, 9, 12, and so on. However, those can't all serve as cycle lengths: as we said, cycles can't get longer than 9. This leaves us with 3, 6, and 9 as possible cycle lengths, none of which are divisible by 4. We've eliminated the possibility of spirals!
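We can also simply measure the cycle lengths. This brute-force helper (the name `cycle_length` is mine) looks for the smallest positive \(k\) with \(kn \equiv 0\ (\text{mod}\ d)\).

```ruby
# Smallest k > 0 such that k * n is divisible by d, found by brute force.
def cycle_length(n, d)
  (1..d).find { |k| (k * n) % d == 0 }
end

(2..8).each { |n| puts "n=#{n}: cycle length #{cycle_length(n, 9)}" }
```

Searching up to `d` is enough, since `d * n` is always divisible by `d`. None of the lengths printed for divisor 9 is a multiple of 4.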

Generalizing to Arbitrary Divisors

The trick was easily executable on paper because there's an easy way to compute the remainder of a number when dividing by 9 (adding up the digits). However, we have a computer, and we don't need to fall back on such cool-but-complicated techniques. To replicate our original behavior, we can just write:

def sum_digits(n)
  x = n % 9
  x == 0 ? 9 : x
end

But now, we can change the 9 to something else. There are some numbers we'd like to avoid - specifically, we want to avoid those numbers that would allow for cycles of length 4 (or of a length divisible by 4). If we didn't avoid them, we might run into infinite loops, where our pencil might end up moving further and further from the center.

Actually, let's revisit that. When we were playing with paths of length \(k\) while dividing by 9, we noted that the only possible values of \(k\) are those that share a common factor with 9, specifically 3, 6 and 9. But that's not quite as strong as it could be: try as you might, you will not find a cycle of length 6 when dividing by 9. The same is true if we pick 6 instead of 9, and try to find a cycle of length 4. Even though 4 does have a common factor with 6, and thus is not ruled out as a valid cycle length by our previous condition, we don't find any cycles of length 4.

So what is it that really determines if there can be cycles or not?

Let's do some more playing around. What are the actual cycle lengths when we divide by 9? For all but two numbers, the cycle lengths are 9. The two special numbers are 6 and 3, and they end up with a cycle length of 3. From this, we can say that the cycle length seems to depend on whether or not our \(n\) has any common factors with the divisor.

Let's explore this some more with a different divisor, say 12. We will find that 8 has a cycle length of 3, 7 has a cycle length of 12, and 9 has a cycle length of 4. What's happening here? To see, let's divide 12 by these cycle lengths. For 8, we get (12/3) = 4. For 7, this works out to 1. For 9, it works out to 3. These new numbers, 4, 1, and 3, are actually the greatest common factors of 8, 7, and 9 with 12, respectively. The greatest common factor of two numbers is the largest number that divides them both. We thus write down our guess for the length of a cycle:

{{< latex >}} k = \frac{d}{\text{gcd}(d,n)} {{< /latex >}}

Where \(d\) is our divisor, which has been 9 until just recently, and \(\text{gcd}(d,n)\) is the greatest common factor of \(d\) and \(n\). This equation is in agreement with our experiment for \(d = 9\), too. Why might this be? Recall that sequences with period \(k\) imply the following congruence:

{{< latex >}} kn \equiv 0\ (\text{mod}\ d) {{< /latex >}}

Here I've replaced 9 with \(d\), since we're trying to make it work for any divisor, not just 9. Now, suppose the greatest common divisor of \(n\) and \(d\) is some number \(f\). Then, since this number divides \(n\) and \(d\), we can write \(n=fm\) for some \(m\), and \(d=fg\) for some \(g\). We can rewrite our congruence as follows:

{{< latex >}} kfm \equiv 0\ (\text{mod}\ fg) {{< /latex >}}

We can simplify this a little bit. Recall that what this congruence really means is that the difference of \(kfm\) and \(0\), which is just \(kfm\), is divisible by \(fg\):

{{< latex >}} fg|kfm {{< /latex >}}

But if \(fg\) divides \(kfm\), it must be that \(g\) divides \(km\)! This, in turn, means we can write:

{{< latex >}} g|km {{< /latex >}}

Can we distill this statement even further? It turns out that we can. Remember that we got \(g\) and \(m\) by dividing \(d\) and \(n\) by their greatest common factor, \(f\). This, in turn, means that \(g\) and \(m\) have no more common factors that aren't equal to 1 (see this section below). From this, in turn, we can deduce that \(m\) is not relevant to \(g\) dividing \(km\), and we get:

{{< latex >}} g|k {{< /latex >}}

That is, we get that \(k\) must be divisible by \(g\). Recall that we got \(g\) by dividing \(d\) by \(f\), which is our largest common factor -- aka \(\text{gcd}(d,n)\). We can thus write:

{{< latex >}} \frac{d}{\text{gcd}(d,n)}|k {{< /latex >}}

Let's stop and appreciate this result. We have found a condition that is required for a sequence of remainders from dividing by \(d\) (which was 9 in the original problem) to repeat after \(k\) numbers. Furthermore, all of our steps can be performed in reverse, which means that if a \(k\) matches this condition, we can work backwards and determine that a sequence of numbers has to repeat after \(k\) steps.

Multiple \(k\)s will match this condition, and that's not surprising. If a sequence repeats after 5 steps, it also repeats after 10, 15, and so on. We're interested in the first time our sequences repeat after taking any steps, which means we have to pick the smallest possible non-zero value of \(k\). The smallest number divisible by \(d/\text{gcd}(d,n)\) is \(d/\text{gcd}(d,n)\) itself. We thus confirm our hypothesis:

{{< latex >}} k = \frac{d}{\text{gcd}(d,n)} {{< /latex >}}
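The formula can be compared against a brute-force search using Ruby's built-in `Integer#gcd`. As a sketch (with `cycle_length` again being a helper name of mine):

```ruby
# Smallest k > 0 such that k * n is divisible by d, found by brute force.
def cycle_length(n, d)
  (1..d).find { |k| (k * n) % d == 0 }
end

[[8, 12], [7, 12], [9, 12], [6, 9]].each do |n, d|
  puts "n=#{n}, d=#{d}: brute force #{cycle_length(n, d)}, formula #{d / d.gcd(n)}"
end
```

The two columns agree, including on the divisor-12 experiments from earlier (3, 12, and 4).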

Lastly, recall that our patterns would spiral away from the center whenever \(k\) is a multiple of 4. Now that we know what \(k\) is, we can restate this as "\(d/\text{gcd}(d,n)\) is divisible by 4". But if we pick \(n=d-1\), the greatest common factor has to be \(1\) (see this section below), so we can even further simplify this to "\(d\) is divisible by 4". Thus, we can state simply that any divisor divisible by 4 is off-limits, as it will induce spirals. For example, pick \(d=4\). Running our algorithm {{< sidenote "right" "constructive-note" "for n=d-1=3," >}} Did you catch that? From our work above, we didn't just find a condition that would prevent spirals; we also found the precise number that would result in a spiral if this condition were violated! This is because our proof is constructive: instead of just claiming the existence of a thing, it also shows how to get that thing. Our proof in the earlier section (which claimed that the divisor 9 would never create spirals) went by contradiction, which was not constructive. Repeating that proof for a general d wouldn't have told us the specific numbers that would spiral.

This is the reason that direct proofs tend to be preferred over proofs by contradiction. {{< /sidenote >}} we indeed find an infinite spiral:

{{< figure src="pattern_3_4.svg" caption="Spiral generated by the number 3 with divisor 4." class="tiny" alt="Spiral generated by the number 3 by summing digits." >}}

Let's try again. Pick \(d=8\); then, for \(n=d-1=7\), we also get a spiral:

{{< figure src="pattern_7_8.svg" caption="Spiral generated by the number 7 with divisor 8." class="tiny" alt="Spiral generated by the number 7 by summing digits." >}}
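What separates these spirals from the closed patterns is where we stand, and which way we face, after one full cycle. Here's a sketch that computes both; `line_length` and `after_one_cycle` are illustrative names of mine, and a heading of 0 means "facing up again".

```ruby
DIRS = [[0, 1], [1, 0], [0, -1], [-1, 0]].freeze  # up, right, down, left

# Remainder-based replacement for digit summing; 0 is drawn as d.
def line_length(multiple, d)
  r = multiple % d
  r.zero? ? d : r
end

# Position and heading (0 = up) after one full cycle of remainders.
def after_one_cycle(n, d)
  k = d / d.gcd(n)
  x = y = 0
  k.times do |i|
    dx, dy = DIRS[i % 4]
    len = line_length(n * (i + 1), d)
    x += dx * len
    y += dy * len
  end
  [[x, y], k % 4]
end

p after_one_cycle(3, 4)  # [[-2, 2], 0]
p after_one_cycle(7, 8)  # [[-4, 4], 0]
p after_one_cycle(8, 9)  # [[4, 13], 1]
```

For divisors 4 and 8 we end a cycle displaced but facing up, so the same displacement repeats forever. For divisor 9 we end turned once to the right, which is what lets later cycles cancel the earlier ones.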

A poem comes to mind:

Turning and turning in the widening gyre

The falcon cannot hear the falconer;

Fortunately, there are plenty of numbers that are not divisible by four, and we can pick any of them! I'll pick primes for good measure. Here are a few good ones from using 13 (which corresponds to summing digits of base-14 numbers):

{{< figure src="pattern_8_13.svg" caption="Pattern generated by the number 8 in base 14." class="tiny" alt="Pattern generated by the number 8 by summing digits." >}} {{< figure src="pattern_4_13.svg" caption="Pattern generated by the number 4 in base 14." class="tiny" alt="Pattern generated by the number 4 by summing digits." >}}

Here's one from dividing by 17 (base-18 numbers).

{{< figure src="pattern_5_17.svg" caption="Pattern generated by the number 5 in base 18." class="tiny" alt="Pattern generated by the number 5 by summing digits." >}}

Finally, base-30:

{{< figure src="pattern_2_29.svg" caption="Pattern generated by the number 2 in base 30." class="tiny" alt="Pattern generated by the number 2 by summing digits." >}}

{{< figure src="pattern_6_29.svg" caption="Pattern generated by the number 6 in base 30." class="tiny" alt="Pattern generated by the number 6 by summing digits." >}}

Generalizing to Arbitrary Numbers of Directions

What if we didn't turn 90 degrees each time? What, if, instead, we turned 120 degrees (so that turning 3 times, not 4, would leave you facing the same direction you started)? We can pretty easily do that, too. Let's call this number of turns \(c\). Up until now, we had \(c=4\).

First, let's update our condition. Before, we had "\(d\) cannot be divisible by 4". Now, we aren't constraining ourselves to only 4, but rather using a generic variable \(c\). We then end up with "\(d\) cannot be divisible by \(c\)". For instance, suppose we kept our divisor as 9 for the time being, but started turning 3 times instead of 4. This violates our divisibility condition, and we once again end up with a spiral:

{{< figure src="pattern_8_9_t3.svg" caption="Pattern generated by the number 8 in base 10 while turning 3 times." class="tiny" alt="Pattern generated by the number 8 by summing digits and turning 120 degrees." >}}

If, on the other hand, we pick \(d=8\) and \(c=3\), we get patterns for all numbers just like we hoped. Here's one such pattern:

{{< figure src="pattern_7_8_t3.svg" caption="Pattern generated by the number 7 in base 9 while turning 3 times." class="tiny" alt="Pattern generated by the number 7 by summing digits in base 9 and turning 120 degrees." >}}

Hold on a moment; it's actually not so obvious why our condition still works. When we just turned on a grid, things were simple. As long as we didn't end up facing the same way we started, we would eventually perform the exact same motions in reverse. The same is not true when turning 120 degrees, like we suggested. Here's an animated circle showing all of the turns we would make:

{{< figure src="turn_3_1.gif" caption="Orientations when turning 120 degrees" class="small" alt="Possible orientations when turning 120 degrees." >}}

We never quite do the exact opposite of any one of our movements. So then, will we come back to the origin anyway? Well, let's start simple. Suppose we always turn by exactly one 120-degree increment (we might end up turning more or less, just like we may end up turning left, right, or back in the 90 degree case). Each time you face a particular direction, after performing a cycle, you will have moved some distance away from where you started, and turned 120 degrees. If you then repeat the cycle, you will once again move by the same offset as before, but this time the offset will be rotated 120 degrees, and you will have rotated a total of 240 degrees. Finally, performing the cycle a third time, you'll have moved by the same offset (rotated 240 degrees).

If you overlay each offset such that their starting points overlap, they will look very similar to that circle above. And now, here's the beauty: you can arrange these rotated offsets into a triangle:

{{< figure src="turn_3_anim.gif" caption="Triangle formed by three 120-degree turns." class="small" alt="Triangle formed by three 120-degree turns." >}}

As long as you rotate by the same amount each time (and you will, since the cycle length determines how many times you turn, and the cycle length never changes), you can do so for any number of directions. For instance, here's a similar visualization in which there are 5 possible directions, and where each turn is consequently 72 degrees:

{{< figure src="turn_5_anim.gif" caption="Pentagon formed by five 72-degree turns." class="small" alt="Pentagon formed by five 72-degree turns." >}}

Each of these polygon shapes forms a loop. If you walk along its sides, you will eventually end up exactly where you started. This confirms that if you end up making one turn at the end of each cycle, you will eventually end up right where you started.
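The "polygon closes" claim has a compact numerical form: an offset plus all of its rotated copies adds up to nothing. This sketch uses Ruby's built-in `Complex` numbers to represent points in the plane, which is my choice for the illustration; `sum_of_rotations` is a made-up helper name, and the sample offset is arbitrary.

```ruby
# Sum an offset with its copies rotated by 360/c degrees, c times in total.
# Multiplying by a unit-length complex number rotates a point.
def sum_of_rotations(offset, c)
  turn = Complex.polar(1, 2 * Math::PI / c)
  (0...c).sum(Complex(0, 0)) { |i| offset * turn**i }
end

offset = Complex(4, 13)  # an arbitrary per-cycle displacement
puts sum_of_rotations(offset, 3).abs  # essentially zero, up to rounding
puts sum_of_rotations(offset, 5).abs
```

The results are not exactly zero because of floating-point rounding, but they are many orders of magnitude smaller than the offset itself.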

Things aren't always as simple as making a single turn, though. Let's go back to the version of the problem in which we have 3 possible directions, and think about what would happen if we turned by 240 degrees at a time: 2 turns instead of 1?

Even though we first turn a whole 240 degrees, the second time we turn we "overshoot" our initial bearing, and end up at 120 degrees compared to it. As soon as we turn 240 more degrees (turning the third time), we end up back at 0. In short, even though we "visited" each bearing in a different order, we visited them all, and exactly once at that. Here's a visualization:

{{< figure src="turn_3_2.gif" caption="Orientations when turning 120 degrees, twice at a time" class="small" alt="Possible orientations when turning 120 degrees, twice at a time." >}}

Note that even though in the above picture it looks like we're just turning left instead of right, that's not the case; a single turn of 240 degrees is more than half the circle, so our second bearing ends up on the left side of the circle even though we turn right.

Just to make sure we really see what's happening, let's try this when there are 5 possible directions, and when we still make two turns (now of 72 degrees each):

{{< figure src="turn_5_2.gif" caption="Orientations when turning 72 degrees, twice at a time" class="small" alt="Possible orientations when turning 72 degrees, twice at a time." >}}

Let's try to put some mathematical backing to this "visited them all" idea, and turning in general. First, observe that as soon as we turn 360 degrees, it's as good as not turning at all - we end up facing up again. If we turned 480 degrees (that is, two turns of 240 degrees each), the first 360 can be safely ignored, since it puts us where we started; only the 120 degrees that remain are needed to figure out our final bearing. In short, the final direction we're facing is the remainder from dividing by 360. We already know how to formulate this using modular arithmetic: if we turn \(t\) degrees \(k\) times, and end up at final bearing (remainder) \(b\), this is captured by:

{{< latex >}} kt \equiv b\ (\text{mod}\ 360) {{< /latex >}}

Of course, if we end up facing the same way we started, we get the familiar equivalence:

{{< latex >}} kt \equiv 0\ (\text{mod}\ 360) {{< /latex >}}

Even though the variables in this equivalence mean different things now than they did last time we saw it, the mathematical properties remain the same. For instance, we can say that after \(360/\text{gcd}(360, t)\) turns, we'll end up facing the way that we started.
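As a quick sanity check of that formula on the angles used so far (`turns_to_reset` is a name I picked for this sketch):

```ruby
# Number of turns of t degrees before we first face the starting direction.
def turns_to_reset(t)
  360 / 360.gcd(t)
end

[120, 240, 72, 144, 90].each do |t|
  puts "turning #{t} degrees at a time: back to the start after #{turns_to_reset(t)} turns"
end
```

Both 120 and 240 degrees need 3 turns, both 72 and 144 need 5, and the familiar 90 degrees needs 4.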

So far, so good. What I don't like about this, though, is that we have all of these numbers of degrees all over our equations: 72 degrees, 144 degrees, and so forth. However, something like 73 degrees (if there are five possible directions) is just not a valid bearing, and nor is 71. We have so many possible degrees (360 of them, to be exact), but we're only using a handful! That's wasteful. Instead, observe that for \(c\) possible turns, the smallest possible turn angle is \(360/c\). Let's call this angle \(\theta\) (theta). Now, notice that we always turn in multiples of \(\theta\): a single turn moves us \(\theta\) degrees, two turns move us \(2\theta\) degrees, and so on. If we define \(r\) to be the number of turns that we find ourselves rotated by after a single cycle, we have \(t=r\theta\), and our turning equation can be written as:

{{< latex >}} kr\theta \equiv 0\ (\text{mod}\ c\theta) {{< /latex >}}

Now, once again, recall that the above equivalence is just notation for the following:

{{< latex >}} \begin{aligned} & c\theta|kr\theta \\ \Leftrightarrow\ & c|kr \end{aligned} {{< /latex >}}

And finally, observing that \(kr=kr-0\), we have:

{{< latex >}} kr \equiv 0\ (\text{mod}\ c) {{< /latex >}}

This equivalence says the same thing as our earlier one; however, instead of being in terms of degrees, it's in terms of the number of turns \(c\) and the turns-per-cycle \(r\). Now, recall once again that the smallest number of steps \(k>0\) for which this equivalence holds is \(k = c/\text{gcd}(c,r)\).

We're close now: we have a sequence of \(k\) steps that will lead us back to the beginning. What's left is to show that these \(k\) steps are evenly distributed throughout our circle, which is the key property that makes it possible for us to make a polygon out of them (and thus end up back where we started).

To show this, say that we have a largest common divisor \(f=\text{gcd}(c,r)\), and that \(c=fe\) and \(r=fs\). We can once again "divide through" by \(f\), and get:

{{< latex >}} ks \equiv 0\ (\text{mod}\ e) {{< /latex >}}

Now, we know that \(\text{gcd}(e,s)=1\) (see this section below), and thus:

{{< latex >}} k = e/\text{gcd}(e,s) = e {{< /latex >}}

That is, our cycle will repeat after \(e\) remainders. But wait, we've only got \(e\) possible remainders: the numbers \(0\) through \(e-1\)! Thus, for a cycle to repeat after \(e\) remainders, all possible remainders must occur. For a concrete example, take \(e=5\); our remainders will be the set \(\{0,1,2,3,4\}\). Now, let's "multiply back through" by \(f\):

{{< latex >}} kfs \equiv 0\ (\text{mod}\ fe) {{< /latex >}}

We still have \(e\) possible remainders, but this time they are multiplied by \(f\). For example, taking \(e\) to once again be equal to \(5\), we have the set of possible remainders \(\{0, f, 2f, 3f, 4f\}\). The important bit is that these remainders are all evenly spaced, and that space between them is \(f=\text{gcd}(c,r)\).
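The even spacing is easy to see by listing the bearings we actually pass through. In this sketch, bearings are counted in turns (so they are taken modulo \(c\)), and `bearings` is a helper name of my own.

```ruby
# Bearings, measured in turns modulo c, seen at the start of each cycle
# when every cycle leaves us rotated by r more turns.
def bearings(c, r)
  k = c / c.gcd(r)
  (0...k).map { |i| (i * r) % c }.sort
end

p bearings(3, 2)   # [0, 1, 2]
p bearings(5, 2)   # [0, 1, 2, 3, 4]
p bearings(12, 8)  # [0, 4, 8]
```

With 12 directions and 8 turns per cycle, the greatest common factor is 4, so we only ever see every fourth bearing, but those are still spread evenly around the circle.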

Let's recap: we have confirmed that for \(c\) possible turns (4 in our original formulation), and \(r\) turns at a time, we will always loop after \(k=c/\text{gcd}(c,r)\) steps, evenly spaced out at \(\text{gcd}(c,r)\) turns. No specific properties from \(c\) or \(r\) are needed for this to work. Finally, recall from the previous section that \(r\) is zero (and thus, our pattern breaks down) whenever the divisor \(d\) (9 in our original formulation) is itself divisible by \(c\). And so, if we pick a system with \(c\) possible directions and divisor \(d\), we will always loop back and create a pattern as long as \(c\nmid d\) (\(c\) does not divide \(d\)).

Let's try it out! There are a few pictures below. When reading the captions, keep in mind that the base is one more than the divisor (we started with numbers in the usual base 10, but divided by 9).

{{< figure src="pattern_1_7_t5.svg" caption="Pattern generated by the number 1 in base 8 while turning 5 times." class="tiny" alt="Pattern generated by the number 1 by summing digits in base 8 and turning 72 degrees." >}}

{{< figure src="pattern_3_4_t7.svg" caption="Pattern generated by the number 3 in base 5 while turning 7 times." class="tiny" alt="Pattern generated by the number 3 by summing digits in base 5 and turning 51 degrees." >}}

{{< figure src="pattern_3_11_t6.svg" caption="Pattern generated by the number 3 in base 12 while turning 6 times." class="tiny" alt="Pattern generated by the number 3 by summing digits in base 12 and turning 60 degrees." >}}

{{< figure src="pattern_2_11_t7.svg" caption="Pattern generated by the number 2 in base 12 while turning 7 times." class="tiny" alt="Pattern generated by the number 2 by summing digits in base 12 and turning 51 degrees." >}}

Conclusion

Today we peeked under the hood of a neat mathematical trick that was shown to me by my headmaster over 10 years ago now. Studying what it was that made this trick work led us to play with the underlying mathematics some more, and extend the trick to more situations (and prettier patterns). I hope you found this as interesting as I did!

By the way, the kind of math that we did in this article is most closely categorized as number theory. Check it out if you're interested!

Finally, a huge thank you to Arthur for checking my math, helping me with proofs, and proofreading the article.

All that remains are some proofs I omitted from the original article since they were taking up a lot of space (and were interrupting the flow of the explanation). They are listed below.

Referenced Proofs

Adding Two Congruences

Claim: If for some numbers \(a\), \(b\), \(c\), \(d\), and \(k\), we have \(a \equiv b\ (\text{mod}\ k)\) and \(c \equiv d\ (\text{mod}\ k)\), then it's also true that \(a+c \equiv b+d\ (\text{mod}\ k)\).

Proof: By definition, we have \(k|(a-b)\) and \(k|(c-d)\). This, in turn, means that for some \(i\) and \(j\), \(a-b=ik\) and \(c-d=jk\). Add both sides to get:

{{< latex >}}
\begin{aligned}
& (a-b)+(c-d) = ik+jk \\
\Rightarrow\ & (a+c)-(b+d) = (i+j)k \\
\Rightarrow\ & k\ |\left[(a+c)-(b+d)\right]\\
\Rightarrow\ & a+c \equiv b+d\ (\text{mod}\ k) \\
\end{aligned}
{{< /latex >}}

\(\blacksquare\)

Multiplying Both Sides of a Congruence

Claim: If for some numbers \(a\), \(b\), \(n\) and \(k\), we have \(a \equiv b\ (\text{mod}\ k)\) then we also have that \(an \equiv bn\ (\text{mod}\ k)\).

Proof: By definition, we have \(k|(a-b)\). Since multiplying \(a-b\) by \(n\) cannot make it not divisible by \(k\), we also have \(k|\left[n(a-b)\right]\). Distributing \(n\), we have \(k|(na-nb)\). By definition, this means \(na\equiv nb\ (\text{mod}\ k)\).

\(\blacksquare\)
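A proof is a proof, but it doesn't hurt to sanity check the two congruence claims numerically. The sketch below assumes a small range of integers and \(k=9\), both picked arbitrarily; `congruent?` is a hypothetical helper that encodes the definition \(k|(a-b)\).

```ruby
# Hypothetical helper: a ≡ b (mod k) exactly when k divides a - b.
def congruent?(a, b, k)
  ((a - b) % k).zero?
end

k = 9
range = (-20..20).to_a

# Every pair (a, b) from the range with a ≡ b (mod k).
pairs = range.product(range).select { |(a, b)| congruent?(a, b, k) }

# Adding two congruences: a ≡ b and c ≡ d imply a + c ≡ b + d.
sums_ok = pairs.all? do |(a, b)|
  pairs.all? { |(c, d)| congruent?(a + c, b + d, k) }
end

# Multiplying both sides: a ≡ b implies an ≡ bn for every n.
products_ok = pairs.all? do |(a, b)|
  range.all? { |n| congruent?(a * n, b * n, k) }
end

puts sums_ok && products_ok
```

This only tests finitely many cases, of course, so it complements the proofs rather than replacing them.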

Invertible Numbers \(\text{mod}\ d\) Share no Factors with \(d\)

Claim: A number \(k\) is only invertible (can be divided by) in \(\text{mod}\ d\) if \(k\) and \(d\) share no common factors (except 1).

Proof: Write \(\text{gcd}(k,d)\) for the greatest common divisor of \(k\) and \(d\). Another important fact (not proven here, but see something like this), is that if \(\text{gcd}(k,d) = r\), then the smallest positive number that can be made by adding and subtracting \(k\)s and \(d\)s is \(r\). That is, over all integers \(i\) and \(j\), the smallest possible positive value of \(ik + jd\) is \(r\).

Now, note that \(d \equiv 0\ (\text{mod}\ d)\). Multiplying both sides by \(j\), we get \(jd\equiv 0\ (\text{mod}\ d)\). This, in turn, means that \(ik+jd \equiv ik\), and so the smallest positive value that \(ik\) can be congruent to is also \(r\). If \(r\) is bigger than 1 (i.e., if \(k\) and \(d\) have common factors), then we can't pick \(i\) such that \(ik\equiv1\), since we know that \(r>1\) is the least possible value we can make. There is therefore no multiplicative inverse to \(k\). Alternatively worded, we cannot divide by \(k\).

\(\blacksquare\)
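To make this concrete, here is a brute-force sketch for \(d=9\). The helper `inverse_mod` is a name I'm introducing just for this example; it tries every candidate \(i\) and returns the first one with \(ik\equiv1\), or `nil` if there isn't one.

```ruby
# Hypothetical helper: find i in 1...d with i*k ≡ 1 (mod d), or nil if none exists.
def inverse_mod(k, d)
  (1...d).find { |i| (i * k) % d == 1 }
end

d = 9
(1...d).each do |k|
  puts "k=#{k}: gcd=#{k.gcd(d)}, inverse=#{inverse_mod(k, d).inspect}"
end
```

For \(d=9\), the search comes back with `nil` for 3 and 6, which are exactly the values of \(k\) that share a factor with 9. Every other \(k\) gets an inverse (for instance, \(2\cdot5=10\equiv1\)). That second half is the converse of the claim; it isn't proven in this section, but the search agrees with it here.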

Numbers Divided by Their \(\text{gcd}\) Have No Common Factors

Claim: For any two numbers \(a\) and \(b\) and their greatest common divisor \(f\), if \(a=fc\) and \(b=fd\), then \(c\) and \(d\) have no common factors other than 1 (i.e., \(\text{gcd}(c,d)=1\)).

Proof: Suppose that \(c\) and \(d\) do have a common factor \(e>1\). In that case, we have \(c=ei\) and \(d=ej\) for some \(i\) and \(j\). Then, we have \(a=fei\), and \(b=fej\). From this, it's clear that both \(a\) and \(b\) are divisible by \(fe\). Since \(e\) is greater than \(1\), \(fe\) is greater than \(f\). But our assumptions state that \(f\) is the greatest common divisor of \(a\) and \(b\)! We have arrived at a contradiction.

Thus, \(c\) and \(d\) cannot have a common factor other than 1.

\(\blacksquare\)
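As a quick illustration, here is the claim on one pair of numbers, followed by a brute-force check over a small range. The specific values (84, 36, and the bound of 50) are arbitrary choices of mine; `Integer#gcd` is Ruby's built-in.

```ruby
a, b = 84, 36
f = a.gcd(b)          # the greatest common divisor of a and b
c, d = a / f, b / f   # "divide through" by f
puts c.gcd(d)

# The same check for every pair of positive numbers up to 50.
all_coprime = (1..50).all? do |x|
  (1..50).all? do |y|
    g = x.gcd(y)
    (x / g).gcd(y / g) == 1
  end
end
puts all_coprime
```

Here \(f=12\), leaving \(c=7\) and \(d=3\), which indeed have nothing in common.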

Divisors of \(n\) and \(n-1\)

Claim: For any \(n\), \(\text{gcd}(n,n-1)=1\). That is, \(n\) and \(n-1\) share no common divisors.

Proof: Suppose some number \(f\) divides both \(n\) and \(n-1\). In that case, we can write \(n=af\), and \((n-1)=bf\) for some \(a\) and \(b\). Subtracting one equation from the other:

{{< latex >}} 1 = (a-b)f {{< /latex >}}

But this means that 1 is divisible by \(f\)! That's only possible if \(f=1\). Thus, the only number that divides \(n\) and \(n-1\) is 1; that's our greatest common factor.

\(\blacksquare\)
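This last claim is the easiest one to spot-check. A minimal sketch, assuming an arbitrary upper bound of 1000:

```ruby
# Check gcd(n, n-1) = 1 for every n from 2 up to an arbitrary bound.
neighbors_coprime = (2..1000).all? { |n| n.gcd(n - 1) == 1 }
puts neighbors_coprime
```

This is why a base and its divisor (10 and 9 in our original formulation) never share a factor.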