Browse Source

Add draft of the first portion of day 8 Coq writeup.

donations
Danila Fedorin 11 months ago
parent
commit
60eb50737d
  1. 276
      content/blog/01_aoc_coq.md

276
content/blog/01_aoc_coq.md

@ -0,0 +1,276 @@
---
title: "Advent of Code in Coq - Day 8"
date: 2020-12-31T17:55:25-08:00
tags: ["Advent of Code", "Coq"]
draft: true
---
Huh? We're on day 8? What happened to days 2 through 7?
Well, for the most part, I didn't think they were that interesting from the Coq point of view.
Day 7 got close, but not close enough to inspire me to create a formalization. Day 8, on the other
hand, is
{{< sidenote "right" "pl-note" "quite interesting," >}}
Especially to someone like me who's interested in programming languages!
{{< /sidenote >}} and took quite some time to formalize.
As before, here's an (abridged) description of the problem:
> Given a tiny assembly-like language, determine the state of its accumulator
> when the same instruction is executed twice.
Before we start on the Coq formalization, let's talk about an idea from
Programming Language Theory (PLT), _big step operational semantics_.
### Big Step Operational Semantics
What we have in Advent of Code's Day 8 is, undeniably, a small programming language.
We are tasked with executing this language, or, in PLT lingo, defining its _semantics_.
There are many ways of doing this - at university, I've been taught of [denotational](https://en.wikipedia.org/wiki/Denotational_semantics), [axiomatic](https://en.wikipedia.org/wiki/Axiomatic_semantics),
and [operational](https://en.wikipedia.org/wiki/Operational_semantics) semantics.
I believe that Coq's mechanism of inductive definitions lends itself very well
to operational semantics, so we'll take that route. But even "operational semantics"
doesn't refer to a concrete technique - we have a choice between small-step (structural) and
big-step (natural) operational semantics. The former describe the minimal "steps" a program
takes as it's being evaluated, while the latter define the final results of evaluating a program.
I decided to go with big-step operational semantics, since they're more intutive (natural!).
So, how does one go about "[defining] the final results of evaluating a program?" Most commonly,
we go about using _inference rules_. Let's talk about those next.
#### Inference Rules
Inference rules are a very general notion. The describe how we can determine (infer) a conclusion
from a set of assumption. It helps to look at an example. Here's a silly little inference rule:
{{< latex >}}
\frac
{\text{I'm allergic to cats} \quad \text{My friend has a cat}}
{\text{I will not visit my friend very much}}
{{< /latex >}}
It reads, "if I'm allergic to cats, and if my friend has a cat, then I will not visit my friend very much".
Here, "I'm allergic to cats" and "my friend has a cat" are _premises_, and "I will not visit my friend very much" is
a _conclusion_. An inference rule states that if all its premises are true, then its conclusion must be true.
Here's another inference rule, this time with some mathematical notation instead of words:
{{< latex >}}
\frac
{n < m}
{n + 1 < m + 1}
{{< /latex >}}
This one reads, "if \\(n\\) is less than \\(m\\), then \\(n+1\\) is less than \\(m+1\\)". We can use inference
rules to define various constructs. As an example, let's define what it means for a natural number to be even.
It takes two rules:
{{< latex >}}
\frac
{}
{0 \; \text{is even}}
\quad
\frac
{n \; \text{is even}}
{n+2 \; \text{is even}}
{{< /latex >}}
First of all, zero is even. We take this as fact - there are no premises for the first rule, so they
are all trivially true. Next, if a number is even, then adding 2 to that number results in another
even number. Using the two of these rules together, we can correctly determine whether any number
is or isn't even. We start knowing that 0 is even. Adding 2 we learn that 2 is even, and adding 2
again we see that 4 is even, as well. We can continue this to determine that 6, 8, 10, and so on
are even too. Never in this process will we visit the numbers 1 or 3 or 5, and that's good - they're not even!
Let's now extend this notion to programming languages, starting with a simple arithmetic language.
This language is made up of natural numbers and the \\(\square\\) operation, which represents the addition
of two numbers. Again, we need two rules:
{{< latex >}}
\frac
{n \in \mathbb{N}}
{n \; \text{evaluates to} \; n}
\quad
\frac
{e_1 \; \text{evaluates to} \; n_1 \quad e_2 \; \text{evaluates to} \; n_2}
{e_1 \square e_2 \; \text{evaluates to} \; n_1 + n_2}
{{< /latex >}}
First, let me explain myself. I used \\(\square\\) to demonstrate two important points. First, languages can be made of
any kind of characters we want; it's the rules that we define that give these languages meaning.
Second, while \\(\square\\) is the addition operation _in our language_, \\(+\\) is the _mathematical addition operator_.
They are not the same - we use the latter to define how the former works.
Finally, writing "evaluates to" gets quite tedious, especially for complex languages. Instead,
PLT people use notation to make their semantics more concise. The symbol \\(\Downarrow\\) is commonly
used to mean "evaluates to"; thus, \\(e \Downarrow v\\) reads "the expression \\(e\\) evaluates to the value \\(v\\).
Using this notation, our rules start to look like the following:
{{< latex >}}
\frac
{n \in \mathbb{N}}
{n \Downarrow n}
\quad
\frac
{e_1 \Downarrow n_1 \quad e_2 \Downarrow n_2}
{e_1 \square e_2 \Downarrow n_1 + n_2}
{{< /latex >}}
If nothing else, these are way more compact! Though these may look intimidating at first, it helps to
simply read each symbol as its English meaning.
#### Encoding Inference Rules in Coq
Now that we've seen what inference rules are, we can take a look at how they can be represented in Coq.
We can use Coq's `Inductive` mechanism to define the rules. Let's start with our "is even" property.
```Coq
Inductive is_even : nat -> Prop :=
| zero_even : is_even 0
| plustwo_even : is_even n -> is_even (n+2).
```
The first line declares the property `is_even`, which, given a natural number, returns proposition.
This means that `is_even` is not a proposition itself, but `is_even 0`, `is_even 1`, and `is_even 2`
are all propositions.
The following two lines each encode one of our aforementioned inference rules. The first rule, `zero_even`,
is of type `is_even 0`. The `zero_even` rule doesn't require any arguments, and we can use it to create
a proof that 0 is even. On the other hand, the `plustwo_even` rule _does_ require an argument, `is_even n`.
To construct a proof that a number `n+2` is even using `plustwo_even`, we need to provide a proof
that `n` itself is even. From this definition we can see a general principle: we encode each inference
rule as constructor of an inductive Coq type. Each rule encoded in this manner takes as arguments
the proofs of its premises, and returns a proof of its conclusion.
For another example, let's encode our simple addition language. First, we have to define the language
itself:
```Coq
Inductive tinylang : Type :=
| number (n : nat) : tinylang
| box (e1 e2 : tinylang) : tinylang.
```
This defines the two elements of our example language: `number n` corresponds to \\(n\\), and `box e1 e2` corresponds
to \\(e_1 \square e_2\\). Finally, we define the inference rules:
```Coq {linenos=true}
Inductive tinylang_sem : tinylang -> nat -> Prop :=
| number_sem : forall (n : nat), tinylang_sem (number n) n
| box_sem : forall (e1 e2 : tinylang) (n1 n2 : nat),
tinylang_sem e1 n1 -> tinylang_sem e2 n2 ->
tinylang_sem (box e1 e2) (n1 + n2).
```
When we wrote our rules earlier, by using arbitrary variables like \\(e_1\\) and \\(n_1\\), we implicitly meant
that our rules work for _any_ number or expression. When writing Coq we have to make this assumption explicit
by using `forall`. For instance, the rule on line 2 reads, "for any number `n`, the expression `n` evaluates to `n`".
#### Semantics of Our Language
We've now written some example big-step operational semantics, both "on paper" and in Coq. Now, it's time to take a look at
the specific semantics of the language from Day 8! Our language consists of a few parts.
First, there are three opcodes: \\(\texttt{jmp}\\), \\(\\texttt{nop}\\), and \\(\\texttt{add}\\). Opcodes, combined
with an integer, make up an instruction. For example, the instruction \\(\\texttt{add} \\; 3\\) will increase the
content of the accumulator by three. Finally, a program consists of a sequence of instructions; They're separated
by newlines in the puzzle input, but we'll instead separate them by semicolons. For example, here's a complete program.
{{< latex >}}
\texttt{add} \; 0; \; \texttt{nop} \; 2; \; \texttt{jmp} \; -2
{{< /latex >}}
Now, let's try evaluating this program. Starting at the beginning and with 0 in the accumulator,
it will add 0 to the accumulator (keeping it the same),
do nothing, and finally jump back to the beginning. At this point, it will try to run the addition instruction again,
which is not allowed; thus, the program will terminate.
Did you catch that? The semantics of this language will require more information than just our program itself (which we'll denote \\(p\\)).
* First, to evaluate the program we will need a program counter, \\(\\textit{c}\\). This program counter
will tell us the position of the instruction to be executed next. It can also point past the last instruction,
which means our program terminated successfully.
* Next, we'll need the accumulator \\(a\\). Addition instructions can change the accumulator, and we will be interested
in the number that ends up in the accumulator when our program finishes executing.
* Finally, and more subtly, we'll need to keep track of the states we visited. For instance,
in the course of evaluating our program above, we encounter the \\((c, a)\\) pair of \\((0, 0)\\) twice: once
at the beginning, and once at the end. However, whereas at the beginning we have not yet encountered the addition
instruction, at the end we have, so the evaluation behaves differently. To make the proofs work better in Coq,
we'll use a set \\(v\\) of
{{< sidenote "right" "allowed-note" "allowed (valid) program counters (as opposed to visited program counters)." >}}
Whereas the set of "visited" program counters keeps growing as our evaluation continues,
the set of "allowed" program counters keeps shrinking. Because the "allowed" set never stops shrinking,
assuming we're starting with a finite set, our execution will eventually terminate.
{{< /sidenote >}}
Now we have all the elements of our evaluation. Let's define some notation. A program starts at some state,
and terminates in another, possibly different state. In the course of a regular evaluation, the program
never changes; only the state does. So I propose this (rather unorthodox) notation:
{{< latex >}}
(c, a, v) \Rightarrow_p (c', a', v')
{{< /latex >}}
This reads, "after starting at program counter \\(c\\), accumulator \\(a\\), and set of valid addresses \\(v\\),
the program \\(p\\) terminates with program counter \\(c'\\), accumulator \\(a'\\), and set of valid addresses \\(v'\\)".
Before creating the inference rules for this evaluation relation, let's define the effect of evaluating a single
instruction, using notation \\((c, a) \rightarrow_i (c', a')\\). An addition instruction changes the accumulator,
and increases the program counter by 1.
{{< latex >}}
\frac{}
{(c, a) \rightarrow_{\texttt{add} \; n} (c+1, a+n)}
{{< /latex >}}
A no-op instruction does even less. All it does is increment the program counter.
{{< latex >}}
\frac{}
{(c, a) \rightarrow_{\texttt{nop} \; n} (c+1, a)}
{{< /latex >}}
Finally, a jump instruction leaves the accumulator intact, but adds a number to the program counter itself!
{{< latex >}}
\frac{}
{(c, a) \rightarrow_{\texttt{jmp} \; n} (c+n, a)}
{{< /latex >}}
None of these rules have any premises, and they really are quite simple. Now, let's define the rules
for evaluating a program. First of all, a program starting in a state that is not considered "valid"
is done evaluating, and is in a "failed" state.
{{< latex >}}
\frac{c \not \in v \quad c \not= \text{length}(p)}
{(c, a, v) \Rightarrow_{p} (c, a, v)}
{{< /latex >}}
We use \\(\\text{length}(p)\\) to represent the number of instructions in \\(p\\). Note the second premise:
even if our program counter \\(c\\) is not included in the valid set, if it's "past the end of the program",
the program terminates in an "ok" state. Here's a rule for terminating in the "ok" state:
{{< latex >}}
\frac{c = \text{length}(p)}
{(c, a, v) \Rightarrow_{p} (c, a, v)}
{{< /latex >}}
When our program counter reaches the end of the program, we are also done evaluating it. Even though
both rules {{< sidenote "right" "redundant-note" "lead to the same conclusion," >}}
In fact, if the end of the program is never included in the valid set, the second rule is completely redundant.
{{< /sidenote >}}
it helps to distinguish the two possible outcomes. Finally, if neither of the termination conditions are met,
our program can take a step, and continue evaluating from there.
{{< latex >}}
\frac{c \in v \quad p[c] = i \quad (c, a) \rightarrow_i (c', a') \quad (c', a', v - \{c\}) \Rightarrow_p (c'', a'', v'')}
{(c, a, v) \Rightarrow_{p} (c'', a'', v'')}
{{< /latex >}}
This is quite a rule. A lot of things need to work out for a program to evauate from a state that isn't
currently the final state:
* The current program counter \\(c\\) must be valid. That is, it must be an element of \\(v\\).
* This program counter must correspond to an instruction \\(i\\) in \\(p\\), which we write as \\(p[c] = i\\).
* This instruction must be executed, changing our program counter from \\(c\\) to \\(c'\\) and our
accumulator from \\(a\\) to \\(a'\\). The set of valid instructions will no longer include \\(c\\),
and will become \\(v - \\{c\\}\\).
* Our program must then finish executing, starting at state
\\((c', a', v - \\{c\\})\\), and ending in some (unknown) state \\((c'', a'', v'')\\).
If all of these conditions are met, our program, starting at \\((c, a, v)\\), will terminate in the state \\((c'', a'', v'')\\). This third rule completes our semantics; a program being executed will keep running instructions using the third rule, until it finally
hits an invalid program counter (terminating with the first rule) or gets to the end of the program (terminating with the second rule).
Loading…
Cancel
Save