84 lines
5.4 KiB
Markdown
84 lines
5.4 KiB
Markdown
|
---
|
|||
|
title: "I Don't Like Coq's Documentation"
|
|||
|
date: 2021-11-24T21:48:59-08:00
|
|||
|
draft: true
|
|||
|
tags: ["Coq"]
|
|||
|
---
|
|||
|
|
|||
|
Recently, I wrote an article on [Formalizing Dawn's Core Calculus in Coq]({{< relref "./coq_dawn.md" >}}).
|
|||
|
One of the proofs (specifically, correctness of \\(\\text{quote}_3\\)) was the best candidate I've ever
|
|||
|
encountered for proof automation. I knew that proof automation was possible from the second book of Software
|
|||
|
Foundations, [Programming Language Foundations](https://softwarefoundations.cis.upenn.edu/plf-current/index.html).
|
|||
|
I went there to learn more, and started my little journey in picking up Coq's `Ltac2`.
|
|||
|
|
|||
|
Before I go any further, let me say that I'd self-describe as an "advanced beginner" in Coq, maybe "intermediate"
|
|||
|
on a good day. I am certainly far from a master. I will also say that I am quite young, and thus possibly spoiled
|
|||
|
by the explosion of well-documented languages, tools, and libraries. I don't frequently check `man` pages, and I don't
|
|||
|
often read straight up technical manuals. Maybe the fault lies with me.
|
|||
|
Nevertheles, I feel like I am where I am in the process of learning Coq
|
|||
|
in part because of the state of its learning resources.
|
|||
|
As a case study, let's take my attempt to automate away a pretty simple proof.
|
|||
|
|
|||
|
### Grammars instead of Examples
|
|||
|
I did not specifically remember the chapter of Software Foundation in which tactics were introduced.
|
|||
|
Instead of skimming through chapters until I found it, I tried to look up "coq custom tactic". The
|
|||
|
first thing that comes up is the page for `Ltac`.
|
|||
|
|
|||
|
After a brief explanation of what `Ltac` is, the documentation jumps straight into the grammar of the entire
|
|||
|
tactic language. Here' a screenshot of what that looks like:
|
|||
|
|
|||
|
{{< figure src="ltac_grammar.png" caption="The grammar of Ltac from the Coq page." class="large" alt="A grammar for the `Ltac` language within Coq. The grammar is in Backus–Naur form, and has 9 nonterminals." >}}
|
|||
|
|
|||
|
Good old Backus-Naur form. Now, my two main areas of interest are programming language theory and compilers, and so I'm no stranger to BNF. In fact, the first
|
|||
|
time I saw such a documentation page (most pages describing Coq language feature have some production rules), I didn't even consciously process that I was looking
|
|||
|
at grammar rules. However, and despite CompCert (a compiler) being probably the most well known example of a project in Coq, I don't think that Coq is made _just_
|
|||
|
for people familiar with PLT or compilers. I don't think it should be expected for the average newcomer to Coq (or the average beginner-ish person like me) to know how to read production rules, or to know
|
|||
|
what terminals and nonterminals are. Logical Founadtions sure managed to explain Gallina without resorting to BNFs.
|
|||
|
|
|||
|
And even if I, as a newcomer, know what BNF is, and how to read the rules, there's another layer to this specification:
|
|||
|
the precedence of various operators is encoded in the BNF rules. This is a pretty common pattern
|
|||
|
for writing down programming language grammars; for each level of operator precedence, there's another
|
|||
|
nonterminal. We have `ltac_expr4` for sequencing (the least binding operator), and `ltac_expr3` for "level 3 tactics",
|
|||
|
`ltac_expr2` for addition, logical "or", and "level 2 tactics". The creators of this documentation page clearly knew
|
|||
|
what they were getting at here, and I've seen this pattern enough times to recognize it right away. But if you
|
|||
|
_haven't_ seen this pattern before (and why should you have?), you'll need to take some time to decipher the rules.
|
|||
|
That's time that you'd rather spend trying to write your tactic.
|
|||
|
|
|||
|
The page could just as well have mentioned the types of constructs in `Ltac`, and given a table of their relative precedence.
|
|||
|
This could be an opportunity to give an example of what a program in `Ltac` looks like. Instead, despite having seen all of these nonterminals,
|
|||
|
I still don't have an image in my mind's eye of what the language looks like. And better yet, I think that the grammar's incorrect:
|
|||
|
|
|||
|
```
|
|||
|
ltac_expr4 ::= ltac_expr3 ; ltac_expr3|binder_tactic
|
|||
|
```
|
|||
|
|
|||
|
The way this is written, there's no way to sequence (using a semicolon) more than two things. The semicolon
|
|||
|
only occurs on level four, and both nonterminals in this rule are level three. However, Coq is perfectly happy
|
|||
|
to accept the following:
|
|||
|
|
|||
|
```Coq
|
|||
|
Ltac test := auto; auto; auto.
|
|||
|
```
|
|||
|
|
|||
|
In the `Ltac2` grammar, this is written the way I'd expect:
|
|||
|
|
|||
|
```
|
|||
|
ltac2_expr ::= ltac2_expr5 ; ltac2_expr
|
|||
|
```
|
|||
|
|
|||
|
Let's do a quick recap. We have an encoding that requires a degree of experience with grammars and
|
|||
|
programming languages to be useful to the reader, _and_ this encoding
|
|||
|
{{< sidenote "right" "errors-note" "leaves room for errors," >}}
|
|||
|
Here's a question to ponder: how come this error has gone unnoticed? Surely
|
|||
|
people used this page to learn <code>Ltac</code>. I believe that part of the reason is
|
|||
|
pattern matching: an experienced reader will recognize the "precedence trick", and quickly
|
|||
|
scan the grammar levels to estblish precedence. The people writing and proofreading this
|
|||
|
documentation likely read it this way, too.
|
|||
|
{{< /sidenote >}}
|
|||
|
errors that _actually appear in practice_.
|
|||
|
{{< sidenote "left" "achilles-note" "We have a very precise, yet incorrect definition." >}}
|
|||
|
Amusingly, I think this is very close to what is considered the achilles heel of formal verification:
|
|||
|
software that precisely adheres to an incorrect or incomplete specification.
|
|||
|
{{< /sidenote >}}
|
|||
|
Somehow, "a sequence of statements separated by a semicolon" seems like a simpler explanation.
|