5.4 KiB
title | date | expirydate | draft | tags | |
---|---|---|---|---|---|
I Don't Like Coq's Documentation | 2021-11-24T21:48:59-08:00 | 2021-11-24T21:48:59-08:00 | true |
|
Recently, I wrote an article on [Formalizing Dawn's Core Calculus in Coq]({{< relref "./coq_dawn.md" >}}).
One of the proofs (specifically, correctness of \text{quote}_3
) was the best candidate I've ever
encountered for proof automation. I knew that proof automation was possible from the second book of Software
Foundations, Programming Language Foundations.
I went there to learn more, and started my little journey in picking up Coq's Ltac2
.
Before I go any further, let me say that I'd self-describe as an "advanced beginner" in Coq, maybe "intermediate"
on a good day. I am certainly far from a master. I will also say that I am quite young, and thus possibly spoiled
by the explosion of well-documented languages, tools, and libraries. I don't frequently check man
pages, and I don't
often read straight up technical manuals. Maybe the fault lies with me.
Nevertheles, I feel like I am where I am in the process of learning Coq
in part because of the state of its learning resources.
As a case study, let's take my attempt to automate away a pretty simple proof.
Grammars instead of Examples
I did not specifically remember the chapter of Software Foundation in which tactics were introduced.
Instead of skimming through chapters until I found it, I tried to look up "coq custom tactic". The
first thing that comes up is the page for Ltac
.
After a brief explanation of what Ltac
is, the documentation jumps straight into the grammar of the entire
tactic language. Here' a screenshot of what that looks like:
{{< figure src="ltac_grammar.png" caption="The grammar of Ltac from the Coq page." class="large" alt="A grammar for the Ltac
language within Coq. The grammar is in Backus–Naur form, and has 9 nonterminals." >}}
Good old Backus-Naur form. Now, my two main areas of interest are programming language theory and compilers, and so I'm no stranger to BNF. In fact, the first time I saw such a documentation page (most pages describing Coq language feature have some production rules), I didn't even consciously process that I was looking at grammar rules. However, and despite CompCert (a compiler) being probably the most well known example of a project in Coq, I don't think that Coq is made just for people familiar with PLT or compilers. I don't think it should be expected for the average newcomer to Coq (or the average beginner-ish person like me) to know how to read production rules, or to know what terminals and nonterminals are. Logical Founadtions sure managed to explain Gallina without resorting to BNFs.
And even if I, as a newcomer, know what BNF is, and how to read the rules, there's another layer to this specification:
the precedence of various operators is encoded in the BNF rules. This is a pretty common pattern
for writing down programming language grammars; for each level of operator precedence, there's another
nonterminal. We have ltac_expr4
for sequencing (the least binding operator), and ltac_expr3
for "level 3 tactics",
ltac_expr2
for addition, logical "or", and "level 2 tactics". The creators of this documentation page clearly knew
what they were getting at here, and I've seen this pattern enough times to recognize it right away. But if you
haven't seen this pattern before (and why should you have?), you'll need to take some time to decipher the rules.
That's time that you'd rather spend trying to write your tactic.
The page could just as well have mentioned the types of constructs in Ltac
, and given a table of their relative precedence.
This could be an opportunity to give an example of what a program in Ltac
looks like. Instead, despite having seen all of these nonterminals,
I still don't have an image in my mind's eye of what the language looks like. And better yet, I think that the grammar's incorrect:
ltac_expr4 ::= ltac_expr3 ; ltac_expr3|binder_tactic
The way this is written, there's no way to sequence (using a semicolon) more than two things. The semicolon only occurs on level four, and both nonterminals in this rule are level three. However, Coq is perfectly happy to accept the following:
Ltac test := auto; auto; auto.
In the Ltac2
grammar, this is written the way I'd expect:
ltac2_expr ::= ltac2_expr5 ; ltac2_expr
Let's do a quick recap. We have an encoding that requires a degree of experience with grammars and
programming languages to be useful to the reader, and this encoding
{{< sidenote "right" "errors-note" "leaves room for errors," >}}
Here's a question to ponder: how come this error has gone unnoticed? Surely
people used this page to learn Ltac
. I believe that part of the reason is
pattern matching: an experienced reader will recognize the "precedence trick", and quickly
scan the grammar levels to estblish precedence. The people writing and proofreading this
documentation likely read it this way, too.
{{< /sidenote >}}
errors that actually appear in practice.
{{< sidenote "left" "achilles-note" "We have a very precise, yet incorrect definition." >}}
Amusingly, I think this is very close to what is considered the achilles heel of formal verification:
software that precisely adheres to an incorrect or incomplete specification.
{{< /sidenote >}}
Somehow, "a sequence of statements separated by a semicolon" seems like a simpler explanation.