84 lines
		
	
	
		
			5.4 KiB
		
	
	
	
		
			Markdown
		
	
	
	
	
	
			
		
		
	
	
			84 lines
		
	
	
		
			5.4 KiB
		
	
	
	
		
			Markdown
		
	
	
	
	
	
| ---
 | ||
| title: "I Don't Like Coq's Documentation"
 | ||
| date: 2021-11-24T21:48:59-08:00
 | ||
| draft: true
 | ||
| tags: ["Coq"]
 | ||
| ---
 | ||
| 
 | ||
| Recently, I wrote an article on [Formalizing Dawn's Core Calculus in Coq]({{< relref "./coq_dawn.md" >}}).
 | ||
| One of the proofs (specifically, correctness of \\(\\text{quote}_3\\)) was the best candidate I've ever
 | ||
| encountered for proof automation. I knew that proof automation was possible from the second book of Software
 | ||
| Foundations, [Programming Language Foundations](https://softwarefoundations.cis.upenn.edu/plf-current/index.html).
 | ||
| I went there to learn more, and started my little journey in picking up Coq's `Ltac2`.
 | ||
| 
 | ||
| Before I go any further, let me say that I'd self-describe as an "advanced beginner" in Coq, maybe "intermediate"
 | ||
| on a good day. I am certainly far from a master. I will also say that I am quite young, and thus possibly spoiled
 | ||
| by the explosion of well-documented languages, tools, and libraries. I don't frequently check `man` pages, and I don't
 | ||
| often read straight up technical manuals. Maybe the fault lies with me.
 | ||
| Nevertheles, I feel like I am where I am in the process of learning Coq
 | ||
| in part because of the state of its learning resources.
 | ||
| As a case study, let's take my attempt to automate away a pretty simple proof.
 | ||
| 
 | ||
| ### Grammars instead of Examples
 | ||
| I did not specifically remember the chapter of Software Foundation in which tactics were introduced.
 | ||
| Instead of skimming through chapters until I found it, I tried to look up "coq custom tactic". The
 | ||
| first thing that comes up is the page for `Ltac`.
 | ||
| 
 | ||
| After a brief explanation of what `Ltac` is, the documentation jumps straight into the grammar of the entire
 | ||
| tactic language. Here' a screenshot of what that looks like:
 | ||
| 
 | ||
| {{< figure src="ltac_grammar.png" caption="The grammar of Ltac from the Coq page." class="large" alt="A grammar for the `Ltac` language within Coq. The grammar is in Backus–Naur form, and has 9 nonterminals." >}}
 | ||
| 
 | ||
| Good old Backus-Naur form. Now, my two main areas of interest are programming language theory and compilers, and so I'm no stranger to BNF. In fact, the first
 | ||
| time I saw such a documentation page (most pages describing Coq language feature have some production rules), I didn't even consciously process that I was looking
 | ||
| at grammar rules. However, and despite CompCert (a compiler) being probably the most well known example of a project in Coq, I don't think that Coq is made _just_
 | ||
| for people familiar with PLT or compilers. I don't think it should be expected for the average newcomer to Coq (or the average beginner-ish person like me) to know how to read production rules, or to know
 | ||
| what terminals and nonterminals are. Logical Founadtions sure managed to explain Gallina without resorting to BNFs.
 | ||
| 
 | ||
| And even if I, as a newcomer, know what BNF is, and how to read the rules, there's another layer to this specification:
 | ||
| the precedence of various operators is encoded in the BNF rules. This is a pretty common pattern
 | ||
| for writing down programming language grammars; for each level of operator precedence, there's another
 | ||
| nonterminal. We have `ltac_expr4` for sequencing (the least binding operator), and `ltac_expr3` for "level 3 tactics",
 | ||
| `ltac_expr2` for addition, logical "or", and "level 2 tactics". The creators of this documentation page clearly knew
 | ||
| what they were getting at here, and I've seen this pattern enough times to recognize it right away. But if you
 | ||
| _haven't_ seen this pattern before (and why should you have?), you'll need to take some time to decipher the rules.
 | ||
| That's time that you'd rather spend trying to write your tactic.
 | ||
| 
 | ||
| The page could just as well have mentioned the types of constructs in `Ltac`, and given a table of their relative precedence.
 | ||
| This could be an opportunity to give an example of what a program in `Ltac` looks like. Instead, despite having seen all of these nonterminals,
 | ||
| I still don't have an image in my mind's eye of what the language looks like. And better yet, I think that the grammar's incorrect:
 | ||
| 
 | ||
| ```
 | ||
| ltac_expr4 ::= ltac_expr3 ; ltac_expr3|binder_tactic
 | ||
| ```
 | ||
| 
 | ||
| The way this is written, there's no way to sequence (using a semicolon) more than two things. The semicolon
 | ||
| only occurs on level four, and both nonterminals in this rule are level three. However, Coq is perfectly happy
 | ||
| to accept the following:
 | ||
| 
 | ||
| ```Coq
 | ||
| Ltac test := auto; auto; auto.
 | ||
| ```
 | ||
| 
 | ||
| In the `Ltac2` grammar, this is written the way I'd expect:
 | ||
| 
 | ||
| ```
 | ||
| ltac2_expr ::= ltac2_expr5 ; ltac2_expr
 | ||
| ```
 | ||
| 
 | ||
| Let's do a quick recap. We have an encoding that requires a degree of experience with grammars and
 | ||
| programming languages to be useful to the reader, _and_ this encoding
 | ||
| {{< sidenote "right" "errors-note" "leaves room for errors," >}}
 | ||
| Here's a question to ponder: how come this error has gone unnoticed? Surely
 | ||
| people used this page to learn <code>Ltac</code>. I believe that part of the reason is
 | ||
| pattern matching: an experienced reader will recognize the "precedence trick", and quickly
 | ||
| scan the grammar levels to estblish precedence. The people writing and proofreading this
 | ||
| documentation likely read it this way, too.
 | ||
| {{< /sidenote >}}
 | ||
| errors that _actually appear in practice_.
 | ||
| {{< sidenote "left" "achilles-note" "We have a very precise, yet incorrect definition." >}}
 | ||
| Amusingly, I think this is very close to what is considered the achilles heel of formal verification:
 | ||
| software that precisely adheres to an incorrect or incomplete specification.
 | ||
| {{< /sidenote >}}
 | ||
| Somehow, "a sequence of statements separated by a semicolon" seems like a simpler explanation.
 |