---
title: "Thoughts on Better Explanations"
date: 2019-10-12T00:33:02-07:00
tags: ["Language Server Protocol"]
---
How do you explain how to write a program?
Instructional material is becoming more and more popular on the web, with
thousands of programming tutorials for languages, frameworks,
and technologies created on YouTube, Medium, and people's
personal sites. And yet, there seems to be little standardization or
progress towards an "effective" way of teaching. Everyone is pasting code
examples, showing gists, or even sharing whole projects on GitHub.

When I was writing the earliest posts on this site, I did the same.
Write some code, copy paste it, be done. Write some code, link it,
be done. If I'm feeling fancy, write some code, gist it, be done.
Code presented in this way can easily
become outdated and stop working.

I discovered a whole new perspective when going through
[Software Foundations](https://softwarefoundations.cis.upenn.edu/). What's
different about that book is that the line between source code and instructional
text is blurred - the HTML is generated from the comments in the Coq file, and
code from the Coq file is included as snippets in the book. Rather than
having readers piece together the snippets from the HTML, it simply directed
them to the Coq file from which the page was generated. It maintained
both the benefits of a live code example, and of a textbook written to teach,
not to simply explain what the code does.

This is reminiscent of [Literate Programming](https://en.wikipedia.org/wiki/Literate_programming),
a style of programming in which an explanation of the program is presented in human-oriented order,
with code as supporting material. Tools such as CWEB implement Literate Programming, allowing
users to write files that are then converted into C source, and can be compiled as usual. I was intrigued
by the idea, but in all honesty, found it lacking.

For one, there is the problem of an extra processing step. Compilers are written to compile C, not
CWEB files. Thus, a program must take CWEB source and convert it to C, and then a compiler must
convert the C code to machine language. This doesn't feel elegant - you're effectively
stripping the CWEB source files of the text you added to them. In technical terms, it's not really
that big of an issue - software build systems already support multiple processing steps,
and it would be hard to write a piece of software in CWEB large enough for the intermediate step to cause problems.

Another issue is the lack of universality. CWEB is specialized for C. WEB, the original literate programming
tool, is specialized for Pascal. There are tools that are language agnostic, of course, such as noweb. But
the [Wikipedia page for noweb](https://en.wikipedia.org/wiki/Noweb) drops this bomb:
> noweb defines a specific file format and a file is likely to interleave three different formats
> (noweb, latex and the language used for the software). This is not recognised by other software development
> tools and consequently using noweb excludes the use of UML or code documentation tools.

This may be the worst trade deal in the history of trade deals, maybe ever! By trying to explain how our
code works, __we sacrifice all other tooling__. Worse, because Literate Programming encourages presenting
code in fragments and out of order, it is particularly difficult to reason about programs in an automated
setting.
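
To make the "fragments and out of order" point concrete, here is a toy sketch in Python of the "tangle" step that a literate programming tool performs. It loosely follows noweb's chunk notation (`<<name>>=` opens a code chunk, a lone `@` closes it, `<<*>>` is the root); the document contents and the function names `parse_chunks` and `tangle` are made up for illustration, and real tools handle far more cases than this.

```python
import re

# A noweb-style document: prose, then named code chunks introduced by
# "<<name>>=" and closed by a line containing only "@". The chunks appear
# in explanation order, not in the order a C compiler needs.
DOCUMENT = """\
First, the interesting part: what the program prints.
<<print greeting>>=
puts("hello");
@
Only now do we show the boilerplate that surrounds it.
<<*>>=
#include <stdio.h>
int main(void) {
    <<print greeting>>
    return 0;
}
@
"""

CHUNK_DEF = re.compile(r"^<<(.+)>>=$")
CHUNK_REF = re.compile(r"^(\s*)<<(.+)>>$")


def parse_chunks(text):
    """Collect the body lines of every named code chunk, skipping prose."""
    chunks, current = {}, None
    for line in text.splitlines():
        definition = CHUNK_DEF.match(line)
        if definition:
            current = chunks.setdefault(definition.group(1), [])
        elif line == "@":
            current = None
        elif current is not None:
            current.append(line)
    return chunks


def tangle(chunks, name="*", indent=""):
    """Expand chunk references recursively, starting from the root chunk."""
    out = []
    for line in chunks[name]:
        reference = CHUNK_REF.match(line)
        if reference:
            out.extend(tangle(chunks, reference.group(2), indent + reference.group(1)))
        else:
            out.append(indent + line)
    return out


print("\n".join(tangle(parse_chunks(DOCUMENT))))
```

Only the tangled output is valid C. A linter, syntax highlighter, or documentation generator pointed at the original file sees prose and chunk markers it has no way to interpret.
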

When I present code to a reader, I want to write it with the use of existing tooling. I want my syntax
highlighting. I want my linting. I want my build system. And in the same way, a user who is reading
my code wants to be able to view it, change it, experiment with it. On top of that, I want
to be able to guide the reader's attention. Text-in-comments works great for Coq, but other languages like
C++, in which the order of declarations matters, may not be as well suited to such an approach.

In essence, I want:
* The power of language-specific tooling, without having to extend the tooling itself
* A universal way of describing a program in any language
* A way of maintaining synchrony between the explanation and the source

I have an idea for a piece of software that can do such a thing.

### A Language Server Based Tool
It is a well-known problem that different editors support different languages
with mixed success. The idea of the Language Server Protocol is to put
a program (the server) in charge of making sense of the code, and then
communicate the results to an editor. The editor, in that case,
doesn't have to do as much heavy lifting, and instead just queries
the language server when it needs information.
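
As a rough illustration of what "querying the language server" looks like on the wire, the Python sketch below builds one such query. Language Server Protocol messages are JSON-RPC payloads preceded by a `Content-Length` header, and `textDocument/documentSymbol` is the request that asks for the symbols defined in a file. The helper name `frame_lsp_message` and the file URI are assumptions made up for this example, and a real session would need an `initialize` exchange before this request.

```python
import json


def frame_lsp_message(payload):
    """Serialize a JSON-RPC payload with the Content-Length header LSP requires."""
    body = json.dumps(payload).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


# Ask the server for the symbols (functions, classes, ...) in one file.
# The file URI is a made-up example.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "textDocument/documentSymbol",
    "params": {"textDocument": {"uri": "file:///project/src/main.c"}},
}

print(frame_lsp_message(request).decode("utf-8"))
```

The editor (or any other client) writes these bytes to the server's standard input or socket and reads a response framed the same way.
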

While this technology is used for text editors, I think it can
be adapted to educational texts that reference a particular
codebase. I envision the following workflow:
1. An author writes their tutorial/book/blog post
in their markup language of choice (such as Markdown).
2. They reference a fragment of code (a function, a variable)
through a specialized syntax.
3. When the HTML/LaTeX output is created, a language server
is started. The language server uses information from
the references in step 2 to insert code fragments into
the generated output.
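
The workflow above can be sketched in a few lines of Python. Everything here is hypothetical: the `{{code path:symbol}}` reference syntax is invented for the example, and `find_symbol` uses Python's own `ast` module as a stand-in for a real language server query, so this version only understands Python sources. An actual tool would ask the language server where the symbol lives instead.

```python
import ast
import re

# Hypothetical reference syntax, invented for this sketch: {{code path:symbol}}
REFERENCE = re.compile(r"\{\{code (\S+):(\w+)\}\}")


def find_symbol(source, name):
    """Stand-in for a language server query: return the text of a top-level
    function or class called `name`, located with Python's own parser."""
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.ClassDef)) and node.name == name:
            return ast.get_source_segment(source, node)
    raise KeyError(name)


def render(markdown, sources):
    """Replace every reference with a fenced snippet pulled from `sources`,
    a mapping of path -> current file contents."""
    def expand(match):
        path, symbol = match.groups()
        snippet = find_symbol(sources[path], symbol)
        return f"~~~python\n{snippet}\n~~~"
    return REFERENCE.sub(expand, markdown)


# Made-up codebase and post, kept in memory so the sketch is self-contained.
sources = {"shapes.py": "import math\n\ndef area(r):\n    return math.pi * r ** 2\n"}
post = "The area of a circle is computed like so:\n\n{{code shapes.py:area}}\n"
print(render(post, sources))
```

Because the snippet is looked up by name at render time, editing `area` in the codebase changes the generated output on the next run without touching the text of the post.
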

After each "conversion" of source text to HTML/LaTeX, the
code in the generated snippets will be in sync with the codebase.
At the same time, changing the source text will not require changing
the source files. Finally, since language servers exist for most
established languages, this system can work nearly out of the box,
and even be added to established projects with no changes to the projects
themselves.

Of course, this is just a rough idea. I'm not sure how plausible it is
to include snippets using the Language Server Protocol. But
I certainly would like to try!