Start on a draft about Agda and Hugo
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
parent 04f12b545d
commit 6a168f2fe1

256 content/blog/agda_hugo.md Normal file
@@ -0,0 +1,256 @@
---
title: "Integrating Agda's HTML Output with Hugo"
date: 2024-05-25T21:02:10-07:00
draft: true
tags: ["Agda", "Hugo", "Ruby", "Nix"]
---

One of my favorite things about Agda is its clickable HTML pages. If you don't
know what they are, they're pages like [`Data.List.Properties`](https://agda.github.io/agda-stdlib/master/Data.List.Properties.html);
they just give the code from a particular Agda file, but make every identifier
clickable. Then, if you see some variable or function that you don't know, you
can just click it and jump right to its definition! It makes exploring the
documentation a lot smoother. I've found that these HTML pages provide all the
information I need for writing proofs.

Recently, I've been writing a fair bit about Agda, mostly about the patterns
that I've learned, such as the ["is something" pattern]({{< relref "agda_is_pattern" >}})
and the ["deeply embedded expression" trick]({{< relref "agda_expr_pattern" >}}).
I've found myself wanting to click on definitions in my own code blocks. I've
now got this working, and I want to share how I did it, in case someone else
wants to integrate Agda into their own static website. Though my stack
is based on Hugo, the general idea should work with any other static site
generator.

### TL;DR and Demo

I wrote a script to transfer links from an Agda HTML file into Hugo's HTML
output, making it possible to embellish "plain" Hugo output with Agda's
"go-to-definition" links. It looks like this. Here's an Agda code block
defining an 'expression' data type, from a project of mine:

{{< codelines "Agda" "agda-spa/Lattice/Map.agda" 543 546 >}}

And here's the denotational semantics for that expression:

{{< codelines "Agda" "agda-spa/Lattice/Map.agda" 586 589 >}}

Notice that you can click `Expr`, `_∪_`, `⟦`, etc.! All of this integrates
with my existing Hugo site, and only required a little bit of additional
metadata to make it work.

Now, the details. At the moment, the solution is pretty tailored to my site
and workflow, but the core of the script -- a piece that transfers links
from an Agda HTML file into a syntax-highlighted Hugo HTML block -- should
be fairly reusable.

### The Constraints
The goal was simple: to allow the code blocks on my Hugo-generated site to
have links that take the user to the definition of a given symbol.
Specifically, if the symbol occurs somewhere on the same blog page, the link
should take the user there (and not to a regular `Module.html` file). That
way, the reader can not only get to the code that they want to see, but also
have a chance to read the surrounding prose in properly-rendered Markdown.

Next, unlike standard "literate Agda" files, my blog posts are not single
`.agda` files with Markdown in comments. Rather, I use regular Hugo
Markdown, and present portions of an existing project, weaving together many
files, and showing the fragments out of order. So, my tool needs to support
links that come from distinct modules, in any order.

Additionally, I've recently been writing a whole series about an Agda project
of mine; in this series, I gradually build up to the final product, explaining
one or two modules at a time. I'd like links on pages in this series to be
able to jump to other pages in the same series: if I cover module `A` in part 1,
then write `A.f` in part 2, clicking on `A` -- and maybe `f` -- should take
the reader back to the first part's page; once again, this would help provide
them with the surrounding explanation.

Finally, I wanted the Agda code to appear exactly the same as any other code
on my site, including the Hugo-provided syntax highlighting and theme. This
ruled out just copy-pasting pieces of the Agda-generated HTML in place of
code blocks on my page (and redirecting the links). Though it was not
a hard requirement, I also hoped to include Agda code in the same
manner that I include all other code: [my `codelines` shortcode]({{< relref "codelines" >}}).
In brief, the `codelines` shortcode creates a syntax-highlighted code block,
as well as a surrounding "context" that says what file the code is from,
which lines are listed, and where to find the full code (e.g., on my Git server).
It looks something like this:

{{< codelines "Agda" "agda-spa/Language.agda" 20 27 >}}

In summary:

1. I want to create cross-links between symbols in Agda blocks in a blog post.
2. These code blocks could include code from disjoint files, and be out of order.
3. Code blocks among a whole series of posts should be cross-linked too.
4. The code blocks should be syntax highlighted the same way as the rest of the
   code on the site.
5. Ideally, I should be able to use my regular method for referencing code.

I've hit all of these requirements; now it's time to dig into how I got there.

### Implementation

It's pretty much a no-go to try to resolve Agda symbols from within Hugo, or
to use some sort of heuristic to detect cross-links. Agda is a very complex
programming language, and Hugo's templating engine, though powerful, is just not
up to this task. Fortunately, Agda has support for
[HTML output using the `--html` flag](https://agda.readthedocs.io/en/v2.6.4.3-r1/tools/generating-html.html).
As a build step, I can invoke Agda on files that are referenced by my blog,
and generate HTML. This would decidedly slow down the site build process,
but it would guarantee accurate link information.

On the other hand, to satisfy the fourth constraint, I need to somehow mimic --
or keep -- the format of Hugo's existing HTML output. The easiest way to
do this without worrying about breaking changes and version incompatibility
is to actually use the existing syntax-highlighted HTML, and annotate it
with links as I discover them. Effectively, what I need to do is a "link
transfer": I need to identify regions of code that are highlighted in Agda's HTML,
find those regions in Hugo's HTML output, and mark them with links. In addition,
I'll need to fix up the links themselves: Agda's HTML output assumes that each
Agda file is its own HTML page, but this is ruled out by the second constraint.
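
As a rough illustration of the link fix-up (a sketch, not the actual script),
the function below rewrites a single `href`. The `VISIBLE_ON_BLOG` table, the
`rewrite_href` name, the fragment naming scheme, and every URL and anchor
number are hypothetical; the only thing assumed about Agda's output is that
its links look like `Module.html#anchor`.

```Ruby
# Hypothetical lookup table: which blog page (if any) displays a given
# definition. Keys are [Agda HTML file, anchor]; values are site URLs.
VISIBLE_ON_BLOG = {
  ["Lattice.Map.html", "1234"] => "/blog/agda_series_part_1/",
}.freeze

# Rewrite one href taken from Agda's HTML so that it points into the blog
# whenever the target definition is displayed there.
def rewrite_href(href, current_page, agda_html_root)
  file, anchor = href.split("#", 2)
  target_page = VISIBLE_ON_BLOG[[file, anchor]]
  fragment = "#{file}-#{anchor}" # made-up naming scheme for in-page anchors

  if target_page.nil?
    # Not shown anywhere on the blog: fall back to the standalone Agda page.
    "#{agda_html_root}/#{href}"
  elsif target_page == current_page
    # Shown on this very page: a same-page fragment link is enough.
    "#" + fragment
  else
    # Shown on another page (e.g. an earlier part of a series).
    target_page + "#" + fragment
  end
end
```

With a table like this, the same Agda link can resolve to the current page, to
another post in a series, or to the plain `Module.html` file, depending on
where the definition is shown.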

As a little visualization, the overall problem looks something like this:

````Agda {linenos=table}
-- Agda's HTML output (blocks of 't' are links):
-- |tttttt| |tttt| |t| |t| |ttttt|
module ModX ( x : T ) where
-- |tttttt| |tt|t| |t| |t| |ttttt|
-- Hugo's HTML output (blocks of 't' are syntax highlighting spans)
````

Both Agda and Hugo output a preformatted code block, decorated with various
inline HTML that carries information (token color for Hugo; symbol IDs and
links for Agda). However, Agda and Hugo do not use the same process to create
this decorated output; it's entirely possible -- and not uncommon -- for
Hugo and Agda to produce misaligned HTML nodes. In my diagram above,
this is reflected as `ModX` being considered a single token by Agda, but
two tokens (`Mod` and `X`) by the syntax highlighter. As a result, it's
difficult to naively iterate the two HTML formats in parallel.

What I ended up doing is translating Agda's HTML output into offsets and data
about the code block's _plain text_ -- the source code being decorated.
Both the Agda and Hugo HTML describe the same code; thus, the plain text
is the common denominator between the two.
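
To see why plain-text offsets help, here is a minimal sketch of a "link
transfer" over one line of code. It is an illustration under simplifying
assumptions, not the real implementation: the highlighted output is modeled as
a flat list of `[text, css_class]` pairs (real Hugo output is nested HTML),
and the `transfer_links` and `escape_html` names and the CSS classes are made
up.

```Ruby
HTML_ESCAPES = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;", '"' => "&quot;" }.freeze

def escape_html(text)
  text.gsub(/[&<>"]/, HTML_ESCAPES)
end

# tokens: [[text, css_class], ...] in source order (a stand-in for the
#         highlighter's spans).
# ranges: [{ :from => Integer, :to => Integer, :href => String }, ...],
#         offsets into the same plain text, :from inclusive, :to exclusive.
def transfer_links(tokens, ranges)
  offset = 0
  tokens.map do |text, css_class|
    token_from = offset
    token_to = offset + text.length
    offset = token_to

    # Cut the token wherever a link starts or ends strictly inside it.
    cuts = [token_from, token_to]
    ranges.each do |range|
      [range[:from], range[:to]].each do |edge|
        cuts << edge if edge > token_from && edge < token_to
      end
    end

    pieces = cuts.uniq.sort.each_cons(2).map do |from, to|
      piece = escape_html(text[(from - token_from)...(to - token_from)])
      link = ranges.find { |range| range[:from] <= from && to <= range[:to] }
      link ? %(<a href="#{link[:href]}">#{piece}</a>) : piece
    end

    %(<span class="#{css_class}">#{pieces.join}</span>)
  end.join
end

# The misaligned case: one link over "ModX", two highlighter tokens.
tokens = [["module", "k"], [" ", "w"], ["Mod", "n"], ["X", "n"]]
ranges = [{ :from => 7, :to => 11, :href => "#ModX" }]
puts transfer_links(tokens, ranges)
```

In this sketch, a link that covers several tokens is emitted as several
adjacent `<a>` elements with the same `href`, one inside each highlighting
span, so the highlighter's structure is left untouched.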

I wrote a Ruby script to extract the decorations from the Agda output; here
it is in slightly abridged form. You can find the [original `agda.rb` file here](https://dev.danilafe.com/Web-Projects/blog-static/src/commit/04f12b545d5692a78b1a2f13ef968417c749e295/agda.rb).

```Ruby
require "nokogiri"

# Traverse the preformatted Agda block in the given Agda HTML file
# and find which textual ranges have IDs and links to other ranges.
# Store this information in a hash, line => links[]
def process_agda_html_file(file)
  document = Nokogiri::HTML.parse(File.open(file))
  pre_code = document.css("pre.Agda")[0]

  # The traversal is postorder; we always visit children before their
  # parents, and we visit leaves in sequence.
  line_infos = {}
  offset = 0 # Column index within the current Agda source code line
  line = 1
  pre_code.traverse do |at|
    # Text nodes are always leaves; visiting a new leaf means we've advanced
    # in the text by the length of that text. However, if there are newlines
    # -- since this is a preformatted block -- we also advanced by a line.
    # At this time, we do not support links that span multiple lines, but
    # Agda doesn't produce those either.
    if at.text?
      if at.content.include? "\n"
        raise "no support for links with newlines inside" if at.parent.name != "pre"

        # Increase the line and track the final offset. Written as a loop
        # in case we eventually want to add some handling for the pieces
        # sandwiched between newlines.
        at.content.split("\n", -1).each_with_index do |bit, idx|
          line += 1 unless idx == 0
          offset = bit.length
        end
      else
        # It's not a newline node. Just adjust the offset within the plain text.
        offset += at.content.length
      end
    elsif at.name == "a"
      # Agda emits both links and things-to-link-to as 'a' nodes.

      line_info = line_infos.fetch(line) { line_infos[line] = [] }
      href = at.attribute("href")
      id = at.attribute("id")
      if href or id
        new_node = { :from => offset-at.content.length, :to => offset }
        new_node[:href] = href if href
        new_node[:id] = id if id

        line_info << new_node
      end
    end
  end
  return line_infos
end
```

This script takes an Agda HTML file and returns a map in which each line
of the Agda source code is associated with a list of ranges; the ranges
indicate links or places that can be linked to. For example, for the `ModX`
example above, the script might produce:

```Ruby
3 => [
  { :from => 3, :to => 9, :id => "..." }, # Agda creates <a> nodes even for keywords.
  { :from => 12, :to => 16, :id => "ModX-id" }, # The IDs Agda generates aren't usually this nice.
  { :from => 20, :to => 21, :id => "x-id" },
]
```
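
A structure like this is easy to query. As a small illustration, here is how
one might find the range that covers a given line and column, treating `:from`
as inclusive and `:to` as exclusive. The `range_at` helper is hypothetical, and
the sample data just repeats the made-up `ModX` example.

```Ruby
# Sample data shaped like the per-line output described above; the IDs are
# invented for the example.
line_infos = {
  3 => [
    { :from => 3,  :to => 9,  :id => "..." },
    { :from => 12, :to => 16, :id => "ModX-id" },
    { :from => 20, :to => 21, :id => "x-id" },
  ],
}

# Return the range on `line` that covers `column`, or nil if there is none.
def range_at(line_infos, line, column)
  (line_infos[line] || []).find do |range|
    range[:from] <= column && column < range[:to]
  end
end

hit = range_at(line_infos, 3, 13)
puts hit[:id] if hit
```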

{{< todo >}}This isn't as important probably, but might be worth talking about. {{< /todo >}}

The most challenging step is probably to identify the Agda "projects" that need
to be built. Since different articles have different modules (possibly with
the same name), I need to keep them separate. Also, I'm not ruling
out the possibility of one project including another as a submodule. To
make this work, I wrote a little Ruby script to find all Agda files,
guess their project folder, and invoke the Agda compiler there. It boils
down to something like this:

```Ruby
require "fileutils"

# Note: root_path, target_dir, and max_path are defined elsewhere in the
# full script; this is only the core loop.

# For each Agda file, find the most specific project / subproject to which
# it belongs.
files_for_paths = {}
Dir.glob("**/*.agda", base: root_path) do |agda_file|
  best_path = max_path.call(agda_file)
  files_for_path = files_for_paths.fetch(best_path) do
    files_for_paths[best_path] = []
  end

  # Strip the project prefix from the Agda file's path, since the
  # Agda compiler will be invoked in the project's folder.
  files_for_path << agda_file[best_path.length + File::SEPARATOR.length..-1]
end

original_wd = Dir.getwd
files_for_paths.each do |path, files|
  # There might be a cleaner way of doing this, but it's convenient.
  Dir.chdir(original_wd)
  Dir.chdir(File.join(root_path, path))

  # Wherever the target directory is, create a folder that corresponds to
  # the project being built, to avoid "cross-contaminating" the output
  # folder with distinct modules with the same name.
  html_dir = File.join [target_dir, path, "html"]
  FileUtils.mkdir_p html_dir

  # Just shell out to Agda using the --html flag.
  files.each do |file|
    command = "#{ARGV[0]} --local-interfaces #{file} --html --html-dir=#{html_dir}"
    puts command
    puts `#{command}`
  end
end
```
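
The snippet above calls `max_path` without showing how it works. One possible
way to write such a helper, assuming the project folders are already known as
a list of paths relative to the code directory, is to pick the longest one
that is a directory prefix of the file. The `PROJECT_ROOTS` list (including the
nested `vendor` path) and the `most_specific_project` name are hypothetical.

```Ruby
# Hypothetical project folders, relative to the directory being scanned.
PROJECT_ROOTS = ["agda-spa", "agda-spa/vendor/other-project"].freeze

# Return the longest project root that contains the file, i.e. the most
# specific project / subproject, or nil if no root matches.
def most_specific_project(agda_file, project_roots)
  project_roots
    .select { |root| agda_file.start_with?(root + File::SEPARATOR) }
    .max_by(&:length)
end

# Wrapped in a lambda so it can be used via `.call`, like `max_path` above.
max_path = ->(agda_file) { most_specific_project(agda_file, PROJECT_ROOTS) }
puts max_path.call("agda-spa/Lattice/Map.agda")
```

Requiring the separator after each root keeps a folder such as
`agda-spa-extra` from being mistaken for part of `agda-spa`.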

In short, the script traverses all the folders in my `code` directory -- which
is where I keep my code -- looking for Agda source files. Once it finds them,