Start on a draft about Agda and Hugo
Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
This commit is contained in:
parent
04f12b545d
commit
6a168f2fe1
256
content/blog/agda_hugo.md
Normal file
256
content/blog/agda_hugo.md
Normal file
|
@ -0,0 +1,256 @@
|
|||
---
|
||||
title: "Integrating Agda's HTML Output with Hugo"
|
||||
date: 2024-05-25T21:02:10-07:00
|
||||
draft: true
|
||||
tags: ["Agda", "Hugo", "Ruby", "Nix"]
|
||||
---
|
||||
|
||||
One of my favorite things about Agda are its clickable HTML pages. If you don't
|
||||
know what they are, that's pages like [`Data.List.Properties`](https://agda.github.io/agda-stdlib/master/Data.List.Properties.html);
|
||||
they just give the code from a particular Agda file, but make every identifier
|
||||
clickable. Then, if you see some variable or function that you don't know, you
|
||||
can just click it and jump right to it! It makes exploring the documentation
|
||||
a lot smoother. I've found that these HTML pages provide all the information
|
||||
I need for writing proofs.
|
||||
|
||||
Recently, I've been writing a fair bit about Agda; mostly about the patterns
|
||||
that I've learned about, such as the ["is something" pattern]({{< relref "agda_is_pattern" >}})
|
||||
and the ["deeply embedded expression" trick]({{< relref "agda_expr_pattern" >}}).
|
||||
I've found myself wanting to click on definitions in my own code blocks; recently,
|
||||
I got this working, and I wanted to share how I did it, in case someone else
|
||||
wants to integrate Agda into their own static website. Though my stack
|
||||
is based on Hugo, the general idea should work with any other static site
|
||||
generator.
|
||||
|
||||
### TL;DR and Demo
|
||||
|
||||
I wrote a script to transfer links from an Agda HTML file into Hugo's HTML
|
||||
output, making it possible to embellish "plain" Hugo output with Agda's
|
||||
'go-to-definition links'. It looks like this. Here's an Agda code block
|
||||
defining an 'expression' data type, from a project of mine:
|
||||
|
||||
{{< codelines "Agda" "agda-spa/Lattice/Map.agda" 543 546 >}}
|
||||
|
||||
And here's the denotational semantics for that expression:
|
||||
|
||||
{{< codelines "Agda" "agda-spa/Lattice/Map.agda" 586 589 >}}
|
||||
|
||||
Notice that you can click `Expr`, `_∪_`, `⟦`, etc.! All of this integrates
|
||||
with my existing Hugo site, and only required a little bit of additional
|
||||
metadata to make it work.
|
||||
|
||||
Now, the details. Right now, the solution is pretty tailored to my site
|
||||
and workflow, but the core of the script -- a piece that transfers links
|
||||
from an Agda HTML file into a syntax highlighted Hugo HTML block -- should
|
||||
be fairly reusable.
|
||||
|
||||
### The Constraints
|
||||
The goal was simple: to allow the code blocks on my Hugo-generated site to
|
||||
have links that take the user to the definition of a given symbol.
|
||||
Specifically, if the symbol occurs somewhere on the same blog page, the link
|
||||
should take the user there (and not to a regular `Module.html` file). That
|
||||
way, the reader can not only get to the code that they want to see, but also
|
||||
have a chance to read the surrounding prose in properly-rendered Markdown.
|
||||
|
||||
Next, unlike standard "literate Agda" files, my blog posts are not single
|
||||
`.agda` files with Markdown in comments. Rather, I use regular Hugo
|
||||
Markdown, and present portions of an existing project, weaving together many
|
||||
files, and showing the fragments out of order. So, my tool needs to support
|
||||
links that come from distinct modules, in any order.
|
||||
|
||||
Additionally, I've recently been writing a whole series about an Agda project
|
||||
of mine; in this series, I gradually build up to the final product, explaining
|
||||
one or two modules at a time. I would expect that links on pages in this series
|
||||
could jump to other pages in the same series: if I cover module `A` in part 1,
|
||||
then write `A.f` in part 2, clicking on `A` -- and maybe `f` -- should take
|
||||
the reader back to the first part's page; once again, this would help provide
|
||||
them with the surrounding explanation.
|
||||
|
||||
Finally, I wanted the Agda code to appear exactly the same as any other code
|
||||
on my site, including the Hugo-provided syntax highlighting and theme. This
|
||||
ruled out just copy-pasting pieces of the Agda-generated HTML in place of
|
||||
code blocks on my page (and redirecting the links). Thought it was not
|
||||
a hard requirement, I also hoped to include Agda code in the same
|
||||
manner that I include all other code: [my `codelines` shortcode]({{< relref "codelines" >}}).
|
||||
In brief, the `codelines` shortcode creates a syntax-highlighted code block,
|
||||
as well as a surrounding "context" that says what file the code is from,
|
||||
which lines are listed, and where to find the full code (e.g., on my Git server).
|
||||
It looks something like this:
|
||||
|
||||
{{< codelines "Agda" "agda-spa/Language.agda" 20 27 >}}
|
||||
|
||||
In summary:
|
||||
|
||||
1. I want to create cross-links between symbols in Agda blocks in a blog post.
|
||||
2. These code blocks could include code from disjoint files, and be out of order.
|
||||
3. Code blocks among a whole series of posts should be cross-linked too.
|
||||
4. The code blocks should be syntax highlighted the same way as the rest of the
|
||||
code on the site.
|
||||
5. Ideally, I should be able to use my regular method for referencing code.
|
||||
|
||||
I've hit all of these requirements; now it's time to dig into how I got there.
|
||||
|
||||
### Implementation
|
||||
|
||||
It's pretty much a no-go to try to resolve Agda from Hugo, or perform some
|
||||
sort of "heuristic" to detect cross-links. Agda is a very complex programming
|
||||
language, and Hugo's templating engine, though powerful, is just not
|
||||
up to this task. Fortunately, Agda has support for
|
||||
[HTML output using the `--html` flag](https://agda.readthedocs.io/en/v2.6.4.3-r1/tools/generating-html.html).
|
||||
As a build step, I can invoke Agda on files that are referenced by my blog,
|
||||
and generate HTML. This would decidedly slow down the site build process,
|
||||
but it would guarantee accurate link information.
|
||||
|
||||
|
||||
On the other hand, to satisfy the 4th constraint, I need to somehow mimic --
|
||||
or keep -- the format of Hugo's existing HTML output. The easiest way to
|
||||
do this without worrying about breaking changes and version incompatibility
|
||||
is to actually use the existing syntax-highlighted HTML, and annotate it
|
||||
with links as I discover them. Effectively, what I need to do is a "link
|
||||
transfer": I need to identify regions of code that are highlighted in Agda's HTML,
|
||||
find those regions in Hugo's HTML output, and mark them with links. In addition,
|
||||
I'll need to fix up the links themselves: the HTML output assumes that each
|
||||
Agda file is its own HTML page, but this is ruled out by the second constraint.
|
||||
|
||||
As a little visualization, the overall problems looks something like this:
|
||||
|
||||
````Agda {linenos=table}
|
||||
-- Agda's HTML output (blocks of 't' are links):
|
||||
-- |tttttt| |tttt| |t| |t| |ttttt|
|
||||
module ModX ( x : T ) where
|
||||
-- |tttttt| |tt|t| |t| |t| |ttttt|
|
||||
-- Hugo's HTML output (blocks of 't' are syntax highlighting spans)
|
||||
````
|
||||
|
||||
Both Agda and Hugo output a preformatted code block, decorated with various
|
||||
inline HTMl that indicates information (token color for Hugo; symbol IDs and
|
||||
links in Agda). However, Agda and Hugo do not use the same process to create
|
||||
this decorated output; it's entirely possible -- and not uncommon -- for
|
||||
Hugo and Agda to produce misaligned HTML nodes. In my diagram above,
|
||||
this is reflected as `ModX` being considered a single token by Agda, but
|
||||
two tokens (`Mod` and `X`) by the syntax highlighter. As a result, it's
|
||||
difficult to naively iterate the two HTML formats in parallel.
|
||||
|
||||
What I ended up doing is translating Agda's HTML output into offsets and data
|
||||
about the code block's _plain text_ -- the source code being decorated.
|
||||
Both the Agda and Hugo HTML describe the same code; thus, the plain text
|
||||
is the common denominator between the two.
|
||||
|
||||
I wrote a Ruby script to extract the decorations from the Agda output; here
|
||||
it is in slightly abridged form. You can find the [original `agda.rb` file here](https://dev.danilafe.com/Web-Projects/blog-static/src/commit/04f12b545d5692a78b1a2f13ef968417c749e295/agda.rb).
|
||||
|
||||
```Ruby
|
||||
# Traverse the preformatted Agda block in the given Agda HTML file
|
||||
# and find which textual ranges have IDs and links to other ranges.
|
||||
# Store this information in a hash, line => links[]
|
||||
def process_agda_html_file(file)
|
||||
document = Nokogiri::HTML.parse(File.open(file))
|
||||
pre_code = document.css("pre.Agda")[0]
|
||||
|
||||
# The traversal postorder; we always visit children before their
|
||||
# parents, and we visit leaves in sequence.
|
||||
line_infos = []
|
||||
offset = 0 # Column index within the current Agda source code line
|
||||
line = 1
|
||||
pre_code.traverse do |at|
|
||||
# Text nodes are always leaves; visiting a new leaf means we've advanced
|
||||
# in the text by the length of that text. However, if there are newlines
|
||||
# -- since this is a preformatted block -- we also advanced by a line.
|
||||
# At this time, do not support links that span multiple lines, but
|
||||
# Agda doesn't produce those either.
|
||||
if at.text?
|
||||
if at.content.include? "\n"
|
||||
raise "no support for links with newlines inside" if at.parent.name != "pre"
|
||||
|
||||
# Increase the line and track the final offset. Written as a loop
|
||||
# in case we eventually want to add some handling for the pieces
|
||||
# sandwiched between newlines.
|
||||
at.content.split("\n", -1).each_with_index do |bit, idx|
|
||||
line += 1 unless idx == 0
|
||||
offset = bit.length
|
||||
end
|
||||
else
|
||||
# It's not a newline node. Just adjust the offset within the plain text.
|
||||
offset += at.content.length
|
||||
end
|
||||
elsif at.name == "a"
|
||||
# Agda emits both links and things-to-link-to as 'a' nodes.
|
||||
|
||||
line_info = line_infos.fetch(line) { line_infos[line] = [] }
|
||||
href = at.attribute("href")
|
||||
id = at.attribute("id")
|
||||
if href or id
|
||||
new_node = { :from => offset-at.content.length, :to => offset }
|
||||
new_node[:href] = href if href
|
||||
new_node[:id] = id if id
|
||||
|
||||
line_info << new_node
|
||||
end
|
||||
end
|
||||
end
|
||||
return line_infos
|
||||
end
|
||||
```
|
||||
|
||||
This script takes an Agda HTML file and returns a map in which each line
|
||||
of the Agda source code is associated with a list of ranges; the ranges
|
||||
indicate links or places that can be linked to. For example, for the `ModX`
|
||||
example above, the script might produce:
|
||||
|
||||
```Ruby
|
||||
3 => [
|
||||
{ :from => 3, :to => 9, id => "..." }, # Agda creates <a> nodes even for keywords.
|
||||
{ :from => 12, :to => 16, id => "ModX-id" }, # The IDs Agda generates aren't usually this nice.
|
||||
{ :from => 20, :to => 21, id => "x-id" },
|
||||
]
|
||||
```
|
||||
|
||||
|
||||
{{< todo >}}This isn't as important probably, but might be worth talking about. {{< /todo >}}
|
||||
|
||||
The most challenging step is probably to identify the Agda "projects" that need
|
||||
to be built. Since different articles have different modules (possibly with
|
||||
the same name), I would need to keep them separate. Also, I'm not ruling
|
||||
out the possibility of one project including another as a submodule. To
|
||||
make this work, I wrote a little Ruby script to find all Agda files,
|
||||
guess their project folder, and invoke the Agda compiler there. It boils
|
||||
down to something like this:
|
||||
|
||||
```Ruby
|
||||
# For each Agda file, find the most specific project / subproject to which
|
||||
# it belongs.
|
||||
files_for_paths = {}
|
||||
Dir.glob("**/*.agda", base: root_path) do |agda_file|
|
||||
best_path = max_path.call(agda_file)
|
||||
files_for_path = files_for_paths.fetch(best_path) do
|
||||
files_for_paths[best_path] = []
|
||||
end
|
||||
|
||||
# Strip the project prefix from the Agda file's path, since
|
||||
# Agda compiler will be invoked in the project's folder.
|
||||
files_for_path << agda_file[best_path.length + File::SEPARATOR.length..-1]
|
||||
end
|
||||
|
||||
original_wd = Dir.getwd
|
||||
files_for_paths.each do |path, files|
|
||||
# There might be a cleaner way of doing this, but it's convenient.
|
||||
Dir.chdir(original_wd)
|
||||
Dir.chdir(File.join(root_path, path))
|
||||
|
||||
# Wherever the target directory is, create a folder that corresponds to
|
||||
# the project being built, to avoid "cross-contaminating" the output
|
||||
# folder with distinct modules with the same name.
|
||||
html_dir = File.join [target_dir, path, "html"]
|
||||
FileUtils.mkdir_p html_dir
|
||||
|
||||
# Just shell out to Agda using the --html folder.
|
||||
files.each do |file|
|
||||
command = "#{ARGV[0]} --local-interfaces #{file} --html --html-dir=#{html_dir}"
|
||||
puts command
|
||||
puts `#{command}`
|
||||
end
|
||||
end
|
||||
```
|
||||
|
||||
In short, it traverses all the folders in my `code` directory -- which is where
|
||||
I keep my code, looking for Agda source files. Once it finds them,
|
Loading…
Reference in New Issue
Block a user