---
title: "Pleasant Code Includes with Hugo"
date: 2021-01-13T21:31:29-08:00
tags: ["Hugo"]
---

Ever since I started [the compiler series]({{< relref "00_compiler_intro.md" >}}),
I've been including more and more fragments of code in my blog.
I didn't want to be copy-pasting my code between my project
and my Markdown files, so I quickly wrote up a Hugo [shortcode](https://gohugo.io/content-management/shortcodes/)
to pull in other files in the local directory. I've since improved on this
some more, so I thought I'd share what I created with others.

### Including Entire Files and Lines
My needs for snippets were modest at first. For the most part,
I had a single code file that I wanted to present, so it was
acceptable to plop it in the middle of my post in one piece.
The shortcode for that was quite simple:

```
{{ highlight (readFile (printf "code/%s" (.Get 1))) (.Get 0) "" }}
```

This leverages Hugo's built-in [`highlight`](https://gohugo.io/functions/highlight/)
function to provide syntax highlighting to the included snippet. Hugo
doesn't guess at the language of the code, so you have to manually provide
it. Calling this shortcode looks as follows:

```
{{</* codeblock "C++" "compiler/03/type.hpp" */>}}
```

Note that this implicitly adds the `code/` prefix to all
the files I include. This is a personal convention: I want
all my code to be inside a dedicated directory.

Of course, including entire files only takes you so far.
What if you only need to discuss a small part of your code?
Alternatively, what if you want to present code piece-by-piece,
in the style of literate programming? I quickly ran into the
need to do this, for which I wrote another shortcode:

```
{{ $s := (readFile (printf "code/%s" (.Get 1))) }}
{{ $t := split $s "\n" }}
{{ if not (eq (int (.Get 2)) 1) }}
{{ .Scratch.Set "u" (after (sub (int (.Get 2)) 1) $t) }}
{{ else }}
{{ .Scratch.Set "u" $t }}
{{ end }}
{{ $v := first (add (sub (int (.Get 3)) (int (.Get 2))) 1) (.Scratch.Get "u") }}
{{ if (.Get 4) }}
{{ .Scratch.Set "opts" (printf ",%s" (.Get 4)) }}
{{ else }}
{{ .Scratch.Set "opts" "" }}
{{ end }}
{{ highlight (delimit $v "\n") (.Get 0) (printf "linenos=table,linenostart=%d%s" (.Get 2) (.Scratch.Get "opts")) }}
```

This shortcode takes a language and a filename as before, but it also takes
the numbers of the first and last lines of the part of the code that should be included. After
splitting the contents of the file into lines, it throws away all lines before and
after the window of code that you want to include. It seems to me (from my commit history)
that Hugo's [`after`](https://gohugo.io/functions/after/) function (which should behave
similarly to Haskell's `drop`) doesn't like to be given an argument of `0`.
I had to add a special case for when this would occur, in which I simply don't invoke `after` at all.
The shortcode can be used as follows:

```
{{</* codelines "C++" "compiler/04/ast.cpp" 19 22 */>}}
```

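If the template syntax gets in the way, the windowing logic boils down to a "drop" followed by a "take". Here's a minimal sketch of the same idea in Ruby. The `select_lines` helper is hypothetical (it isn't part of my shortcodes) and only illustrates the arithmetic:

```Ruby
# Hypothetical helper mirroring the shortcode's windowing logic:
# keep lines first_line..last_line (1-indexed, inclusive).
def select_lines(text, first_line, last_line)
  lines = text.split("\n")
  # Drop everything before the window, then keep only the window itself.
  lines.drop(first_line - 1).first(last_line - first_line + 1)
end

source = "a\nb\nc\nd\ne"
puts select_lines(source, 2, 4).join("\n")
```

Unlike the template, this sketch needs no special case for a window starting at line 1, since Ruby's `drop(0)` simply returns the whole array.
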
To support a fuller range of Hugo's functionality, I also added an optional argument that
accepts Hugo's Chroma settings. This way, I can do things like highlight certain
lines in my code snippet, which is done as follows:

```
{{</* codelines "Idris" "typesafe-interpreter/TypesafeIntrV3.idr" 31 39 "hl_lines=7 8 9" */>}}
```

Note that the `hl_lines` field doesn't seem to work properly with `linenostart`, which means
that the highlighted lines are counted from 1 no matter what. This is why in the above snippet,
although I include lines 31 through 39, I feed lines 7, 8, and 9 to `hl_lines`
(that is, lines 37 through 39 of the file). It's unusual,
but hey, it works!

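To avoid doing that subtraction by hand, the offsets can be computed from the file's own line numbers. The following is a small Ruby sketch under the assumption that `hl_lines` always counts from 1 within the included window. `relative_hl_lines` is a made-up helper name, not something Hugo provides:

```Ruby
# Hypothetical helper: translate absolute file line numbers into the
# 1-based offsets that hl_lines expects within the included window.
def relative_hl_lines(first_line, absolute_lines)
  absolute_lines.map { |n| n - first_line + 1 }.join(" ")
end

# The window starts at line 31; highlight lines 37 through 39 of the file.
puts "hl_lines=#{relative_hl_lines(31, [37, 38, 39])}"
# prints "hl_lines=7 8 9"
```
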
### Linking to Referenced Code
Some time after implementing my initial system for including lines of code,
I got an email from a reader who pointed out that it was hard for them to find
the exact file I was referencing, and to view the surrounding context of the
presented lines. To address this, I decided that I'd include a link
to the file in question. After all, my website and all the associated
code are on a [Git server I host](https://dev.danilafe.com/Web-Projects/blog-static),
so any local file I'm referencing should -- assuming it was properly committed --
show up there, too. I hardcoded the URL of the `code` directory on the web interface,
and appended the relative path of each included file to it. The shortcode came out as follows:

```
{{ $s := (readFile (printf "code/%s" (.Get 1))) }}
{{ $t := split $s "\n" }}
{{ if not (eq (int (.Get 2)) 1) }}
{{ .Scratch.Set "u" (after (sub (int (.Get 2)) 1) $t) }}
{{ else }}
{{ .Scratch.Set "u" $t }}
{{ end }}
{{ $v := first (add (sub (int (.Get 3)) (int (.Get 2))) 1) (.Scratch.Get "u") }}
{{ if (.Get 4) }}
{{ .Scratch.Set "opts" (printf ",%s" (.Get 4)) }}
{{ else }}
{{ .Scratch.Set "opts" "" }}
{{ end }}
<div class="highlight-group">
<div class="highlight-label">From <a href="https://dev.danilafe.com/Web-Projects/blog-static/src/branch/master/code/{{ .Get 1 }}">{{ path.Base (.Get 1) }}</a>,
{{ if eq (.Get 2) (.Get 3) }}line {{ .Get 2 }}{{ else }} lines {{ .Get 2 }} through {{ .Get 3 }}{{ end }}</div>
{{ highlight (delimit $v "\n") (.Get 0) (printf "linenos=table,linenostart=%d%s" (.Get 2) (.Scratch.Get "opts")) }}
</div>
```

This results in code blocks like the one in the image below. The image
is the result of the `codelines` call for the Idris language, presented above.

{{< figure src="example.png" caption="An example of how the code looks." class="medium" >}}

I got a lot of mileage out of this setup . . . until I wanted to include code from _other_ Git repositories.
For instance, I wanted to talk about my [Advent of Code](https://adventofcode.com/) submissions,
without having to copy-paste the code into my blog repository!

### Code from Submodules
My first thought when including code from other repositories was to use submodules.
This has the added advantage of "pinning" the version of the code I'm talking about,
which means that even if I push significant changes to the other repository, the code
in my blog will remain the same. This, in turn, means that all of my `codelines`
shortcodes will work as intended.

The problem is, most Git web interfaces (my own included) don't display paths corresponding
to submodules. Thus, even if all my code is checked out and Hugo correctly
pulls the selected lines into its HTML output, the _links to the file_ remain
broken!

There's no easy way to address this, particularly because _different submodules
can be located on different hosts_! The Git URL used for a submodule is
not known to Hugo (since, to the best of my knowledge, it can't run
shell commands), and it could reside on `dev.danilafe.com`, or `github.com`,
or elsewhere. Fortunately, it's fairly easy to tell when a file is part
of a submodule, and which submodule that is. It's sufficient to find
the longest submodule path that matches the selected file. If no
submodule path matches, then the file is part of the blog repository,
and no special action is needed.

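To make the "longest matching path" idea concrete, here's a rough Ruby sketch. The submodule list, the `example.com` URLs, and the `find_submodule` helper are all invented for illustration. Like a plain string-prefix test, `start_with?` doesn't check directory boundaries, so treat this as a sketch rather than a robust path matcher:

```Ruby
# Hypothetical data: submodule paths relative to the code/ directory.
submodules = [
  { path: "aoc-2020", url: "https://example.com/aoc-2020" },
  { path: "aoc-2020/vendor/lib", url: "https://example.com/lib" },
]

# Return the submodule whose path is the longest prefix of file_path,
# or nil when the file belongs to the blog repository itself.
def find_submodule(file_path, submodules)
  submodules
    .select { |m| file_path.start_with?(m[:path]) }
    .max_by { |m| m[:path].length }
end

match = find_submodule("aoc-2020/vendor/lib/util.rb", submodules)
puts match ? match[:path] : "(blog repository)"
```

A file inside a nested submodule matches both entries, and the longer path wins. A file that matches nothing yields `nil`, which corresponds to the "no special action" case.
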
Of course, this means that Hugo needs to be made aware of the various
submodules in my repository. It also needs to be aware of the submodules
_inside_ those submodules, and so on: it needs to be recursive. Git
has a command to list all submodules recursively:

```Bash
git submodule status --recursive
```

However, this only prints the commit, the submodule path, and a `git describe`-style
description of that commit.
I don't think there's a way to list the remotes' URLs with this command; however,
we do _need_ the URLs, since that's how we create links to the Git web interfaces.

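For reference, here's roughly how one line of that output can be picked apart in Ruby. The sample line is an assumption: it reuses a commit hash that appears elsewhere in this post, but the path and the parenthesized description are made up to show the shape of the output:

```Ruby
# Sample line in the format printed by `git submodule status`
# (the path and the description in parentheses are illustrative).
line = " 7a8503c3fe1aa7e624e4d8672aa9b56d24b4ba82 code/aoc-2020 (heads/master)"

# The first character is a status flag (space, "-", "+", or "U"),
# so skip it before splitting on whitespace.
hash, path, description = line[1..].split(" ")

puts "commit:      #{hash}"
puts "path:        #{path}"
puts "description: #{description}"
```

Notice that nothing in the line identifies the remote, which is why the URL has to come from somewhere else.
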
There's another issue: how do we let Hugo know about the various submodules,
even if we can find them? Hugo can read files, but doing any serious
text processing in templates is downright impractical. And since Hugo
itself is not able to run commands, it needs to be able to read in
the output of another program that _can_ find submodules.

I settled on using Hugo's `params` configuration option. This
allows users to communicate arbitrary properties to Hugo themes
and templates. In my case, I want to communicate a collection
of submodules. I didn't know about TOML's inline tables, so
I decided to represent this collection as a map of (meaningless)
submodule names to tables:

```TOML
[params]
[params.submoduleLinks]
[params.submoduleLinks.aoc2020]
url = "https://dev.danilafe.com/Advent-of-Code/AdventOfCode-2020/src/commit/7a8503c3fe1aa7e624e4d8672aa9b56d24b4ba82"
path = "aoc-2020"
```

Since it was seemingly impossible to wrangle Git into outputting
all of this information using one command, I decided
to write a quick Ruby script to generate a list of submodules
as follows. I had to use `cd` in one of my calls to Git
because Git's `--git-dir` option doesn't seem to work
with submodules, treating them like a "bare" checkout.
I also chose to use an allowlist of remote URLs,
since the URL format for linking to files in a
particular repository differs from service to service.
For now, I only use my own Git server, so only `dev.danilafe.com`
is allowed; however, just by adding `elsif`s to my code,
I can add other services in the future.

```Ruby
puts "[params]"
puts " [params.submoduleLinks]"

# Recursively yield every submodule found under base_path.
def each_submodule(base_path)
  `cd #{base_path} && git submodule status`.each_line do |line|
    # Skip the one-character status flag at the start of each line.
    hash, path = line[1..].split " "
    full_path = "#{base_path}/#{path}"
    # Assumes each submodule's name in .gitmodules matches its path.
    url = `git config --file #{base_path}/.gitmodules --get 'submodule.#{path}.url'`.chomp.delete_suffix(".git")
    safe_name = full_path.gsub(/\/|-|_\./, "")

    # Allowlist of known hosts; add an elsif for each new service.
    if url =~ /dev\.danilafe\.com/
      file_url = "#{url}/src/commit/#{hash}"
    else
      raise "Submodule URL #{url.dump} not in a known format!"
    end

    yield({ :path => full_path, :url => file_url, :name => safe_name })
    each_submodule(full_path) { |m| yield m }
  end
end

each_submodule(".") do |m|
  # Only submodules inside the code/ directory can be included in posts.
  next unless m[:path].start_with? "./code/"
  puts " [params.submoduleLinks.#{m[:name].delete_prefix(".code")}]"
  puts " url = #{m[:url].dump}"
  puts " path = #{m[:path].delete_prefix("./code/").dump}"
end
```

I pipe the output of this script into a separate configuration file
called `config-gen.toml`, and then run Hugo as follows:

```
hugo --config config.toml,config-gen.toml
```

Finally, I had to modify my shortcode to find and handle the longest submodule prefix.
Here's the relevant portion, and you can
[view the entire file here](https://dev.danilafe.com/Web-Projects/blog-static/src/commit/bfeae89ab52d1696c4a56768b7f0c6682efaff82/themes/vanilla/layouts/shortcodes/codelines.html).

```
{{ .Scratch.Set "bestLength" -1 }}
{{ .Scratch.Set "bestUrl" (printf "https://dev.danilafe.com/Web-Projects/blog-static/src/branch/master/code/%s" (.Get 1)) }}
{{ $filePath := (.Get 1) }}
{{ $scratch := .Scratch }}
{{ range $module, $props := .Site.Params.submoduleLinks }}
{{ $path := index $props "path" }}
{{ $bestLength := $scratch.Get "bestLength" }}
{{ if and (le $bestLength (len $path)) (hasPrefix $filePath $path) }}
{{ $scratch.Set "bestLength" (len $path) }}
{{ $scratch.Set "bestUrl" (printf "%s%s" (index $props "url") (strings.TrimPrefix $path $filePath)) }}
{{ end }}
{{ end }}
```

And that's what I'm using at the time of writing!

### Conclusion
My current system for code includes allows me to do the following
things:

* Include entire files or sections of files into the page. This
saves me from having to copy and paste code manually, which
is error prone and can cause inconsistencies.
* Provide links to the files I reference on my Git interface.
This allows users to easily view the entire file that I'm talking about.
* Correctly link to files in repositories other than my blog
repository, when they are included using submodules. This means
I don't need to manually copy and update code from other projects.

I hope these shortcodes and the script come in handy for someone else.
Thank you for reading!