Add Hugo codelines post.
This commit is contained in:
parent
ebdb986e2a
commit
d1ea7b5364
BIN
content/blog/codelines/example.png
Normal file
BIN
content/blog/codelines/example.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 42 KiB |
268
content/blog/codelines/index.md
Normal file
268
content/blog/codelines/index.md
Normal file
@ -0,0 +1,268 @@
|
||||
---
|
||||
title: "Pleasant Code Includes with Hugo"
|
||||
date: 2021-01-13T21:31:29-08:00
|
||||
tags: ["Hugo"]
|
||||
---
|
||||
|
||||
Ever since I started [the compiler series]({{< relref "00_compiler_intro.md" >}}),
|
||||
I began to include more and more fragments of code into my blog.
|
||||
I didn't want to be copy-pasting my code between my project
|
||||
and my Markdown files, so I quickly wrote up a Hugo [shortcode](https://gohugo.io/content-management/shortcodes/)
|
||||
to pull in other files in the local directory. I've since improved on this
|
||||
some more, so I thought I'd share what I created with others.
|
||||
|
||||
### Including Entire Files and Lines
|
||||
My needs for snippets were modest at first. For the most part,
|
||||
I had a single code file that I wanted to present, so it was
|
||||
acceptable to plop it in the middle of my post in one piece.
|
||||
The shortcode for that was quite simple:
|
||||
|
||||
```
|
||||
{{ highlight (readFile (printf "code/%s" (.Get 1))) (.Get 0) "" }}
|
||||
```
|
||||
|
||||
This leverages Hugo's built-in [`highlight`](https://gohugo.io/functions/highlight/)
|
||||
function to provide syntax highlighting to the included snippet. Hugo
|
||||
doesn't guess at the language of the code, so you have to manually provide
|
||||
it. Calling this shortcode looks as follows:
|
||||
|
||||
```
|
||||
{{</* codeblock "C++" "compiler/03/type.hpp" */>}}
|
||||
```
|
||||
|
||||
Note that this implicitly adds the `code/` prefix to all
|
||||
the files I include. This is a personal convention: I want
|
||||
all my code to be inside a dedicated directory.
|
||||
|
||||
Of course, including entire files only takes you so far.
|
||||
What if you only need to discuss a small part of your code?
|
||||
Alternaitvely, what if you want to present code piece-by-piece,
|
||||
in the style of literate programming? I quickly ran into the
|
||||
need to do this, for which I wrote another shortcode:
|
||||
|
||||
```
|
||||
{{ $s := (readFile (printf "code/%s" (.Get 1))) }}
|
||||
{{ $t := split $s "\n" }}
|
||||
{{ if not (eq (int (.Get 2)) 1) }}
|
||||
{{ .Scratch.Set "u" (after (sub (int (.Get 2)) 1) $t) }}
|
||||
{{ else }}
|
||||
{{ .Scratch.Set "u" $t }}
|
||||
{{ end }}
|
||||
{{ $v := first (add (sub (int (.Get 3)) (int (.Get 2))) 1) (.Scratch.Get "u") }}
|
||||
{{ if (.Get 4) }}
|
||||
{{ .Scratch.Set "opts" (printf ",%s" (.Get 4)) }}
|
||||
{{ else }}
|
||||
{{ .Scratch.Set "opts" "" }}
|
||||
{{ end }}
|
||||
{{ highlight (delimit $v "\n") (.Get 0) (printf "linenos=table,linenostart=%d%s" (.Get 2) (.Scratch.Get "opts")) }}
|
||||
```
|
||||
|
||||
This shortcode takes a language and a filename as before, but it also takes
|
||||
the numbers of the first and last lines indicating the part of the code that should be included. After
|
||||
splitting the contents of the file into lines, it throws away all lines before and
|
||||
after the window of code that you want to include. It seems to me (from my commit history)
|
||||
that Hugo's [`after`](https://gohugo.io/functions/after/) function (which should behave
|
||||
similarly to Haskell's `drop`) doesn't like to be given an argument of `0`.
|
||||
I had to add a special case for when this would occur, where I simply do not invoke `after` at all.
|
||||
The shortcode can be used as follows:
|
||||
|
||||
```
|
||||
{{</* codelines "C++" "compiler/04/ast.cpp" 19 22 */>}}
|
||||
```
|
||||
|
||||
To support a fuller range of Hugo's functionality, I also added an optional argument that
|
||||
accepts Hugo's Chroma settings. This way, I can do things like highlight certain
|
||||
lines in my code snippet, which is done as follows:
|
||||
|
||||
```
|
||||
{{</* codelines "Idris" "typesafe-interpreter/TypesafeIntrV3.idr" 31 39 "hl_lines=7 8 9" */>}}
|
||||
```
|
||||
|
||||
Note that the `hl_lines` field doesn't seem to work properly with `linenostart`, which means
|
||||
that the highlighted lines are counted from 1 no matter what. This is why in the above snippet,
|
||||
although I include lines 31 through 39, I feed lines 7, 8, and 9 to `hl_lines`. It's unusual,
|
||||
but hey, it works!
|
||||
|
||||
### Linking to Referenced Code
|
||||
Some time after implementing my initial system for including lines of code,
|
||||
I got an email from a reader who pointed out that it was hard for them to find
|
||||
the exact file I was referencing, and to view the surrounding context of the
|
||||
presented lines. To address this, I decided that I'd include the link
|
||||
to the file in question. After all, my website and all the associated
|
||||
code is on a [Git server I host](https://dev.danilafe.com/Web-Projects/blog-static),
|
||||
so any local file I'm referencing should -- assuming it was properly committed --
|
||||
show up there, too. I hardcoded the URL of the `code` directory on the web interface,
|
||||
and appended the relative path of each included file to it. The shortcode came out as follows:
|
||||
|
||||
```
|
||||
{{ $s := (readFile (printf "code/%s" (.Get 1))) }}
|
||||
{{ $t := split $s "\n" }}
|
||||
{{ if not (eq (int (.Get 2)) 1) }}
|
||||
{{ .Scratch.Set "u" (after (sub (int (.Get 2)) 1) $t) }}
|
||||
{{ else }}
|
||||
{{ .Scratch.Set "u" $t }}
|
||||
{{ end }}
|
||||
{{ $v := first (add (sub (int (.Get 3)) (int (.Get 2))) 1) (.Scratch.Get "u") }}
|
||||
{{ if (.Get 4) }}
|
||||
{{ .Scratch.Set "opts" (printf ",%s" (.Get 4)) }}
|
||||
{{ else }}
|
||||
{{ .Scratch.Set "opts" "" }}
|
||||
{{ end }}
|
||||
<div class="highlight-group">
|
||||
<div class="highlight-label">From <a href="https://dev.danilafe.com/Web-Projects/blog-static/src/branch/master/code/{{ .Get 1 }}">{{ path.Base (.Get 1) }}</a>,
|
||||
{{ if eq (.Get 2) (.Get 3) }}line {{ .Get 2 }}{{ else }} lines {{ .Get 2 }} through {{ .Get 3 }}{{ end }}</div>
|
||||
{{ highlight (delimit $v "\n") (.Get 0) (printf "linenos=table,linenostart=%d%s" (.Get 2) (.Scratch.Get "opts")) }}
|
||||
</div>
|
||||
```
|
||||
|
||||
This results in code blocks like the one in the image below. The image
|
||||
is the result of the `codelines` call for the Idris language, presented above.
|
||||
|
||||
{{< figure src="example.png" caption="An example of how the code looks." class="medium" >}}
|
||||
|
||||
I got a lot of mileage out of this setup . . . until I wanted to include code from _other_ git repositories.
|
||||
For instance, I wanted to talk about my [Advent of Code](https://adventofcode.com/) submissions,
|
||||
without having to copy-paste the code into my blog repository!
|
||||
|
||||
### Code from Submodules
|
||||
My first thought when including code from other repositories was to use submodules.
|
||||
This has the added advantage of "pinning" the version of the code I'm talking about,
|
||||
which means that even if I push significant changes to the other repository, the code
|
||||
in my blog will remain the same. This, in turn, means that all of my `codelines`
|
||||
shortcodes will work as intended.
|
||||
|
||||
The problem is, most Git web interfaces (my own included) don't display paths corresponding
|
||||
to submodules. Thus, even if all my code is checked out and Hugo correctly
|
||||
pulls the selected lines into its HTML output, the _links to the file_ remain
|
||||
broken!
|
||||
|
||||
There's no easy way to address this, particularly because _different submodules
|
||||
can be located on different hosts_! The Git URL used for a submodule is
|
||||
not known to Hugo (since, to the best of my knowledge, it can't run
|
||||
shell commands), and it could reside on `dev.danilafe.com`, or `github.com`,
|
||||
or elsewhere. Fortunately, it's fairly easy to tell when a file is part
|
||||
of a submodule, and which submodule that is. It's sufficient to find
|
||||
the longest submodule path that matches the selected file. If no
|
||||
submodule path matches, then the file is part of the blog repository,
|
||||
and no special action is needed.
|
||||
|
||||
Of course, this means that Hugo needs to be made aware of the various
|
||||
submodules in my repository. It also needs to be aware of the submodules
|
||||
_inside_ those submodules, and so on: it needs to be recursive. Git
|
||||
has a command to list all submodules recursively:
|
||||
|
||||
```Bash
|
||||
git submodule status --recursive
|
||||
```
|
||||
|
||||
However, this only prints the commit, submodule path, and the upstream branch.
|
||||
I don't think there's a way to list the remotes' URLs with this command; however,
|
||||
we do _need_ the URLs, since that's how we create links to the Git web interfaces.
|
||||
|
||||
There's another issue: how do we let Hugo know about the various submodules,
|
||||
even if we can find them? Hugo can read files, but doing any serious
|
||||
text processing is downright impractical. However, Hugo
|
||||
itself is not able to run commands, so it needs to be able to read in
|
||||
the output of another command that _can_ find submodules.
|
||||
|
||||
I settled on using Hugo's `params` configuration option. This
|
||||
allows users to communicate arbitrary properties to Hugo themes
|
||||
and templates. In my case, I want to communicate a collection
|
||||
of submodules. I didn't know about TOML's inline tables, so
|
||||
I decided to represent this collection as a map of (meaningless)
|
||||
submodule names to tables:
|
||||
|
||||
```TOML
|
||||
[params]
|
||||
[params.submoduleLinks]
|
||||
[params.submoduleLinks.aoc2020]
|
||||
url = "https://dev.danilafe.com/Advent-of-Code/AdventOfCode-2020/src/commit/7a8503c3fe1aa7e624e4d8672aa9b56d24b4ba82"
|
||||
path = "aoc-2020"
|
||||
```
|
||||
|
||||
Since it was seemingly impossible to wrangle Git into outputting
|
||||
all of this information using one command, I decided
|
||||
to write a quick Ruby script to generate a list of submodules
|
||||
as follows. I had to use `cd` in one of my calls to Git
|
||||
because Git's `--git-dir` option doesn't seem to work
|
||||
with submodules, treating them like a "bare" checkout.
|
||||
I also chose to use an allowlist of remote URLs,
|
||||
since the URL format for linking to files in a
|
||||
particular repository differs from service to service.
|
||||
For now, I only use my own Git server, so only `dev.danilafe.com`
|
||||
is allowed; however, just by adding `elsif`s to my code,
|
||||
I can add other services in the future.
|
||||
|
||||
```Ruby
|
||||
puts "[params]"
|
||||
puts " [params.submoduleLinks]"
|
||||
|
||||
def each_submodule(base_path)
|
||||
`cd #{base_path} && git submodule status`.lines do |line|
|
||||
hash, path = line[1..].split " "
|
||||
full_path = "#{base_path}/#{path}"
|
||||
url = `git config --file #{base_path}/.gitmodules --get 'submodule.#{path}.url'`.chomp.delete_suffix(".git")
|
||||
safe_name = full_path.gsub(/\/|-|_\./, "")
|
||||
|
||||
if url =~ /dev.danilafe.com/
|
||||
file_url = "#{url}/src/commit/#{hash}"
|
||||
else
|
||||
raise "Submodule URL #{url.dump} not in a known format!"
|
||||
end
|
||||
|
||||
yield ({ :path => full_path, :url => file_url, :name => safe_name })
|
||||
each_submodule(full_path) { |m| yield m }
|
||||
end
|
||||
end
|
||||
|
||||
each_submodule(".") do |m|
|
||||
next unless m[:path].start_with? "./code/"
|
||||
puts " [params.submoduleLinks.#{m[:name].delete_prefix(".code")}]"
|
||||
puts " url = #{m[:url].dump}"
|
||||
puts " path = #{m[:path].delete_prefix("./code/").dump}"
|
||||
end
|
||||
```
|
||||
|
||||
I pipe the output of this script into a separate configuration file
|
||||
called `config-gen.toml`, and then run Hugo as follows:
|
||||
|
||||
```
|
||||
hugo --config config.toml,config-gen.toml
|
||||
```
|
||||
|
||||
Finally, I had to modify my shortcode to find and handle the longest submodule prefix.
|
||||
Here's the relevant portion, and you can
|
||||
[view the entire file here](https://dev.danilafe.com/Web-Projects/blog-static/src/commit/bfeae89ab52d1696c4a56768b7f0c6682efaff82/themes/vanilla/layouts/shortcodes/codelines.html).
|
||||
|
||||
```
|
||||
{{ .Scratch.Set "bestLength" -1 }}
|
||||
{{ .Scratch.Set "bestUrl" (printf "https://dev.danilafe.com/Web-Projects/blog-static/src/branch/master/code/%s" (.Get 1)) }}
|
||||
{{ $filePath := (.Get 1) }}
|
||||
{{ $scratch := .Scratch }}
|
||||
{{ range $module, $props := .Site.Params.submoduleLinks }}
|
||||
{{ $path := index $props "path" }}
|
||||
{{ $bestLength := $scratch.Get "bestLength" }}
|
||||
{{ if and (le $bestLength (len $path)) (hasPrefix $filePath $path) }}
|
||||
{{ $scratch.Set "bestLength" (len $path) }}
|
||||
{{ $scratch.Set "bestUrl" (printf "%s%s" (index $props "url") (strings.TrimPrefix $path $filePath)) }}
|
||||
{{ end }}
|
||||
{{ end }}
|
||||
```
|
||||
|
||||
And that's what I'm using at the time of writing!
|
||||
|
||||
### Conclusion
|
||||
My current system for code includes allows me to do the following
|
||||
things:
|
||||
|
||||
* Include entire files or sections of files into the page. This
|
||||
saves me from having to copy and paste code manually, which
|
||||
is error prone and can cause inconsistencies.
|
||||
* Provide links to the files I reference on my Git interface.
|
||||
This allows users to easily view the entire file that I'm talking about.
|
||||
* Correctly link to files in repositories other than my blog
|
||||
repository, when they are included using submodules. This means
|
||||
I don't need to manually copy and update code from other projects.
|
||||
|
||||
I hope some of these shortcodes and script come in handy for someone else.
|
||||
Thank you for reading!
|
Loading…
Reference in New Issue
Block a user