---
title: "Pleasant Code Includes with Hugo"
date: 2021-01-13T21:31:29-08:00
tags: ["Hugo"]
---

Ever since I started [the compiler series]({{< relref "00_compiler_intro.md" >}}), I've been including more and more fragments of code in my blog. I didn't want to be copy-pasting my code between my project and my Markdown files, so I quickly wrote up a Hugo [shortcode](https://gohugo.io/content-management/shortcodes/) to pull in other files in the local directory. I've since improved on this some more, so I thought I'd share what I created with others.

### Including Entire Files and Lines

My needs for snippets were modest at first. For the most part, I had a single code file that I wanted to present, so it was acceptable to plop it in the middle of my post in one piece. The shortcode for that was quite simple:

```
{{ highlight (readFile (printf "code/%s" (.Get 1))) (.Get 0) "" }}
```

This leverages Hugo's built-in [`highlight`](https://gohugo.io/functions/highlight/) function to provide syntax highlighting to the included snippet. Hugo doesn't guess at the language of the code, so you have to manually provide it. Calling this shortcode looks as follows:

```
{{}}
```

Note that this implicitly adds the `code/` prefix to all the files I include. This is a personal convention: I want all my code to be inside a dedicated directory.

Of course, including entire files only takes you so far. What if you only need to discuss a small part of your code? Alternatively, what if you want to present code piece-by-piece, in the style of literate programming?
I quickly ran into the need to do this, for which I wrote another shortcode:

```
{{ $s := (readFile (printf "code/%s" (.Get 1))) }}
{{ $t := split $s "\n" }}
{{ if not (eq (int (.Get 2)) 1) }}
  {{ .Scratch.Set "u" (after (sub (int (.Get 2)) 1) $t) }}
{{ else }}
  {{ .Scratch.Set "u" $t }}
{{ end }}
{{ $v := first (add (sub (int (.Get 3)) (int (.Get 2))) 1) (.Scratch.Get "u") }}
{{ if (.Get 4) }}
  {{ .Scratch.Set "opts" (printf ",%s" (.Get 4)) }}
{{ else }}
  {{ .Scratch.Set "opts" "" }}
{{ end }}
{{ highlight (delimit $v "\n") (.Get 0) (printf "linenos=table,linenostart=%d%s" (.Get 2) (.Scratch.Get "opts")) }}
```

This shortcode takes a language and a filename as before, but it also takes the numbers of the first and last lines indicating the part of the code that should be included. After splitting the contents of the file into lines, it throws away all lines before and after the window of code that you want to include. It seems to me (from my commit history) that Hugo's [`after`](https://gohugo.io/functions/after/) function (which should behave similarly to Haskell's `drop`) doesn't like to be given an argument of `0`. I had to add a special case for when this would occur, where I simply do not invoke `after` at all. The shortcode can be used as follows:

```
{{}}
```

To support a fuller range of Hugo's functionality, I also added an optional argument that accepts Hugo's Chroma settings. This way, I can do things like highlight certain lines in my code snippet, which is done as follows:

```
{{}}
```

Note that the `hl_lines` field doesn't seem to work properly with `linenostart`, which means that the highlighted lines are counted from 1 no matter what. This is why in the above snippet, although I include lines 31 through 39, I feed lines 7, 8, and 9 to `hl_lines`. It's unusual, but hey, it works!
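To make the windowing and the `hl_lines` arithmetic concrete, here is a minimal Ruby sketch of the same logic. It is not part of the shortcode: the helper names `window` and `relative_hl_lines` are hypothetical, and the sketch assumes the highlighted lines in the example above are lines 37 through 39 of the original file.

```ruby
# Hypothetical helper mirroring the shortcode's `after` / `first` steps.
# first_line and last_line are 1-based and inclusive.
def window(lines, first_line, last_line)
  lines.drop(first_line - 1).take(last_line - first_line + 1)
end

# Hypothetical helper: convert absolute line numbers in the file into the
# 1-based positions inside the snippet, which is what `hl_lines` counts.
def relative_hl_lines(first_line, absolute_lines)
  absolute_lines.map { |n| n - first_line + 1 }
end

# Stand-in for the contents of a source file, split into lines.
source = (1..50).map { |n| "line #{n}" }

snippet = window(source, 31, 39)
puts snippet.first                                   # first line kept
puts snippet.length                                  # size of the window
puts relative_hl_lines(31, [37, 38, 39]).join(" ")   # values for hl_lines
```

Unlike the `after` behavior described above, Ruby's `drop` accepts `0`, so the sketch needs no special case for a window that starts on the first line.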
### Linking to Referenced Code

Some time after implementing my initial system for including lines of code, I got an email from a reader who pointed out that it was hard for them to find the exact file I was referencing, and to view the surrounding context of the presented lines. To address this, I decided that I'd include the link to the file in question. After all, my website and all the associated code is on a [Git server I host](https://dev.danilafe.com/Web-Projects/blog-static), so any local file I'm referencing should -- assuming it was properly committed -- show up there, too. I hardcoded the URL of the `code` directory on the web interface, and appended the relative path of each included file to it. The shortcode came out as follows:

```
{{ $s := (readFile (printf "code/%s" (.Get 1))) }}
{{ $t := split $s "\n" }}
{{ if not (eq (int (.Get 2)) 1) }}
  {{ .Scratch.Set "u" (after (sub (int (.Get 2)) 1) $t) }}
{{ else }}
  {{ .Scratch.Set "u" $t }}
{{ end }}
{{ $v := first (add (sub (int (.Get 3)) (int (.Get 2))) 1) (.Scratch.Get "u") }}
{{ if (.Get 4) }}
  {{ .Scratch.Set "opts" (printf ",%s" (.Get 4)) }}
{{ else }}
  {{ .Scratch.Set "opts" "" }}
{{ end }}
From {{ path.Base (.Get 1) }}, {{ if eq (.Get 2) (.Get 3) }}line {{ .Get 2 }}{{ else }} lines {{ .Get 2 }} through {{ .Get 3 }}{{ end }}
{{ highlight (delimit $v "\n") (.Get 0) (printf "linenos=table,linenostart=%d%s" (.Get 2) (.Scratch.Get "opts")) }}
```

This results in code blocks like the one in the image below. The image is the result of the `codelines` call for the Idris language, presented above.

{{< figure src="example.png" caption="An example of how the code looks." class="medium" >}}

I got a lot of mileage out of this setup . . . until I wanted to include code from _other_ Git repositories. For instance, I wanted to talk about my [Advent of Code](https://adventofcode.com/) submissions, without having to copy-paste the code into my blog repository!

### Code from Submodules

My first thought when including code from other repositories was to use submodules. This has the added advantage of "pinning" the version of the code I'm talking about, which means that even if I push significant changes to the other repository, the code in my blog will remain the same. This, in turn, means that all of my `codelines` shortcodes will work as intended.

The problem is, most Git web interfaces (my own included) don't display paths corresponding to submodules. Thus, even if all my code is checked out and Hugo correctly pulls the selected lines into its HTML output, the _links to the file_ remain broken!

There's no easy way to address this, particularly because _different submodules can be located on different hosts_! The Git URL used for a submodule is not known to Hugo (since, to the best of my knowledge, it can't run shell commands), and it could reside on `dev.danilafe.com`, or `github.com`, or elsewhere. Fortunately, it's fairly easy to tell when a file is part of a submodule, and which submodule that is. It's sufficient to find the longest submodule path that matches the selected file. If no submodule path matches, then the file is part of the blog repository, and no special action is needed.

Of course, this means that Hugo needs to be made aware of the various submodules in my repository. It also needs to be aware of the submodules _inside_ those submodules, and so on: it needs to be recursive.
Git has a command to list all submodules recursively:

```Bash
git submodule status --recursive
```

However, this only prints the commit, the submodule path, and a `git describe`-style name for that commit (which often looks like a branch name). I don't think there's a way to list the remotes' URLs with this command, yet we do _need_ the URLs, since that's how we create links to the Git web interfaces.

There's another issue: how do we let Hugo know about the various submodules, even if we can find them? Hugo can read files, but doing any serious text processing in its templates is downright impractical. And since Hugo itself is not able to run commands, it needs to read in the output of another program that _can_ find submodules.

I settled on using Hugo's `params` configuration option. This allows users to communicate arbitrary properties to Hugo themes and templates. In my case, I want to communicate a collection of submodules. I didn't know about TOML's inline tables, so I decided to represent this collection as a map of (meaningless) submodule names to tables:

```TOML
[params]
  [params.submoduleLinks]
    [params.submoduleLinks.aoc2020]
      url = "https://dev.danilafe.com/Advent-of-Code/AdventOfCode-2020/src/commit/7a8503c3fe1aa7e624e4d8672aa9b56d24b4ba82"
      path = "aoc-2020"
```

Since it was seemingly impossible to wrangle Git into outputting all of this information using one command, I decided to write a quick Ruby script to generate a list of submodules as follows. I had to use `cd` in one of my calls to Git because Git's `--git-dir` option doesn't seem to work with submodules, treating them like a "bare" checkout. I also chose to use an allowlist of remote URLs, since the URL format for linking to files in a particular repository differs from service to service. For now, I only use my own Git server, so only `dev.danilafe.com` is allowed; however, just by adding `elsif`s to my code, I can add other services in the future.
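As an illustration of how such an allowlist might grow, here is a small standalone sketch. It is not the script from this post: the helper name `file_url_base` is hypothetical, and the `github.com` branch is an assumption based on GitHub's `/blob/<commit>/` URL layout. It also assumes HTTPS remotes; an SSH-style remote such as `git@github.com:owner/repo.git` would need extra handling.

```ruby
# Hypothetical helper: map a submodule's remote URL and pinned commit to the
# base URL under which that host's web interface serves files.
def file_url_base(remote_url, commit)
  url = remote_url.delete_suffix(".git")
  if url.include?("dev.danilafe.com")
    # Same layout as the `url` value in the TOML example above.
    "#{url}/src/commit/#{commit}"
  elsif url.include?("github.com")
    # Assumption: GitHub serves files at /blob/<commit>/<path>.
    "#{url}/blob/#{commit}"
  else
    raise ArgumentError, "Submodule URL #{url.dump} not in a known format!"
  end
end

puts file_url_base("https://github.com/example/repo.git", "abc123")
```

Raising on unknown hosts keeps the allowlist strict: an unrecognized remote fails loudly at generation time instead of producing broken links.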
```Ruby
puts "[params]"
puts " [params.submoduleLinks]"

def each_submodule(base_path)
  `cd #{base_path} && git submodule status`.lines do |line|
    hash, path = line[1..].split " "
    full_path = "#{base_path}/#{path}"
    url = `git config --file #{base_path}/.gitmodules --get 'submodule.#{path}.url'`.chomp.delete_suffix(".git")
    safe_name = full_path.gsub(/\/|-|_\./, "")

    if url =~ /dev.danilafe.com/
      file_url = "#{url}/src/commit/#{hash}"
    else
      raise "Submodule URL #{url.dump} not in a known format!"
    end

    yield ({ :path => full_path, :url => file_url, :name => safe_name })
    each_submodule(full_path) { |m| yield m }
  end
end

each_submodule(".") do |m|
  next unless m[:path].start_with? "./code/"
  puts " [params.submoduleLinks.#{m[:name].delete_prefix(".code")}]"
  puts " url = #{m[:url].dump}"
  puts " path = #{m[:path].delete_prefix("./code/").dump}"
end
```

I pipe the output of this script into a separate configuration file called `config-gen.toml`, and then run Hugo as follows:

```
hugo --config config.toml,config-gen.toml
```

Finally, I had to modify my shortcode to find and handle the longest submodule prefix. Here's the relevant portion, and you can [view the entire file here](https://dev.danilafe.com/Web-Projects/blog-static/src/commit/bfeae89ab52d1696c4a56768b7f0c6682efaff82/themes/vanilla/layouts/shortcodes/codelines.html).

```
{{ .Scratch.Set "bestLength" -1 }}
{{ .Scratch.Set "bestUrl" (printf "https://dev.danilafe.com/Web-Projects/blog-static/src/branch/master/code/%s" (.Get 1)) }}
{{ $filePath := (.Get 1) }}
{{ $scratch := .Scratch }}
{{ range $module, $props := .Site.Params.submoduleLinks }}
  {{ $path := index $props "path" }}
  {{ $bestLength := $scratch.Get "bestLength" }}
  {{ if and (le $bestLength (len $path)) (hasPrefix $filePath $path) }}
    {{ $scratch.Set "bestLength" (len $path) }}
    {{ $scratch.Set "bestUrl" (printf "%s%s" (index $props "url") (strings.TrimPrefix $path $filePath)) }}
  {{ end }}
{{ end }}
```

And that's what I'm using at the time of writing!
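For readers who find the template hard to follow, the longest-prefix lookup can be sketched in plain Ruby. This is only an illustration: the helper name `link_for`, the `example.com` URLs, and the nested `aoc-2020/vendor/lib` submodule are all hypothetical, chosen to show a submodule inside a submodule.

```ruby
# Hypothetical data with the same shape as `params.submoduleLinks`.
SUBMODULES = [
  { path: "aoc-2020", url: "https://example.com/aoc/src/commit/abc" },
  { path: "aoc-2020/vendor/lib", url: "https://example.com/lib/src/commit/def" }
].freeze

# Hypothetical stand-in for the hardcoded blog repository URL.
DEFAULT_BASE = "https://example.com/blog/src/branch/master/code/"

# Pick the submodule whose path is the longest prefix of file_path and build
# the link from its URL; fall back to the blog repository when none matches.
def link_for(file_path, submodules, default_base)
  best = submodules
         .select { |m| file_path.start_with?(m[:path]) }
         .max_by { |m| m[:path].length }
  return default_base + file_path if best.nil?

  best[:url] + file_path.delete_prefix(best[:path])
end

puts link_for("aoc-2020/day1.rb", SUBMODULES, DEFAULT_BASE)
puts link_for("aoc-2020/vendor/lib/util.rb", SUBMODULES, DEFAULT_BASE)
puts link_for("compiler/main.cpp", SUBMODULES, DEFAULT_BASE)
```

Like the `hasPrefix` check in the template, this compares plain string prefixes, so a directory named `aoc-2020-extra` would also match the `aoc-2020` submodule. Comparing against the path plus a trailing `/` would be one way to tighten that.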
### Conclusion

My current system for code includes allows me to do the following things:

* Include entire files or sections of files into the page. This saves me from having to copy and paste code manually, which is error prone and can cause inconsistencies.
* Provide links to the files I reference on my Git interface. This allows users to easily view the entire file that I'm talking about.
* Correctly link to files in repositories other than my blog repository, when they are included using submodules. This means I don't need to manually copy and update code from other projects.

I hope some of these shortcodes and scripts come in handy for someone else. Thank you for reading!