Compare commits

No commits in common. "fdaec6d5a9c4b5f206a6110855c7b634de4f480d" and "e9f2378b47542c0a8c12f6a365720d96f986630c" have entirely different histories.

@@ -1,7 +1,8 @@
--- ---
title: Rendering Mathematics On The Back End title: Rendering Mathematics On The Back End
date: 2020-07-21T14:54:26-07:00 date: 2020-07-15T15:27:19-07:00
tags: ["Website", "Nix", "Ruby", "KaTeX"] draft: true
tags: ["Website", "Nix", "Ruby", "KaTeX", "Hugo"]
--- ---
Due to something of a streak of bad luck when it came to computers, I spent a Due to something of a streak of bad luck when it came to computers, I spent a
@@ -25,7 +26,7 @@ for displaying things like inference rules, didn't work without
JavaScript. I was left with two options: JavaScript. I was left with two options:
* Allow JavaScript, and continue using MathJax to render my math. * Allow JavaScript, and continue using MathJax to render my math.
* Make it so that the mathematics are rendered on the back end. * Make it so that the mathematics is rendered on the back end.
I've [previously written about math rendering]({{< relref "math_rendering_is_wrong.md" >}}), I've [previously written about math rendering]({{< relref "math_rendering_is_wrong.md" >}}),
and made the observation that MathJax's output for LaTeX is __identical__ and made the observation that MathJax's output for LaTeX is __identical__
@@ -89,7 +90,7 @@ to `node2nix`:
] ]
``` ```
The Ruby script I wrote for this (more on that soon) required the `nokogiri` gem, which The Ruby script I wrote for this (more on that soon) required the `nokigiri` gem, which
I used for traversing the HTML generated for my site. Hugo was obviously required to I used for traversing the HTML generated for my site. Hugo was obviously required to
generate the HTML. generate the HTML.
@@ -104,13 +105,10 @@ page advertises server-side rendering. Their documentation [(link)](https://kate
even shows (at least as of the time this email was sent) that it renders both HTML even shows (at least as of the time this email was sent) that it renders both HTML
(to be arranged nicely with their CSS) for visuals and MathML for accessibility. (to be arranged nicely with their CSS) for visuals and MathML for accessibility.
The author of the email then kindly provided a link to a page they generated using KaTeX and
some Bash scripts. The math on this page was rendered at the time it was generated.
This is a great point, and KaTeX is indeed usable for server-side rendering. But I've This is a great point, and KaTeX is indeed usable for server-side rendering. But I've
seen few people who do actually use it. Unfortunately, as I pointed out in my previous post on the subject, seen few people who do actually use it. Unfortunately, as I pointed out in my previous post on the subject,
few tools actually take your HTML page and replace LaTeX with rendered math. few tools remain that provide the software that actually takes your HTML page and substitutes
Here's what I wrote about this last time: LaTeX for math.
> [In MathJax,] The bigger issue, though, was that the `page2html` > [In MathJax,] The bigger issue, though, was that the `page2html`
program, which rendered all the mathematics in a single HTML page, program, which rendered all the mathematics in a single HTML page,
@@ -121,14 +119,8 @@ which replaced mathematical expressions in a page with their SVG forms.
This is still the case, in both MathJax and KaTeX. The ability This is still the case, in both MathJax and KaTeX. The ability
to render math in one step is the main selling point of front-end LaTeX renderers: to render math in one step is the main selling point of front-end LaTeX renderers:
all you have to do is drop in a file from a CDN, and voila, you have your all you have to do is drop in a file from a CDN, and voila, you have your
math. There are no such easy answers for back-end rendering. In fact, math. There are no such easy answers for back-end rendering. I decided
as we will soon see, it's not possible to just search-and-replace occurences to write my own Ruby script to get the job done. From this script, I
of mathematics on your page, either. To actually get KaTeX working
on the backend, you need access to tools that handle the potential variety
of edge cases associated with HTML. Such tools, to my knowledge, do not
currently exist.
I decided to write my own Ruby script to get the job done. From this script, I
would call the `katex` command-line program, which would perform would call the `katex` command-line program, which would perform
the heavy lifting of rendering the mathematics. the heavy lifting of rendering the mathematics.
@@ -178,16 +170,13 @@ end
There's a bit of a trick to the final layer of this script. We want to be There's a bit of a trick to the final layer of this script. We want to be
really careful about where we replace LaTeX, and where we don't. In really careful about where we replace LaTeX, and where we don't. In
particular, we _don't_ want to go into the `code` tags. Otherwise, particular, we _don't_ want to go into the `code` tags. Otherwise,
it wouldn't be possible to talk about LaTeX code! I also suspect that it wouldn't be possible to talk about LaTeX code! Thus, we can't just
some captions, alt texts, and similar elements should also be left alone. search-and-replace over the entire HTML document; we need to be context
However, I don't have those on my website (yet), and I won't worry about aware. This is where `nokigiri` comes in. We parse the HTML, and iterate
them now. Either way, because of the code tags,
we can't just search-and-replace over the entire page; we need to be context
aware. This is where `nokogiri` comes in. We parse the HTML, and iterate
over all of the 'text' nodes, calling `perform_katex_sub` on all over all of the 'text' nodes, calling `perform_katex_sub` on all
of those that _aren't_ inside code tags. of those that _aren't_ inside code tags.
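To make the pitfall concrete, here is a minimal sketch of the naive approach; it is not the script from this post. It assumes inline math is delimited by `\(` and `\)` (the delimiters are an assumption, since they are not shown in this excerpt), and `fake_render` is a hypothetical stand-in for the call to the `katex` command-line program.

```Ruby
# Hypothetical stand-in for invoking the `katex` CLI; it only wraps the input.
def fake_render(latex)
  "<span class=\"katex\">#{latex}</span>"
end

# Assumed delimiters: \( ... \) for inline math.
INLINE_MATH = /\\\((.+?)\\\)/

# Naive approach: substitute over the raw HTML string, ignoring context.
def naive_sub(html)
  html.gsub(INLINE_MATH) { fake_render(Regexp.last_match(1)) }
end

html = '<p>Euler: \(e^{i\pi} = -1\)</p><code>\(x\)</code>'
puts naive_sub(html)
```

Both occurrences are replaced, including the one inside `<code>` that was meant to be shown verbatim. A plain string substitution cannot tell the two apart, which is why the replacement has to run per text node on a parsed document.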
Fortunately, this kind of iteration is pretty easy to specify thanks to something called XPath. Fortunately, this is pretty easy to specify thanks to something called XPath.
This was my first time encountering it, but it seems extremely useful: it's This was my first time encountering it, but it seems extremely useful: it's
a sort of language for selecting XML nodes. First, you provide an 'axis', a sort of language for selecting XML nodes. First, you provide an 'axis',
which is used to specify the positions of the nodes you want to look at which is used to specify the positions of the nodes you want to look at
@@ -222,7 +211,7 @@ All in all:
//*[not(self::code)]/text() //*[not(self::code)]/text()
``` ```
Finally, we use this XPath from `nokogiri`: Finally, we use this XPath from `nokigiri`:
```Ruby {linenos=table} ```Ruby {linenos=table}
files = ARGV[0..-1] files = ARGV[0..-1]
@@ -247,8 +236,7 @@ I used Nix for this, but the below script will largely be compatible with a non-
I came up with the following, commenting on Nix-specific commands: I came up with the following, commenting on Nix-specific commands:
```Bash {linenos=table} ```Bash {linenos=table}
# Nix-specific; set up paths. source $stdenv/setup # Nix-specific; set up paths.
source $stdenv/setup
# Build site with Hugo # Build site with Hugo
# The cp is Nix-specific; it copies the blog source into the current directory. # The cp is Nix-specific; it copies the blog source into the current directory.
@@ -278,7 +266,7 @@ take a few dozen seconds to run on my relatively small site. The
better approach would be to use a NodeJS script, rather than a Ruby one, better approach would be to use a NodeJS script, rather than a Ruby one,
to perform the conversion. KaTeX also provides an API, so such a NodeJS to perform the conversion. KaTeX also provides an API, so such a NodeJS
script can find the files, parse the HTML, and perform the substitutions. script can find the files, parse the HTML, and perform the substitutions.
I did quite like using `nokogiri` here, though, and I hope that an equivalently I did quite like using `nokigiri` here, though, and I hope that an equivalently
pleasant solution exists in JavaScript. pleasant solution exists in JavaScript.
Re-rendering the whole website is also pretty wasteful. I rarely change the Re-rendering the whole website is also pretty wasteful. I rarely change the
@@ -287,15 +275,6 @@ to re-run the script, and therefore re-render every page. This makes sense
for me, since I use Nix, and my builds are pretty much always performed for me, since I use Nix, and my builds are pretty much always performed
from scratch. On the other hand, for others, this may not be the best solution. from scratch. On the other hand, for others, this may not be the best solution.
### Alternatives
The same person who sent me the original email above also pointed out
[this `pandoc` filter for KaTeX](https://github.com/Zaharid/pandoc_static_katex).
I do not use Pandoc, but from what I can see, this fitler relies on
Pandoc's `Math` AST nodes, and applies KaTeX to each of those. This
should work, but wasn't applicable in my case, since Hugo's shrotcodes
don't mix well with Pandoc. However, it certainly seems like a workable
solution.
### Conclusion ### Conclusion
With the removal of MathJax from my site, it is now completely JavaScript free, With the removal of MathJax from my site, it is now completely JavaScript free,
and contains virtually the same HTML that it did beforehand. This, I hope, and contains virtually the same HTML that it did beforehand. This, I hope,