From b6313463791e4abe1663c253b771d9725d63c7c5 Mon Sep 17 00:00:00 2001 From: Danila Fedorin Date: Tue, 21 Jul 2020 14:55:52 -0700 Subject: [PATCH] Publish the mathematics post. --- content/blog/backend_math_rendering.md | 36 +++++++++++++++++--------- 1 file changed, 24 insertions(+), 12 deletions(-) diff --git a/content/blog/backend_math_rendering.md b/content/blog/backend_math_rendering.md index 6095dc5..5185e5e 100644 --- a/content/blog/backend_math_rendering.md +++ b/content/blog/backend_math_rendering.md @@ -1,8 +1,7 @@ --- title: Rendering Mathematics On The Back End -date: 2020-07-15T15:27:19-07:00 -draft: true -tags: ["Website", "Nix", "Ruby", "KaTeX", "Hugo"] +date: 2020-07-21T14:54:26-07:00 +tags: ["Website", "Nix", "Ruby", "KaTeX"] --- Due to something of a streak of bad luck when it came to computers, I spent a @@ -26,7 +25,7 @@ for displaying things like inference rules, didn't work without JavaScript. I was left with two options: * Allow JavaScript, and continue using MathJax to render my math. -* Make it so that the mathematics is rendered on the back end. +* Make it so that the mathematics are rendered on the back end. I've [previously written about math rendering]({{< relref "math_rendering_is_wrong.md" >}}), and made the observation that MathJax's output for LaTeX is __identical__ @@ -105,10 +104,13 @@ page advertises server-side rendering. Their documentation [(link)](https://kate even shows (at least as of the time this email was sent) that it renders both HTML (to be arranged nicely with their CSS) for visuals and MathML for accessibility. +The author of the email then kindly provided a link to a page they generated using KaTeX and +some Bash scripts. The math on this page was rendered at the time it was generated. + This is a great point, and KaTeX is indeed usable for server-side rendering. But I've seen few people who do actually use it. Unfortunately, as I pointed out in my previous post on the subject, -few tools remain that provide the software that actually takes your HTML page and substitutes -LaTeX for math. +few tools that actually take your HTML page and replace LaTeX with rendered math. +Here's what I wrote about this last time: > [In MathJax,] The bigger issue, though, was that the `page2html` program, which rendered all the mathematics in a single HTML page, @@ -119,8 +121,14 @@ which replaced mathematical expressions in a page with their SVG forms. This is still the case, in both MathJax and KaTeX. The ability to render math in one step is the main selling point of front-end LaTeX renderers: all you have to do is drop in a file from a CDN, and voila, you have your -math. There are no such easy answers for back-end rendering. I decided -to write my own Ruby script to get the job done. From this script, I +math. There are no such easy answers for back-end rendering. In fact, +as we will soon see, it's not possible to just search-and-replace occurences +of mathematics on your page, either. To actually get KaTeX working +on the backend, you need access to tools that handle the potential variety +of edge cases associated with HTML. Such tools, to my knowledge, do not +currently exist. + +I decided to write my own Ruby script to get the job done. From this script, I would call the `katex` command-line program, which would perform the heavy lifting of rendering the mathematics. @@ -170,13 +178,16 @@ end There's a bit of a trick to the final layer of this script. We want to be really careful about where we replace LaTeX, and where we don't. In particular, we _don't_ want to go into the `code` tags. Otherwise, -it wouldn't be possible to talk about LaTeX code! Thus, we can't just -search-and-replace over the entire HTML document; we need to be context +it wouldn't be possible to talk about LaTeX code! I also suspect that +some captions, alt texts, and similar elements should also be left alone. +However, I don't have those on my website (yet), and I won't worry about +them now. Either way, because of the code tags, +we can't just search-and-replace over the entire page; we need to be context aware. This is where `nokigiri` comes in. We parse the HTML, and iterate over all of the 'text' nodes, calling `perform_katex_sub` on all of those that _aren't_ inside code tags. -Fortunately, this is pretty easy to specify thanks to something called XPath. +Fortunately, this kind of iteration is pretty easy to specify thanks to something called XPath. This was my first time encountering it, but it seems extremely useful: it's a sort of language for selecting XML nodes. First, you provide an 'axis', which is used to specify the positions of the nodes you want to look at @@ -236,7 +247,8 @@ I used Nix for this, but the below script will largely be compatible with a non- I came up with the following, commenting on Nix-specific commands: ```Bash {linenos=table} -source $stdenv/setup # Nix-specific; set up paths. +# Nix-specific; set up paths. +source $stdenv/setup # Build site with Hugo # The cp is Nix-specific; it copies the blog source into the current directory.