diff --git a/content/blog/math_rendering_is_wrong.md b/content/blog/math_rendering_is_wrong.md
new file mode 100644
index 0000000..f8a4cf5
--- /dev/null
+++ b/content/blog/math_rendering_is_wrong.md
@@ -0,0 +1,172 @@
+---
+title: Math Rendering is Wrong
+date: 2020-03-15T15:44:06-07:00
+tags: ["Website"]
+draft: true
+---
+
+Since I first started working on my website at age fourteen, the site
+has gone through many revisions, and hopefully changed for the better.
+This blog was originally dynamically served using a Python/Flask backend,
+with a custom login system and post "editor" (just an input box).
+One of the stranger things about my website, though, was how I displayed
+content.
+
+It was clear to me, even at my young age, that writing raw HTML was
+suboptimal. Somehow (perhaps through GitHub) I heard about Markdown,
+and realized that a human-readable markup language was probably a much
+better way to go. What remained was, of course, rendering the content.
+The easiest way I found was to just stick a JavaScript script, calling
+out to [marked](https://github.com/markedjs/marked), to run on page
+load and convert all the markup into pretty HTML.
+
+This rendering would happen on every page load. Every time I navigated
+between pages on my site, for a second or two, I'd see the raw, unrendered
+Markdown that I had written, which would then disappear and be replaced
+with a proper view of the page's content. The rendering wasn't error-proof,
+either. If my connection was particularly slow (which it was, thanks, Comcast),
+or if I forgot to disable uMatrix, I would be left having to sift through
+the frequent occurrences of `#`, `_`, `*`... Eventually I realized my mistake,
+and switched to rendering Markdown on the backend. Now, my content
+would appear to the user already formatted, and they wouldn't have to
+wait for a JavaScript program to finish to read what I had written.
+All was well.
+
+Sometimes, I look back on the early iterations of my site, and smile
+at the silly mistakes I'd made. Yet I can't innocently make fun of
+my original Markdown rendering solution, because in a way, it lives on.
+
+### The State of Mathematics on the Web
+
+When I search for "render math on website" on Google,
+the following two links are at the top of my results:
+
+* [MathJax | Beautiful math in all browsers](https://www.mathjax.org)
+* [KaTeX - the fastest math typesetting library for the web](https://katex.org)
+
+Indeed, these are the two most popular math rendering solutions
+(in my experience). Yet both of these solutions share something
+in common with each other and with the early iterations of my website -
+they use client-side rendering for static content. In my opinion, this is absurd.
+Every time that you visit a website that uses MathJax or KaTeX, any mathematical
+notation arrives at your machine in the form of LaTeX markup. For instance,
+\\(e^{-\\frac{x}{2}}\\) looks like `e^{-\frac{x}{2}}`. The rendering
+software (MathJax or KaTeX) then takes this markup and converts it into
+HTML and CSS that your browser can display. Just like my old website,
+all of this happens __every time the page is loaded__. This isn't an uncommon
+thing, either: websites like
+[Mathematics StackExchange](https://math.stackexchange.com) and
+[Chegg](https://www.chegg.com) use MathJax, and many more can be found
+on [this list](https://docs.mathjax.org/en/v2.7-latest/misc/mathjax-in-use.html).
+According to [this page](https://katex.org/users.html), Facebook Messenger,
+[Khan Academy](https://www.khanacademy.org/), [Gitter](https://gitter.im/)
+and [GitLab](https://about.gitlab.com/) use KaTeX. Some of these websites
+don't even load their content with JavaScript disabled, much less render
+math.
+
+A skeptic might say that it is not possible to render LaTeX to HTML and
+CSS ahead of time. This might even have been true in the past. A user
+on the
+["can we replace client-side MathJax with server-side MathJax"](https://meta.mathoverflow.net/questions/2360/can-we-replace-client-side-mathjax-with-server-side-mathjax) thread on MathOverflow Meta in 2015 points out that MathJax doesn't support
+server-side HTML output:
+
+> I found [[the comment]](https://math.meta.stackexchange.com/questions/16809/a-mathjax-alternative-from-khan-academy#comment62132_16817) from a MathJax developer confirming that HTML+CSS doesn't work yet with the server-side version . . .
+
+The comment is even older, from 2014:
+
+> . . . [MathJax does not support] HTML-CSS at the moment . . .
+
+It's over, go home everyone. We are asking for the impossible.
+
+Or are we? Version 2.6 of MathJax has the following comment in its change log:
+
+> _Improved CommonHTML output_. The CommonHTML output now provides the same layout quality and MathML support as the HTML-CSS and SVG output. It is on average 40% faster than the other outputs and the markup it produces are identical on all browsers and thus can also be pre-generated on the server via MathJax-node.
+
+Further, the [HTML Support](http://docs.mathjax.org/en/latest/output/html.html)
+page from MathJax's docs states:
+
+> [CommonHTML] is MathJax’s primary output mode since MathJax version 2.6. Its major advantage is its quality, consistency, and the fact that its output is independent of the browser, operating system, and user environment
+
+So not only is it explicitly possible to have a server-generated math output,
+but the algorithm that would be used for generating such output is already
+in use on the client side! If you look hard enough, you may even find a few
+resources for using this algorithm server-side, but many of those suffer from
+another problem...
+
+### Images Won't Cut It
+
+It's tempting to convert mathematics to an image, such as a PNG or an SVG file.
+This approach is taken on [this blog](https://blog.oniuo.com/post/math-jax-ssr-example/) and this [Advanced Web Machinery](https://advancedweb.hu/mathjax-processing-on-the-server-side/) article. [Wolfram MathWorld](http://mathworld.wolfram.com/Convergent.html) also renders its mathematics to images. However, in my opinion,
+this is __not the right approach__. It is inconvenient for me as a user,
+and, I suspect, for those in need of assistive technologies. Here are
+the issues I see with rendering mathematics to images:
+
+* Images are impossible to use with copy/paste. I am unable to select a word,
+number, or symbol in a rendered image. I do this on occasion, and this is
+not a contrived issue.
+* Images are not nearly as responsive, and are difficult to style.
+Line breaking, fonts, and even colors are difficult to change when using images.
+They stick out like a sore thumb when used for inline math, and can
+look very strange (or disappear) if a user extension manipulates colors on the
+page.
+* Images are completely opaque to users in need of screen readers. While
+some readers support
+{{< sidenote "right" "mathml-note" "MathML," >}}
+MathML is an XML-based markup for mathematics, meant to serve as a low-level
+output target for other math processors. While it is supported in Firefox,
+it requires a polyfill in Chrome, and brings us back to front-end JavaScript.
+{{< /sidenote >}} images are much harder to work with. The sites
+that I linked that use images do not have captions with their math,
+making it completely inaccessible.
+* Using images instead of a JavaScript-based renderer reminds me of
+the [false dichotomy fallacy](https://www.logicallyfallacious.com/cgi-bin/uy/webpages.cgi?/logicalfallacies/False-Dilemma). Why can't we do what we already do,
+but on the server?
+
+### Where Have the Resources Gone?
+If you look up "mathjax setup" on Google, you are greeted with dozens
+of links telling you how to get the rendering working client-side. It's
+convenient: a single JavaScript `<script>` tag, and maybe a bit of code
+at the bottom of the page with some settings.
+
+If you look up "mathjax render server-side", the resources are far more
+scarce. In fact, the first two image-based solutions I presented to you
+came up as results after such a search. One more website looks promising:
+[Antoine Amarilli's blog post about server-side rendering](https://a3nm.net/blog/selfhost_mathjax.html).
+The page promises a self-hosted way of generating HTML using `mathjax-node`,
+and even provides a very compelling sample output. I decided to try
+out this approach, but to no avail. First of all, the MathJax infrastructure
+has changed:
+
+> As mathjax has reorganized their repositories, to make the following work, you will probably need to install manually mathjax-node-cli, as well as maybe installing mathjax-node and possibly mathjax-node-page. Again, I haven't tried it. Thanks again to Ted for pointing this out!
+
+This was indeed the case. The bigger issue, though, was that the `page2html`
+program, which rendered all the mathematics in a single HTML page,
+was gone. I found `tex2html` and `tex2htmlcss`, which could only
+render equations without the surrounding HTML. I also found `mjpage`,
+which replaced mathematical expressions in a page with their SVG forms.
+This actually looked quite good - the SVGs even had labels with their
+original LaTeX code. However, those labels were hard to read (likely
+especially so for people using screen readers), and the SVG images
+otherwise maintained most of the issues I described above. Additionally,
+{{< sidenote "right" "sluggish-note" "the page behaved significantly more sluggishly after the initial render than the JavaScript-based alternative." >}}
+This is purely anecdotal, and would require a more thorough analysis to
+generalize.
+{{< /sidenote >}}
+
+In short, it's much harder to find resources for server-side LaTeX rendering.
+It is especially hard to find such resources that work with more
+than static pages. I could probably use a regular expression to extract
+the math I need to render from HTML, call `tex2htmlcss` on it, and splice it back
+into the page. But how could a WordPress user render their math? What
+about someone writing their own blog engine like I did in my youth? Somehow,
+those of us wanting to give our users a better experience are left
+fumbling for an alternative, more or less without outside help...
+
+### Conclusion
+The majority of websites today use client-side, JavaScript-based rendering
+techniques for mathematics. The work that could be done by a server
+is outsourced to thousands of browsers, which have to run the same code
+to get __identical results__. Somehow, this is the best solution -
+the most accessible alternatives use images, which are a downgrade
+to the user experience. I wish for server-side rendering to become
+more common, and better documented.