Jekyll + MathJax + CMU Serif = Machine Learning Blog
After fighting battles with WordPress on and off for over a year, I decided to look for new blog software that would give me the following:
- Attractive \(\LaTeX\) markup that integrates seamlessly into posts
- The ability to embed Computer Modern fonts
- Powerful support for code snippets
Essentially, I wanted a blog that looks like a mathematics paper with the occasional appearance of some lines of code. A dream blog, if you will, for a mathematician, theoretical physicist, or data scientist.
Enter Jekyll, a worthy alternative to WordPress.
When Donald Knuth’s wonderful Computer Modern font is used, we get a blog with the potential to show mathematics properly. When equipped with MathJax, Jekyll beautifully renders both inline mathematics like \(I_p(x) = -\log p(x)\) and display-style mathematics:
\[H(p,q) = \sum_{x\in X} p(x)\, I_q(x).\]Code snippets also look great:
As expected, the setup was non-trivial, and there are challenges to typesetting mathematics with Jekyll that may be unfamiliar to those who have experience with LaTeX. I’ll review some of these issues here.
Jekyll
Jekyll is a Ruby gem (i.e. a library written in the Ruby language). Provided that you already have Ruby, RubyGems (Ruby’s standard package manager), GCC, and Make installed, then installing Jekyll is straightforward:
The first command uses RubyGems to install both Jekyll and Bundler, the latter of which is a package manager that, among other things, checks whether all gem dependencies are available for a given application. The second command creates a new local directory for a Jekyll website. After changing to that website’s directory, we can serve the website locally with the last command. Local serving gives us the opportunity to develop and debug our website.
The next step is to create an index.html
to verify the installation:
After we see our Hello World page, we can do one of two things. Either we can get a handle on general Jekyll website design by reading the amazing step-by-step tutorial, or we can git clone
a pre-designed theme and then repurpose it for our needs. I chose the latter, picking Jekyll’s Minima theme. After cloning it to a separate directory, I created a new directory called “Math-Minima” and copied the Minima project into Math-Minima.
Minima has the following directory structure, which is typical for a Jekyll theme:
These directories consist of the following:
_includes
: HTML and JavaScript for headers and footers; JavaScript for search engine optimization_layouts
: HTML templates for various types of pages_posts
: Markdown files for blog post content_sass
: Stylesheets for headers, footers, fonts, links, etc.assets
: A CSS directory in which the main stylesheet is built; other accoutrements for the website (images, video, etc.)script
: Shell scripts for building the website
After a Jekyll theme is completed, we build our website in three steps. First, in the theme directory we input to Bundler a .gemspec
file and get as output a .gem
file containing all the visual design information for our site. This is done with a command that reads something like gem build math-minima.gemspec
. Next, we move to the website directory and indicate our website’s gem dependencies (including the version of the theme) in the website’s Gemfile
. We then use the command bundle
to produce a new incarnation of our blog.
MathJax
MathJax is an open-source JavaScript display engine for LaTeX and other mathematics markup languages. One can find MathJax on content delivery networks all over the web. In particular, many variants of MathJax can be found at CDNJS.
To use MathJax, we place the following code in _includes/head.html
, immediately before <\head>
:
CMU Serif
To incorporate CMU Serif fonts, we work in the assets
and _sass
directories. In particular, we place a fonts
folder, consisting of CMU Serif fonts, inside assets
. This folder and the files in it will be referenced in a CSS file in the _sass
directory.
In _sass
, we have math-minima.scss
as our main stylesheet. This stylesheet imports partial stylesheets located in _sass/math-minima
. The original Minima theme comes equipped with _base.scss
, _layout.scss
, and _syntax-highlighting.scss
. The partial _fonts.scss
is our new addition. It contains @font-face
rules that upload CMU Serif to our end users.
To maximize compatibility among different browsers, it’s good practice to source as many file types as possible of the same font:
TeXing with Markdown
Markdown is a markup language that is used for the rapid production of rich text documents, such as Jupyter notebooks and Jekyll blog posts. Typesetting mathematics with Markdown is relatively straightforward, though there are some quirks that merit attention.
Inline mathematics is produced with two dollar signs instead of one. For example, $$\tilde G = \mathbb{R}^2\ltimes O(2)$$
gives us \(\tilde G = \mathbb{R}^2\ltimes O(2)\). Alternatively, one could also use \\(\tilde G = \mathbb{R}^2\ltimes O(2)\\)
. Note the double backslash. The first backslash is used to escape the second for the proper syntax.
Display mathematics may also be produced with two dollar signs, provided that a carriage return precedes the first pair of dollar signs and succeeds the second pair. Another option is to use double backslashes. For instance, \[ H(p,q) = -\int_X p(x)\, \log q(x)\, d\mu(x)\] is typeset via \\[ H(p,q) = -\int_X p(x)\, \log q(x)\, d\mu(x)\\]
, which mimics LaTeX’s shorthand for a display equation.
For aligned equations, we have blackslash explosion. For example, to produce
\[
\begin{align*}
r &= \begin{pmatrix} R & 0 \\ 0 & 1 \end{pmatrix} \\
t &= \begin{pmatrix} I & T \\ 0 & 1 \end{pmatrix}
\end{align*}
\]
we write
\\[
\begin{split}
r &= \begin{pmatrix} R & 0 \\\\ 0 & 1 \end{pmatrix} \\\\\\\\
t &= \begin{pmatrix} I & T \\\\ 0 & 1 \end{pmatrix}
\end{split}
\\]
Aside from these minor deviations from standard LaTeX, I haven’t run into any issues. These keyboard gymnastics seem like a small price to pay for the display quality.
Deployment
There are many ways to fire up a Jekyll website, and Jekyll’s deployment page addresses options for both manual deployment and automated deployment.
Once built from a Jekyll theme, a Jekyll website has the following directory structure:
A simple Jekyll website development workflow is as follows. We can run jekyll serve
in our website’s root directory, open the website by typing localhost:4000
in our browser, edit our Markdown files, then view our changes by refreshing the website. The Markdown files for blog posts are found in _posts
. Each time we save a change to a Markdown file, the Jekyll
server will “regenerate” the website:
Here, regeneration just means that the newly saved Markdown files are used to produce blog content as HTML files. These files are saved in _posts/_site
. The website content is stored in _site
. (Note: not _posts/_site
.)
After blog content is written and vetted, we need to put it up for other people to see. There is no shortage of methods for doing this, and the choice of method depends upon a developer’s needs.
In my case, I have a web server that was configured for use with Wordpress. When I first purchased it, I didn’t worry about server hardware, server software, root access, and so forth. Thus I ran into a buzzsaw of complications when I looked into deployment options.
Perhaps the simplest method in my case is to copy the contents of _site
to ~/public_html
on my web server. This, of course, is a boring, manual solution. Moreover, I was condemned to use FileZilla in order to get _site
up to the server, which didn’t appeal to me.
I decided to use Git to automate the deployment. I set up a bare Git repository on my web server. Next, I set up a Git repository in my local website’s root folder. I added the bare repo to my local repo as a remote, then pushed from the local repo to the bare remote.
It turns out that it’s not best practice to push to remote repositories that are not bare. In fact, Git will impede us from doing so! Assume that we’ve initialized a non-bare repository on a remote server, which is then added as a remote, called origin
, to a local repository. If we git push origin master
, then Git responds with:
The lesson here is that Git prefers that we not push into a non-bare repository whose checked out branch matches the checked out branch of the local repository. However, if the checked out branch of the remote is not the branch that corresponds to the content that we want to show, then we can’t deploy our content. On the other hand, if our remote repository is bare, then it won’t contain our website information! In particular, we won’t be able to access _site
from our remote repository.
One technique for dealing with this issue is to set up a second repository on the web server. This second repo will be non-bare. The repo can git pull
from the bare repository each time new content is pushed from the local repo to the bare remote repo. Git’s hooks can automate these pulls.
In this case, we need a post-receive hook. This will be a Bash script that indicates the location of the work tree in the non-bare repo, then issues the pull command for the non-bare repo:
The final piece in this puzzle is to symbolically link public_html
to _site
, which in effect moves the contents of _site
to public_html
. At last, we can develop locally, then publish our blog content with a simple git push origin master
command.
Further Comments
A more sophisticated workflow employs a post-receive hook to automate the Jekyll build process. This is covered in Jekyll’s Automated Deployment page. In my case this doesn’t work, because my server has outdated versions of Ruby and RubyGems, and I lack the root access needed to install newer versions. It’s also not necessary for me to build on the server.
Another option is to deploy directly from GitHub by using GitHub Pages, but I’m not sure how to incorporate CMU Serif using this method. I also like the idea of having my name in my URL, and GitHub Pages doesn’t permit this.
The most sophisticated and flexible deployment method seems to use a continuous integration tool such as Jenkins, Travis CI, or CircleCI. These tools, however, are meant for scalable deployment and seem to be overkill for my purposes.
I really enjoyed this exploration, as it gave me a chance to dig a little bit into front-end development. I’m definitely looking forward to blogging about mathematics with a proper font and a proper typesetting framework!