The Lindy blog engine
Ever since reading Nassim Nicholas Taleb's Antifragile, I've been fascinated by the Lindy effect and how it relates to software. From Wikipedia:
The Lindy effect is a theorized phenomenon by which the future life expectancy of some non-perishable things, like a technology or an idea, is proportional to their current age. Thus, the Lindy effect proposes the longer a period something has survived to exist or be used in the present, it is also likely to have a longer remaining life expectancy.
The Analog Moment blog engine
This blog has always been powered by a bespoke blog engine. Through its life, it has been through numerous technological shifts, and it has typically been a programming playground where I indulged technologies and patterns I wanted to learn but couldn't justify in a professional context. Over the years, the blog has been ported from CoffeeScript to ES6 to TypeScript, and from Capistrano to Docker to Heroku.
The running joke is that I've spent far more time rewriting the engine than I have blogging. I've been happy to justify this, as I enjoy the learning experience I get from the experimentation, but I finally decided I wanted to do less blog engine development and more actual blogging. However, one nagging opportunity for procrastination remained.
Maintaining this blog incurred development costs beyond those imposed by my self-inflicted re-writes: namely, the security rot of deployed code and the corresponding breakage caused by upgrades. Node.js versions go out of date, operating systems and Docker images need updating, and that's before we say anything about NPM package breakage. Keeping a deployed service secure and up to date was surprisingly demanding, even with an extensive test suite.
So I thought to myself - can the Lindy effect give me a framework for building a blog engine that doesn't rot?
The Lindy re-write
The basic idea was to resist all urges to embrace hot new technologies, and instead prefer choosing older technologies which have remained relevant, with the theory being this would reduce the probability of the underlying technology either changing dramatically or becoming unmaintained.
Static rather than dynamic
An early realisation was that a big way to avoid exposure to technologies that might require maintenance was to reduce the amount of runtime code. Analog Moment had always used an express.js server to pull blogs from a redis data store at runtime, performing rendering on the fly. However, the amount of content on the site and the frequency of new posts means an upfront rendering of all the pages on the site in a single build step is a viable option. This rendering could produce a static directory of HTML files, which then just needs hosting somewhere.
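That upfront build step is small enough to sketch. The following is an illustrative, stdlib-only Python sketch rather than the engine's actual code; the directory layout and the bare HTML shell are my assumptions:

```python
from pathlib import Path


def build_site(posts_dir: Path, out_dir: Path) -> None:
    """Render every post up front into a static directory of HTML files."""
    out_dir.mkdir(parents=True, exist_ok=True)
    for post_path in sorted(posts_dir.glob("*.md")):
        slug = post_path.stem  # e.g. posts/nice-post.md -> /nice-post.html
        body = post_path.read_text(encoding="utf-8")
        # A real engine would run a Markdown renderer here; this sketch
        # just wraps the raw text in a minimal HTML shell.
        html = f"<!doctype html>\n<article>{body}</article>\n"
        (out_dir / f"{slug}.html").write_text(html, encoding="utf-8")
```

The output is just files on disk, which is what makes the hosting side so interchangeable.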
express.js was originally released in 2010, giving it a Lindy lifespan of 11 additional years. Serving static HTML files goes back to 1993, giving it a Lindy lifespan that would almost see me through till retirement.
Static HTML also has the benefit of simplifying my deployment requirements - numerous hosts offer static HTML serving, so I need not worry about being locked into a vendor and having to maintain extensive vendor-specific deployment code.
Site generation frameworks
There are lots of off-the-shelf static site generators, but which does best on the Lindy test? Gatsby (2015), Hugo (2013) and Next.js (2016) are popular but newer than express.js. Jekyll (2008) fares a little better, but still only promises 13 years of Lindy life, which is less than I'd like for something that I'd be coupling my blog posts to.
Fine, I thought, I'll build one myself - how hard can static HTML generation really be? My requirements are simple and I'm only building a tool for myself. As long as I stick to the built-ins of the language, I should be able to avoid coupling myself to technologies that are likely to require too much maintenance.
I've been looking for an excuse to learn Go, but having first appeared in 2009 it's too young. Next I considered Ruby, a language I'm familiar with and have great fondness for. Ruby was first released in 1995, which is not bad, but can we do better?
How about C? OK, it was originally released in 1972 and remains used today, so it scores well on the Lindy test, but there's no way it's at the appropriate level of abstraction for the task at hand.
I eventually settled on Python. It's boring, but it remains popular and it dates back to 1991, promising 30 more years of Lindy goodness!
The posts for Analog Moment are all written in Markdown, which dates back to 2004. Not quite as good as the tools we've chosen from the mid 90s, but probably an acceptable choice as it's what I'm already using and it remains ubiquitous.
But what of the additional metadata that needs storing about posts (slugs, timestamps, title)? How should I package those up with the posts? The pre-existing implementation used JSON, whose RFC dates to 2006. YAML is slightly older, first released in 2001, but it requires installing a third-party package to use in Python, which didn't seem worth it to attach three fields to some Markdown.
In the end, I decided to define my own template format to avoid coupling myself to anything. The slug is read from the filename, then each file looks like this:

title: Nice post
timestamp: 2021-09-07T19:53:44+00:00
body:
post content goes here
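Parsing a format like that needs nothing beyond the standard library. Here's a minimal sketch under my own assumptions about the format (the Post dataclass and function names are mine, not the engine's):

```python
from dataclasses import dataclass


@dataclass
class Post:
    title: str
    timestamp: str
    body: str


def parse_post(text: str) -> Post:
    """Split the header fields from the body, then read key: value pairs."""
    header, _, body = text.partition("body:")
    fields = {}
    for line in header.strip().splitlines():
        # partition splits on the first colon only, so timestamps
        # containing colons survive intact
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return Post(title=fields["title"], timestamp=fields["timestamp"], body=body.strip())
```

A hand-rolled parser like this is only defensible because the format has exactly three fields and one consumer.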
Finally, though, I've got to actually convert that Markdown to HTML, and this ended up being the weakest part of the stack from a Lindy perspective. I initially used python-markdown (2008, possibly earlier) but switched to commonmark.py (2015) as a renderer. I switched because commonmark.py is designed to conform to the popular CommonMark variant of Markdown, which seems like the pragmatic bet for long-term maintainability. It also had type annotations, which python-markdown did not.
To avoid deeply coupling my code to any particular library, I implemented a small wrapper class around the library API so that switching Markdown renderers becomes a drop-in replacement (the validity of this approach was proven when the switch turned out to be trivial).
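The seam can be as small as a Protocol plus one adapter class. This sketch assumes commonmark.py's one-shot commonmark() helper is the call being wrapped; the names here are illustrative, not the engine's:

```python
from typing import Protocol


class Renderer(Protocol):
    """The only Markdown interface the rest of the engine sees."""

    def render(self, markdown_text: str) -> str: ...


class CommonMarkRenderer:
    """Adapter around commonmark.py (illustrative wiring)."""

    def render(self, markdown_text: str) -> str:
        # The lazy import keeps the third-party coupling inside this class.
        import commonmark

        return commonmark.commonmark(markdown_text)


def render_post_body(renderer: Renderer, body: str) -> str:
    # Call sites depend on the Protocol, never on commonmark directly,
    # which is what makes swapping renderers a one-class change.
    return renderer.render(body)
```

Swapping python-markdown for commonmark.py then means writing one new adapter class and changing nothing else.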
Things fell apart a little with correctness enforcing tools. I chose pytest (2009) as a test runner. I probably could have gotten away without this, but I couldn't resist the convenience.
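As an illustration of what that convenience buys, here's a pytest-style check of the slug rule described earlier; the helper function and test names are hypothetical:

```python
# test_engine.py -- run with `pytest` (illustrative names throughout)


def slug_from_filename(filename: str) -> str:
    """The rule under test: the slug is the filename minus its extension."""
    return filename.rsplit(".", 1)[0]


def test_slug_drops_the_markdown_extension() -> None:
    assert slug_from_filename("nice-post.md") == "nice-post"


def test_slug_keeps_internal_dots() -> None:
    assert slug_from_filename("python-3.10-review.md") == "python-3.10-review"
```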
And I broke the rules pretty badly for linters, using flake8 (2010), mypy (2012) and black (2018!). However, this felt justifiable given that none of these tools actually provide any of the functionality of the blog engine; they only support the quality and correctness of the code. If any of them were to become unmaintained or stop working, the blog engine would remain functional (although I'd likely seek a replacement).
CI and deployment
I broke the rules again here. Because the core output of my tool is a directory of HTML and therefore extremely portable, I chose to treat the coupling to my deployment as disposable. In the end I chose Google's Firebase hosting, as it's astonishingly simple to deploy to and provides a CDN. However, I anticipate few challenges if I'm forced to migrate to a different host, as hosting for static HTML is widespread and unlikely to disappear any time soon.
I'm using GitHub Actions as my CI runner, which again I'm treating as disposable - all the CI does is run the test and linting scripts, then trigger the deployment. The whole pipeline is less than 100 lines of code.
Non-Lindy strategies against code rot
In attempting to build a blog that would be easy to support in the long term, I also made some decisions that weren't informed by Lindy.
I've long been a Test Driven Development (TDD) fan, and I'd seen the benefit of having a comprehensive test suite while evolving the previous version of the blog engine, so an extensive test suite was a no-brainer. The re-write has both fully-integrated tests and fully-isolated unit tests.
I chose to enforce type annotations in the Python code with mypy's "strict" mode. Type annotations were only specified in PEP 484 in 2014 and are completely optional in the language. Additionally, how community adoption of annotations will fare in the long term remains unknown, so this is a slightly speculative bet on my part.
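To give a flavour of what strict mode demands, here's a sketch of the sort of fully annotated helper it enforces (the function itself is hypothetical): every parameter and return type must be declared, and an Optional return forces every caller to handle the None case explicitly.

```python
from typing import Optional


def find_title(header_lines: list[str]) -> Optional[str]:
    """Return the title field from a post header, or None if absent.

    Under mypy --strict, untyped defs are rejected and callers cannot
    treat this result as a plain str without checking for None first.
    """
    for line in header_lines:
        if line.startswith("title:"):
            return line.removeprefix("title:").strip()
    return None
```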
Building rather than "buying"
In response to not finding tools that meet my Lindy test, my default answer has been "can I build this myself?". Now, arguably this is anti-Lindy, as I'm forgoing older, tested code in favour of non-existent code.
However, when writing the code myself, I am writing only for myself. Pairing TDD with YAGNI, I write only enough code to meet my own requirements, and the code which I write is fully tested and meets my linting standards. I am not attempting to generalise anything for anyone else's use cases. This means fewer lines of code to maintain.
The downside is that I'm getting nothing for free. If I want a sitemap, an RSS feed or a search feature, I'm going to have to build it myself. If I'd chosen to use a static site generator, these features would probably all have been built in.
I finished the re-write at the tail end of 2021 and chances are the page you're reading was rendered by that engine. You can see the source code yourself here:
At the time of writing I have no idea if this experiment has succeeded; only time will tell. The engine works, and it feels small but perfectly formed. There are some functionality gaps, and maybe I'll come to resent the effort that would be required to add, for example, pagination to the archive page. I'm intrigued to see how easy it will be to jump to newer versions of Python.
But crucially, I have one less excuse for not writing more blog posts. If all goes well, I'll still be publishing here using the same engine in 20 years' time.