Author: @GrammAcc
The Tech Stack of grammacc.dev
I felt like an appropriate topic for my first blog post on this site would be the site itself, so I'm going to delve into the tooling and workflow I use to build and maintain it.
Miss me with that framework
I'm a backend developer, and I find the complexity of the modern frontend ecosystem exhausting. I've recently started using a no-framework solution for my personal projects when I have to build a web frontend, and I decided to try it out for this site as well.
Modulr is a Python script I wrote that stitches together HTML fragments in order to facilitate modular web pages in raw HTML. I'll write a separate article explaining how Modulr components work in detail and the motivation behind that project at some point, but the tl;dr is that this tool allows me to extract common sections of the site like the header, footer, and nav into reusable components without the runtime cost of native web components or the complexity of a frontend component framework.
A Modulr component is an HTML file with an optional JS/TS script and/or CSS stylesheet associated with it. The modulr.py script looks for structured comments in an HTML page and replaces them with the contents of the corresponding component file along with links to the associated resources. For example, if I had the HTML file components-dir/nav.html:
<nav>
<button class="nav-btn">Home</button>
<button class="nav-btn">About</button>
</nav>
I could write the HTML page source-dir/index.html:
...
<body>
<header>
<!-- modulr-component :nav: -->
</header>
...
</body>
...
And the modulr.py script would produce the output HTML file output-dir/index.html:
...
<body>
<header>
<nav>
<button class="nav-btn">Home</button>
<button class="nav-btn">About</button>
</nav>
</header>
...
</body>
...
If I had a nav.{ts|js|mts|mjs} script file next to the nav.html component file, then an appropriate <script> tag would be included in the produced output as well, so I can easily create modular web pages with static HTML and JS/TS.
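To make that idea concrete, here is a minimal sketch of the kind of substitution modulr.py performs. This is not the actual implementation, just an illustration of the technique; the component directory and comment format come from the example above, but the exact <script> tag it emits is an assumption:

# sketch.py - illustrative only, not the real modulr.py
import re
from pathlib import Path

COMPONENT_DIR = Path("components-dir")  # matches the example above
PATTERN = re.compile(r"<!-- modulr-component :(\w+): -->")


def expand_page(page_html: str) -> str:
    """Replace each structured comment with the named component's HTML."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        html = (COMPONENT_DIR / f"{name}.html").read_text()
        # If a script file sits next to the component, link it from the page.
        # (The exact tag the real tool emits is an assumption here.)
        for ext in (".mjs", ".js"):
            if (COMPONENT_DIR / f"{name}{ext}").exists():
                html += f'\n<script type="module" src="/{name}{ext}"></script>'
                break
        return html

    return PATTERN.sub(substitute, page_html)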
Because modulr.py only parses HTML files, it doesn't interfere with any of the other tooling in a typical Node project, so using it with Tailwind, TypeScript, bundlers, or anything else is simple. Also, because it only does one thing, it's easy to see where its limitations are and when I need to write something else to automate a task. For example, the Articles button in the top left corner of the site opens a dynamically populated menu with all of the articles I've written, sorted newest to oldest. At the time of writing, there's only one item in that menu, but I promise it's dynamic and sorted.
In order to dynamically populate the menu without a backend or database, I needed some way to store metadata about all of the static pages on the site and then access that metadata from JavaScript at runtime. I wrote the following Python script to do exactly that:
# ./mkmeta.py
#!/usr/bin/env python3.12

from pathlib import Path
import json
import sys

from bs4 import BeautifulSoup as BS

# Default to the site source directory, but allow overriding it on the command line.
rootpath = Path("src")
try:
    rootpath = Path(sys.argv[1])
except IndexError:
    pass

output: list[str] = []
for root, dirs, files in rootpath.walk(on_error=print):
    # Component fragments aren't pages, so skip the components directory entirely.
    if "components" in dirs:
        dirs.remove("components")
    filepaths = [root / file for file in files if file.endswith(".html")]
    for fp in filepaths:
        with open(fp, "r") as html_file:
            soup = BS(html_file, "html.parser")
            metadata = soup.find_all("meta")
            # Collect every <meta itemprop="..." content="..."> pair on the page.
            new_dict = {
                i.attrs["itemprop"]: i.attrs["content"]
                for i in metadata
                if "itemprop" in i.attrs and "content" in i.attrs
            }
            # Record the page's URL relative to the site root.
            url = str(fp).removeprefix(str(rootpath))
            new_dict["url"] = url
            output.append(json.dumps(new_dict))

print("\n".join(output))
The above script parses the <meta> tags in every page and creates a list of JSON-formatted object strings representing each page's metadata, including the URL to the page from the site root. The script then prints these JSON-encoded strings to stdout joined with newlines, which lets me pipe the structured metadata into other programs. This is useful since I can easily use jq to filter the output and send different kinds of pages to other programs for further processing. In particular, I wrote this additional Python script to parse the metadata for articles into a structure that I can use in JavaScript:
# ./mkpagedb.py
#!/usr/bin/env python3.12

import sys
import json

json_data = []
if sys.stdin.isatty():
    # No pipe attached: fall back to reading the JSON lines from the first argument.
    try:
        input_data = ",\n".join(sys.argv[1].splitlines(False))
        print(input_data)
        json_data = json.loads("\n".join(["[", input_data, "]"]))
    except IndexError:
        print("Error: no input provided")
        raise SystemExit
else:
    # Normal case: JSON lines piped in on stdin; wrap them in brackets to form an array.
    input_data = ",\n".join(sys.stdin.read().splitlines(False))
    json_data = json.loads("\n".join(["[", input_data, "]"]))

output_lines = [
    "export const ARTICLES = [\n",
]
for i in json_data:
    object_lines = json.dumps(i, indent=2).splitlines(False)
    for line in object_lines:
        new_line = line
        # Add trailing commas so the objects form a valid JS array literal.
        if line != "{" and not line.endswith(","):
            new_line = "".join([line, ",", "\n"])
        else:
            new_line = "".join([line, "\n"])
        output_lines.append("".join([" ", new_line]))
output_lines.append("]")
print("".join(output_lines))
This script accepts a JSON-formatted string as its input and prints valid JavaScript as its output. I can use these simple scripts together like so:
python mkmeta.py | jq -c 'select(.category == "article")' | python mkpagedb.py > src/pagedb.mts
Then, in the frontend script that populates the articles menu, I can just import the generated array:
import { ARTICLES } from "/pagedb.mjs"
You might be thinking that it's unlikely that this particular workflow would change much, so I could just handle the entire thing in a single Python script and cut out the command-line middleman. You'd be right! But filtering the JSON in Python would be more code and less readable than the jq solution.
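For comparison, here's a rough sketch of what that filtering step might look like if I did it inline in Python instead of with jq (assuming the same JSON-lines input that mkmeta.py prints; this isn't a script I actually use):

# filter_articles.py - hypothetical replacement for the jq step, for comparison only
import json
import sys

for line in sys.stdin:
    if not line.strip():
        continue
    page = json.loads(line)
    if page.get("category") == "article":
        # Re-emit the record as a compact JSON line, like jq -c would.
        print(json.dumps(page))

It's not a lot of code, but it's one more file to maintain, and the jq expression states the same thing right there in the pipeline.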
Also, with this pipe-based approach, I can generate metadata arrays for other kinds of pages with minimal changes to these scripts. For example, if I decided I wanted to also host toy SPA applications on this site in addition to blog articles, I could build and serve them statically at specific page URLs, use jq to filter the first script's output on category == "spa" or something like that, and then pipe that output into a different Python script that builds the metadata structure I need for the SPA pages.
Ultimately though, these scripts are small and simple enough that even if I have to completely rewrite them at some point, that will only take an hour or so.
Performance concerns
One thing that I was worried about with this approach was that the dynamic articles menu would slow down page loads since I was generating the links and populating the menu dynamically at runtime. Obviously, the performance requirements of a personal blog are pretty low, but I want my pages to load as instantly as possible. With this in mind, I wrote a Quick N' Dirty ™ script to generate an arbitrary number of articles and rebuilt the site to see how it performed when served from my local Nginx server.
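The generator isn't worth reproducing exactly, but it was something in the spirit of this sketch (the directory, file names, and meta tags here are placeholders, not the real script):

# gen_fake_articles.py - throwaway sketch for stress-testing the articles menu
import sys
from pathlib import Path

TEMPLATE = """<!DOCTYPE html>
<html>
  <head>
    <meta itemprop="category" content="article">
    <meta itemprop="title" content="Fake Article {n}">
    <meta itemprop="published" content="2023-01-{day:02d}">
  </head>
  <body><h1>Fake Article {n}</h1></body>
</html>
"""

count = int(sys.argv[1]) if len(sys.argv) > 1 else 100
outdir = Path("src/articles")
outdir.mkdir(parents=True, exist_ok=True)
for n in range(count):
    (outdir / f"fake-article-{n}.html").write_text(TEMPLATE.format(n=n, day=n % 28 + 1))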
There was no performance impact at 100, 1,000, and 10,000 articles, but I ran into a second or two of lag when opening the articles menu immediately on page load at 100,000 articles. There was also an increase in the build time for the project, but it was still less than 30 seconds total at 100,000 articles on my machine. It turns out my expectations for JavaScript's performance have been corrupted by the performance of similar operations in bloated frontend frameworks. This is a welcome surprise.
Obviously, this will vary between client machines and browsers, but if there is ever a point in my life when I've written 10,000 blog posts, I'll be unemployed. So I don't think performance will ever become a concern for this site with this approach.
But wait...
Isn't that just a static site generator?
Yes, for the most part. Is that a problem?
I think one of the major contributing factors to the explosion of complexity in web development over the last decade is that engineers are unwilling to write their own tools. We're taught that we shouldn't reinvent the wheel, so it makes sense to just use a FOSS library or tool to solve the problem instead of writing our own bespoke solution, but there is more to consider when making these kinds of decisions.
First of all, taking on any kind of dependency is an inherent risk, and with the way that transitive dependencies balloon in modern web stacks, it's not feasible to sufficiently audit all of the third-party code that we include in our projects. This being the case, the use of any third-party tool should be very carefully weighed against its benefits.
The second problem with this kind of thinking is that third-party tools are generalized solutions. It could be that our use case is so simple that 50-100 lines of Python could solve the problem, but we usually still choose to include hundreds of transitive dependencies and a full suite of tools that solve every possible permutation of our problem in an effort to avoid duplicating work. In the process, we create a huge amount of additional maintenance work for ourselves. Application developers who have never done any library development are especially susceptible to this fallacy: taking a personal, specialized tool and generalizing it for use by other developers teaches you a lot about the strengths and weaknesses of third-party libraries. Sometimes, it's better to write a specialized solution that is simple enough to rewrite completely when needed than to use a generalized third-party solution that requires you to structure your workflow around it.
But SSGs are simple!
I agree! Static site generators are great for prosaic sites that just need to get content out quickly. They are also awesome for working with non-technical content creators. The problem is that each one is its own ecosystem with its own ways to alter the layout, styles, and interactivity of the site.
If I were working on a company project where other developers would have to onboard later to maintain my work and/or non-technical team members would have to create content for the site, I would definitely use an SSG or CMS. For a small personal project though, rolling my own stupid-simple solution is a lot easier.
I originally planned to use Pelican for this site. It has a really simple workflow, and it allows customizing the site with Jinja templates instead of some CMS-specific format, so for someone like me who's used to spinning up sites with Flask, it seemed like the perfect tool. Unfortunately, I quickly found that learning all of the config options I needed to build the site the way I wanted and then putting together the Jinja templates for a theme was more mental overhead than I wanted to put into a little blog site. Pelican is an awesome SSG, and I recommend it to anyone who is looking for a simple but customizable solution to static site generation, but for this site, rolling my own solution was just faster.
Conclusion
I doubt that this kind of bespoke static website parsing workflow would scale to the kinds of applications that most companies are trying to make. But at the same time, I think it's important to realize that most of us aren't Netflix.
Scalability is hard. It's hard because scaling a system ultimately involves making architectural changes to the system. Even if you choose an architecture that is supposedly scalable, you will have to make some changes to it when it comes time to scale. No matter how much experience an engineer has, they can't possibly know every use case that they will have to deal with at scale. This means that some assumptions have to be made up front, and those assumptions are always wrong in at least some of the eventual use cases of the application.
At the end of the day, the easiest architecture to scale is the one that is so damn simple that we can just throw it out completely and build something else that fits the newly clarified requirements.