Last night I went to see the Nova Twins in Bristol. We were super-excited; Nova Twins were on our “love to see” list. And we were not disappointed: it was excellent, intense and exhilarating, just as rock-metal should be.
🤘
Take the chance to see them if they play near you.
Simon Højberg expresses a sentiment I think I agree with. I’m pretty sure that I’d find agent baby-sitting much less fun than writing code.
LLMs seem like a nuke-it-from-orbit solution to the complexities of software. Rather than addressing the actual problems, we reached for something far more complex and nebulous to cure the symptoms. I don’t really mind replacing sed with Claude or asking it for answers about a library or framework that, after hours of hunting through docs, I still seek clarity. But I profoundly do not want to be merely an operator or code reviewer: taking a backseat to the fun and interesting work. I want to drive, immerse myself in craft, play in the orchestra, and solve complex puzzles. I want to remain a programmer, a craftsperson.
This is all about doing live migration of VMs that have attached local storage. So the storage needs to move alongside the compute — and it has to physically move, block by block, from the old hypervisor’s local disk to the new hypervisor’s local disk. How do you do that without a horrible stop-the-world for your customers’ applications?
I always wondered how this was done, and this post gives the shape of one approach to the problem. Enjoyed.
The Linux feature we need to make this work already exists; it’s called dm-clone. Given an existing, readable storage device, dm-clone gives us a new device, of identical size, where reads of uninitialized blocks will pull from the original. It sounds terribly complicated, but it’s actually one of the simpler kernel lego bricks. Let’s demystify it.
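To make that description concrete for myself, here is a toy Python model of the read-through behaviour the quote describes. It is only a sketch of the semantics, not how the kernel implements dm-clone: the `CloneDevice` class, its method names and the block-at-a-time copying are all made up for illustration, and it assumes every write covers a whole block.

```python
class CloneDevice:
    """Toy model of a clone device; hypothetical, not the kernel's dm-clone code."""

    def __init__(self, source):
        self.source = source                   # existing, readable device (a list of blocks)
        self.dest = [None] * len(source)       # new device of identical size
        self.hydrated = [False] * len(source)  # which blocks have been copied or written

    def read(self, i):
        # Reads of uninitialized blocks pull from the original device.
        if self.hydrated[i]:
            return self.dest[i]
        return self.source[i]

    def write(self, i, data):
        # Assumption: a write replaces the whole block, so the original
        # copy of that block never needs to be fetched.
        self.dest[i] = data
        self.hydrated[i] = True

    def hydrate_one(self):
        """Copy one not-yet-copied block; return False once nothing is left."""
        for i, done in enumerate(self.hydrated):
            if not done:
                self.dest[i] = self.source[i]
                self.hydrated[i] = True
                return True
        return False


old_disk = [b"a", b"b", c := b"c"]
dev = CloneDevice(old_disk)
dev.write(1, b"B")            # the application keeps writing during the move
while dev.hydrate_one():      # background copy runs until every block is local
    pass
```

The point of the shape is that the new device is usable immediately: reads fall back to the old disk until the background copy catches up, which is what avoids the stop-the-world.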
In ToyKV compaction: it finally begins!, I noted that I’d finally started writing a simple compactor for ToyKV, a key/value store I’ve been writing (to learn about Rust and writing databases) based on an LSM-tree structure. The idea is to have a working database storage engine, albeit not a particularly sophisticated one.
A really important piece of an LSM-tree storage engine is compaction. Compaction takes the many files that the engine produces over the course of processing writes and reduces their volume to improve read performance: it drops old versions of writes, and reorganises the data to be more efficient for reads.
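As a rough illustration of the idea (in Python for brevity, and not ToyKV's actual Rust code), here is a minimal sketch of compacting several sorted files into one. The `compact` function and the in-memory lists standing in for on-disk files are assumptions for the example; it also assumes each file holds at most one entry per key and ignores deletes.

```python
import heapq


def compact(tables):
    """Merge sorted (key, value) tables into one, keeping the newest value per key.

    `tables` is ordered newest first; each table is sorted by key.
    """
    # Tag every entry with its table's age so that, for equal keys,
    # the entry from the newest table sorts first.
    runs = [
        [(key, age, value) for key, value in table]
        for age, table in enumerate(tables)
    ]
    merged = heapq.merge(*runs, key=lambda entry: (entry[0], entry[1]))

    out = []
    for key, _age, value in merged:
        # The first entry seen for a key is the newest; older versions are dropped.
        if not out or out[-1][0] != key:
            out.append((key, value))
    return out


newer = [("a", "2"), ("c", "9")]
older = [("a", "1"), ("b", "5")]
result = compact([newer, older])
```

Here `result` holds one entry per key in sorted order, with the older write to `"a"` gone, so a later read only has to look in one place.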
I’d avoided working on this because getting a first version built was a large chunk of code. As I mentioned in the post above, by breaking down the task I was able to take it on step by step. And, indeed, Simple compaction v1-v7 by mikerhodes #25 is both large (2,500 new lines of code) and proceeds in a step-by-step manner.
Now let’s talk about a few of the bits of code I’m most happy with. Nothing’s perfect, but I tried to lay a good grounding for adding more sophisticated compaction algorithms later.
Simon Willison links to an idea that I immediately fell in love with, and will obviously never use. But that someone has done it makes me somehow happier.
I Saved a PNG Image To A Bird. Benn Jordan provides one of the all time great YouTube video titles, and it’s justified. He drew an image in an audio spectrogram, played that sound to a talented starling (internet celebrity “The Mouth”) and recorded the result that the starling almost perfectly imitated back to him.
Benn himself further says:
Hypothetically, if this were an audible file transfer protocol that used a 10:1 data compression ratio, that’s nearly 2 megabytes of information per second. While there are a lot of caveats and limitations there, the fact that you could set up a speaker in your yard and conceivably store any amount of data in songbirds is crazy.