This is a brief tale, told mostly through links, about subtlety. And fsync, though perhaps the two are synonymous.

While I’m writing about this in September, the events actually happened back around March; I intended to write this up back then, but somehow it just never happened.

Earlier this year, I read NULL BITMAP Builds a Database #2: Enter the Memtable. At the end, Justin Jaffray mentions a potential sad path when the database you are coding up (as one does) crashes. Here, we are talking about whether the database can accidentally lie to a reader about whether a write is on-disk (durable):

I do a write, and it goes into the log, and then the database crashes before we fsync. We come back up, and the reader, having not gotten an acknowledgment that their write succeeded, must do a read to see if it did or not. They do a read, and then the write, having made it to the OS’s in-memory buffers, is returned. Now the reader would be justified in believing that the write is durable: they saw it, after all. But now we hard crash, and the whole server goes down, losing the contents of the file buffers. Now the write is lost, even though we served it!

The solution is easy: just fsync the log on startup so that any reads we do are based off of data that has made it to disk.
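To make the order of operations concrete, here is a minimal Python sketch of that startup step. It assumes a hypothetical append-only log file that already exists; the function name `fsync_log_on_startup` and the file layout are illustrative only, and are not taken from Justin's series or from any real database.

```python
import os


def fsync_log_on_startup(log_path: str) -> bytes:
    """Fsync an existing log file, then return its contents for replay.

    Hypothetical helper: the name and the single-file log are assumptions.
    """
    # Opened read-write because some platforms reject fsync on a
    # read-only descriptor.
    fd = os.open(log_path, os.O_RDWR)
    try:
        # Ask the OS to push anything still held in its in-memory
        # buffers for this file down to disk before we look at it.
        os.fsync(fd)

        # Only after the fsync returns do we read the log, so anything
        # we later serve to a reader has already been flushed.
        chunks = []
        while True:
            chunk = os.read(fd, 64 * 1024)
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)
```

If `os.fsync` raises `OSError`, the sketch lets the error propagate rather than going on to serve reads from a log that might not be on disk.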

If you’re anything like me, that will take you at least three reads to get the order of events straight in your head. But once I did, it felt right to me. Since I work on a database, I thought I’d ask the team whether we did that. I was pretty sure we did, but it’s part of my job to double-check these things when I come across them.

Herewith, the story and the warning about subtlety.


In June Journal, I left off with a thought to make toykv thread-safe and to implement simple compaction. I did get to thread-safety, which took a while, but I didn’t get to compaction. I did make several other improvements, however.

I also did some work on ai-codeexplorer. Here, I now have a work-in-progress UI using Textual, a Python terminal UI toolkit. I haven’t merged it to master yet as it’s incomplete. It was interesting to work with an event-driven UI toolkit again; it’s been quite a long time since I’ve done that.

Overall, it wasn’t as productive as previous months, but school holidays do that to a person; there are plenty of other things to do!


Link
Enough With All The Raft

I really enjoyed this transcript of a talk. It’s an exploration of the design space of distributed consensus by Alex Miller, who I first came across several years ago when Cloudant were looking at FoundationDB; Alex was one of the primary authors. He knows his stuff. There are several great papers linked in the post.

This talk is an extension of my earlier Data Replication Design Spectrum blog post. The blog post was the analysis of the various replication algorithms, which concludes with showing that Raft has no particular advantage along any easy analyze/theoretical dimension. This builds on that argument to try and persuade you out of using Raft and to supply suggestions on how to work around the downsides of quorum-based or reconfiguration-based replication which makes people shy away from them.

Enough With All The Raft


Maggie Appleton talks about something close to my heart: how can we use AI to help us think, rather than do the thinking for us?

But can’t we add a smidgeon of the harsh professor attitude into our future assistants? Or at least the option to engage it?

Sure, we can do this manually, like I did with Claude. But that’s asking a lot of everyday users. Most of whom don’t realise they can augment this passive, complimentary default mode. And who certainly won’t write the optimal prompt to elicit it – one that balances harsh critique with kindness, questions their assumptions while still being encouraging, and productively facilitates a challenging discussion. Putting the onus on the user sidesteps the problem.

Professor Bell and I are both frustrated that there is no hint of this critical, questioning attitude written into the default system prompt. Models are not currently designed and trained with the goal of challenging us and encouraging critical thinking.

A Treatise on AI Chatbots Undermining the Enlightenment

I really enjoy her “harsh professor” prompt, and am considering whether to use it in Rapport.


Post
June journal: toykv & zed, redux

At the end of Late May Journal: building things I said I didn’t need, I said:

I’m not quite sure what I’ll do next. I am starting to hanker a little after doing some more work on toykv, finishing up the range-scan functionality I never got working well. Somehow, the browser tab holding “Programming Rust” is still open after not having written any Rust for over a year now. Perhaps it’s time to blow the dust off and get back to it.

Just as talking about tools and MCP servers inspired me to write Rapport’s MCP support, merely thinking about toykv again stirred up enthusiasm for it.

And so, during June, most of my home coding project hours were spent rewriting much of toykv:

  • I updated the on-disk format to use a block-based storage layer.
  • I finished up the scan function that allows reading a range of keys. Previously you could only read a single key. The new block-based format involved writing a bunch of iterators that simplified completing this work.

Overall, this made toykv more … real. More like a real database would be structured on disk (though still far from being production quality).
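To illustrate why iterators over sorted blocks make a range scan straightforward, here is a rough Python sketch. It is not toykv's code: the in-memory `Block` layout, the `scan` name, and the half-open `[start, end)` range are all assumptions made for the example.

```python
from bisect import bisect_right
from typing import Iterator

# Hypothetical block layout: a non-empty list of (key, value) pairs sorted
# by key, with the blocks themselves ordered by their first key.
Block = list[tuple[bytes, bytes]]


def scan(blocks: list[Block], start: bytes, end: bytes) -> Iterator[tuple[bytes, bytes]]:
    """Lazily yield every pair with start <= key < end."""
    first_keys = [block[0][0] for block in blocks]
    # Find the last block whose first key is <= start. Earlier blocks
    # cannot hold any key in the range, so we skip them entirely.
    i = max(bisect_right(first_keys, start) - 1, 0)
    for block in blocks[i:]:
        for key, value in block:
            if key >= end:
                return  # keys are sorted, so nothing later can match
            if key >= start:
                yield key, value


blocks = [
    [(b"a", b"1"), (b"c", b"2")],
    [(b"e", b"3"), (b"g", b"4")],
    [(b"i", b"5")],
]
result = list(scan(blocks, b"c", b"i"))
# result == [(b"c", b"2"), (b"e", b"3"), (b"g", b"4")]
```

Because `scan` is a generator, a caller can stop early without reading the remaining blocks, and a single-key read is just a scan over a one-key range.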

I also found myself having another try at using Zed, as there are now efforts to use the increased focus on Vim functionality to support Helix-style interaction.
