toykv updates

Since my last post, I’ve committed a couple of updates to toykv. The first is a nice functionality update, the second is enabled by my learning a little more rust.

  • Implement delete · mikerhodes/toykv@2325ff1

    With this update, toykv gains a delete(key) method. I did this in essentiallly the way I outlined previously: by adding a KVValue enum that holds either a value or Deleted, then threading this through the library, including serialisation.

  • Use generics with Into<Vec> for API ergonomics · mikerhodes/toykv@53dc863

    This is a nice update that reduces the amount of boilerplate, especially in tests.

    Previously there was quite a bit of get("foo".as_bytes().to_vec()) code, which has now been reduced to get("foo") by making the get, set and delete methods generic.

    I think this can further be improved using Cow. I think that would avoid unneeded cloning of data. But that is for later.

Obviously the library is still a learning aide, but it’s getting closer to at least having the functionality you would want in a real storage layer.

Writing bits to disk: toykv

In the early part of my experiments in writing a very simple document database, I thought that writing the bits to disk would be too difficult for me. And perhaps it was, back then. But I think things have changed.

As I wrote more of the document database, I got more experience transforming JSON data into key and value byte arrays that could be saved, retrieved and indexed. And as I got more comfortable with the tiny data formats I was creating, particularly for the indexes, I began to think, “why not, I bet I could write the code that organises these on disk too”.

Of course, just as my document database is nothing like ElasticSearch, my on disk formats would be nothing like so advanced as sled, the key-value storage layer I was using in my document database.

(I still think that I’d need a bunch more practice in this space to be able to write something production ready; but I am starting to think that I need not be a Better Programmer, but instead merely More Experienced In This Subject. Perhaps, in the end, they mean the same thing).

But I thought it would have a lot of value, to me at least. Because I believed I’d learn a lot, even writing the simplest, silliest key-value storage engine. And I think I have, and to try to cement that knowledge into my head, I’m going to write a whistlestop tour of that code into this post.

Read More…

S3 Express One architecture deep dive

ToyKV now has code and structure that is almost tidy enough to write some more about.

In the meantime, I found this deep dive into some of the architecture designs of S3’s new low-latency storage a good read:

Recently, Amazon released their S3 Express One Zone Storage Class, advertised as the “fastest cloud object storage for performance-critical applications.” In this post, I’ll highlight three key features of Express One and share some guesses regarding their implications on S3’s internal architecture.

S3 Express One, Value-Less LSM Trees, ShardStore.

ToyKV teaser

Here’s a teaser of my followup to the build your own database index series. It’s the most minimal key value store you can write, and is extremely not useful. It does almost nothing you’d want an actual place you put data to do:

  • Doesn’t bother checksumming any data. Bit flips for the win!
  • Opens files every time it reads or writes them. Every. Single. Time.
  • Makes no attempt at all to compress data. Keys and values are just streamed to disk in all their bloated glory.
  • It’s an LSM “tree” with just one branch of SSTables.
  • And those SSTables are never compacted, and eventually you run out of them.

And it does none of those things so I have a chance to explain it in a reasonable number of words. For that purpose I think I’ve been successful. With a bit of code cleanup, it should be a decent pedagogical tool.

Here’s the main library code to give an idea of just how simple it is:

You can see how little data safety matters by (a) the complete lack of error handling, and (b) that the whole database has only seven tests. Put on your hard hats, and prepare for falling rocks 👷

But even with its general awfulness, I’m still excited about having written something that can store and read back stuff in a way that’s not completely trivial. More importantly, writing it has given me the foundation for learning the ways to write a better LSM tree data store.

While you wait for me to write about ToyKV, you can find the code at mikerhodes/toykv.

BYODI 5: plan, optimise, execute 🚀

Welcome to part 5 of the build your own database index series! This was supposed to be about finding documents, but I kind of got ahead of myself in the code, so we will skim over some of the details of finding documents and instead talk about moving from a naive query execution to a (very slightly) more sophisticated two-stage process:

  1. The query planner takes the caller’s query and creates physical operations from it. Physical operations in this case are index scans — instructions to return document IDs between a start and end key in the index.
  2. The query executor takes the list of index scans produced by the planner and executes them, creating a final list of document IDs by intersecting the results from each scan.

The original search_index method mangled these two stages together. It also performed no pre-processing of the query, and naively executed each predicate in the query exactly as they were passed in.

We can do better than that. Once we have extracted the planning and execution phases into their own methods, we’ll look at performing a simple optimisation of the index scans we do when performing a query.

Read More…