Links: September 2023

The challenges of building things with language models, problems with peer reviewing of scientific papers, MongoDB’s new query execution engine and comparing operational automation with Iron Man.

Enjoy.

  • Squish Meets Structure: Designing with Language Models

    I really enjoyed Maggie Appleton’s talk about designing with large language models. It’s a fascinating discussion of how we are learning to work with these new tools, which behave quite differently from the software we are used to.

    We’re trying to make an unpredictable and opaque system adhere to our rigid expectations for how computers behave. We currently have a mismatch between our old and new mental models for computer systems.

    And the slides are somehow beautiful.

  • The rise and fall of peer review - by Adam Mastroianni

    Peer review in the sciences has come in for a bit of a bludgeoning recently. This piece argues that it’s adding little value and that, if anything, peer review is actively hostile to new ideas and serves mostly to entrench existing hierarchies (and a large publishing industry).

    Can we do better?

  • Inside New Query Engine of MongoDB | Nikita Lapkov

    In my day job at Cloudant, I think a lot about how we could make our database better. I enjoy any and all deep dives into how other systems work. I find it helps create a library of patterns that I can later match against as I dig into problems.

    Here we learn how MongoDB created an idea called slots, which they used to significantly improve the efficiency of moving data values through their query execution pipeline.

  • Automation Should Be Like Iron Man, Not Ultron - ACM Queue

    Also at Cloudant, we’ve found that the best kind of automation is the kind that increases and magnifies the abilities of our operators. It allows us to manage more and more machines, while also increasing the flexibility we have in managing load across the system. Because it magnifies our abilities instead of replacing them, it allows us to maintain our understanding of the system. That understanding proves very useful when things go wrong in ways the automation does not cover.

    This article sits well with this experience, and I can see myself referring to it over time to explain our approach.

Mini-project: mdanchored

For a few years now, I’ve been kind of telling myself, on and off, “you should really learn Rust you know, it looks like it has legs and could be rather useful”. It’s the same way that you tell yourself, as you get older, in an absent-minded kind of way, “You should really settle down now” or “Really! By your age you should have acquired a taste for dark chocolate”. Basically: “do something serious and grown up”. And, while I did settle down, and get to like dark chocolate, I didn’t get around to writing any Rust.

But sometimes you have to make your own opportunities, and I had an idea for a tiny application, and instead of reaching for Go or Python, I reached for Rust.

It was fun, and I wish I’d done it sooner.

(Is Rust “grown up”? I’m not sure, but over time I’ve learned that I like learning the lower-level details of the systems I work with, and Rust seems like a good next step in that process.)

Read More…

Journal: British spellings in ltex-ls and Helix

While I wait for Helix to gain native spell-checking, I installed ltex-ls. This is a language server for the open source LanguageTool, which checks both spelling and grammar. By default, however, ltex-ls uses American English. As a Brit, I quickly became irritated by the accusatory little dots in the editor’s gutter, there because I had the temerity to use “u” in “favourite”.

Fortunately, ltex-ls supports a lot of languages. However, I wasn’t sure how to set en-GB as my language in the Helix configuration. In the end, I went to the languages.toml file in the Helix source code. This file is built into Helix, and sets its (programming) language defaults. It is a gold mine for understanding what’s possible in your own languages.toml file.

After poking about for a while, I figured it out. I’d read the Helix configuration documentation for language servers as saying that the config field in a language definition was usable only for passing formatting information to the language server. But that isn’t the case at all.

To set the language used for spell-checking, in your ~/.config/helix/languages.toml file (create it if it doesn’t exist), add the ltex.language option in the config block like this:

[[language]]
name = "markdown"
language-server = { command = "/Users/mike/bin/ltex-ls/bin/ltex-ls" }
config = { ltex.language = "en-GB" }

And there you have it 🌟

New Hires: Learn how the system breaks

I hope I remember this advice when I next start a new role:

Onboarding into mature systems often can be an extremely daunting, many month (or year) process. This is true for both managers and [engineers]. Failure streams are a short circuit to understanding the system, because failures are where the system is interesting and nuanced. Failures are where the heart of complexity, entropy, and flux in the system are.

Understanding failure modes helps you understand a system more quickly while simultaneously revealing high impact areas to work in. What a great piece of advice on how to start building trust with a new team.

Journal: Helix, Kitty, ClickHouse

I thought I’d take a stab at quick-fire journal entries.

First, we return to Helix and take a look at Kitty.

Then, a note about ClickHouse. Getting hands-on with ClickHouse has helped me understand in practice what I previously only understood in theory (column-orientated datastores). In doing so, it’s expanded my sense of what’s possible.

Read More…