Late May Journal: building things I said I didn’t need

I spent my “projects time” in the latter half of May working on my AI apps: mostly Rapport and codeexplorer, plus a bit on my other ai-toys.

First off, after saying that I wasn’t sure whether it was worth adding tool support to Rapport, I ended up going all the way and adding support for connecting to local MCP servers.
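
For a sense of what that involves, here’s a minimal sketch of talking to a local MCP server over stdio using the official Python SDK. The server command is a placeholder, and Rapport’s real integration (wiring tool results back into the chat loop) does considerably more:

```python
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Hypothetical local server launched over stdio; swap in a real command.
server_params = StdioServerParameters(command="python", args=["my_mcp_server.py"])

async def main():
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print([t.name for t in tools.tools])  # these get advertised to the model
            # A chat loop would invoke a tool when the model requests one:
            # result = await session.call_tool("some_tool", arguments={"x": 1})

asyncio.run(main())
```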

Second, I decided that codeexplorer deserved to graduate its own repository. It felt like it had outgrown being part of ai-toys.

Finally, I wrote a Streamlit UI around the image generation capabilities of GPT-4o. No more “you must wait 15 hours to make more pictures of gigantic anime cats” error messages in ChatGPT for me!
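
The core of such an app is tiny. Here’s a minimal sketch of the idea (not the app’s actual code), assuming the gpt-image-1 model, the Images API’s counterpart to 4o image generation:

```python
import base64

import streamlit as st
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

st.title("Picture maker")
prompt = st.text_input("Prompt", "a gigantic anime cat looming over a city")

if st.button("Generate") and prompt:
    with st.spinner("Generating…"):
        result = client.images.generate(
            model="gpt-image-1",  # assumption: the API model for 4o-style images
            prompt=prompt,
            size="1024x1024",
        )
    # gpt-image-1 returns base64-encoded image data
    st.image(base64.b64decode(result.data[0].b64_json))
```

Run it with `streamlit run app.py` and you have a picture box limited only by your API budget rather than ChatGPT’s rate limits.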

Read More…

Link: All People Are Created Educable, a Vital Oft-Forgotten Tenet of Modern Democracy

In All People Are Created Educable, a Vital Oft-Forgotten Tenet of Modern Democracy, we are reminded of the importance of education in the functioning of democracy:

Many shocking, new ideas shaped the American Experiment and related 18th century democratic ventures; as an historian of the period, I often notice that one of the most fundamental of them, and most shocking to a world which had so long assumed the opposite, often goes unmentioned — indeed sometimes denied — in today’s discussions of democracy: the belief that all people are educable.

Read More…

April & early May: AI, but perhaps to saturation

Again, most of my spare time was dedicated to AI learning and experimenting:

  • I continued reading and coding from Build a Large Language Model (From Scratch).
  • I updated Rapport, my home-brewed chatbot app, a few times. I added a couple of major features and several quality-of-life improvements. I really like it now.
  • I did some further work on my codeexplorer ai-toy. Now you can ask a model to edit code, and I added support for providers beyond just Anthropic. In my experiments, though, Claude is still one of the best models for code editing.

Read More…

“Hello, I am Featureiman Byeswickattribute argue”

Thus concludes Chapter 4 of Build a Large Language Model (From Scratch). Coding along with the book’s examples, I now have an untrained version of GPT-2, the model OpenAI first released in 2019. When fed “Hello, I am” as a prompt, the untrained model outputs gibberish. This post’s title is taken from that gibberish.
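
The book builds its own GPTModel class, but you can reproduce the effect with Hugging Face transformers by instantiating the GPT-2 architecture without loading the pretrained weights. A sketch, not the book’s code:

```python
import torch
from transformers import GPT2Config, GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel(GPT2Config())  # random weights: the architecture, zero training

torch.manual_seed(123)
inputs = tokenizer("Hello, I am", return_tensors="pt")
out = model.generate(
    **inputs,
    max_new_tokens=8,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(out[0]))  # "Featureiman Byeswickattribute argue"-grade nonsense
```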

Next comes Chapter 5, which will cover the training that will take us from gibberish to intelligible text. But for this post, I wanted to take the time to capture my thoughts at this point in the book.

Rather than explaining concepts that others have covered better, I’ll share my stream of consciousness about how fascinating and weird it is that this stuff works at all.

Read More…

Build an LLM (from scratch): pt1

Two weeks in and I’ve got through about three and a half chapters of Build a Large Language Model (From Scratch). As I suspected, it’s a much more time-consuming — frankly, just harder — read than AI Engineering was. I’ve spent about an hour each night with both the book and a collection of background reading. While challenging, it’s been really fun getting properly into this topic. I look forward to my daily hour of struggle!

I’ve written up a few brief thoughts on what I’ve read so far.

Read More…