February Journal: building my own chatbot, and falling for python all over again
Three things came together to inspire me in early February:
- First, the idea that AI has reached a place where it’s easy to start building things that once seemed years away. This was inspired by “Our own agents with their own tools” over at Irrational Exuberance.
- Second, the fact that you can run reasonably competent models locally using Ollama. There are other tools, but the completely local thing really caught my imagination.
- Later, I also bought AI Engineering which filled in a lot of the gaps left by blog posts and a few papers. Highly recommended.
But this post isn’t about the complexities of AI. Instead, it is about the simple joy of falling back in love with Python while catching up with the rest of the world — the coding world, at least — on the uses of LLMs and AI.
I started, like almost everyone, by building a chatbot. But my own chatbot, and that’s what makes all the difference.
Building one’s own toolset
I have always loved building some of the tools I use. Mostly small things, like the day-to-day customisations that I’ve built into my Hammerspoon setup, such as hotkeys to move my windows around my screen. Building that with Hammerspoon’s primitives rather than using a pre-built application allowed me to create exactly what I needed.
While I’d been following along with what was happening in AI and LLMs in particular, I was mostly observing. It’s a fascinating area, and the tools that are being built were and are impressive. But my employer is fairly wary of AI tools and I’d not found a safe way to make use of them.
However, the idea of building something entirely locally really pushed my buttons, and so I finally got around to opening my editor to build … something. I wasn’t quite sure what to start with.
In the end, I built what a thousand people before me have most assuredly built, perhaps tens of thousands: my own basic chatbot. Frankly I was a bit disappointed in myself. It felt like I should be doing something more inventive. In the end, however, LLMs generate text, and it quickly became clear to me that the best first thing to do was to go with that flow. Once I understood that, I hoped other things would start to fall into place (and they did).
And so mikerhodes/rapport was born.
Rapport: (noun) a close and harmonious relationship in which the people or groups concerned understand each other’s feelings or ideas and communicate well.
I’m not sure yet whether you can have a real rapport with an LLM, but the word rings nicely.
In building Rapport, I also discovered:
- That python’s type system has matured massively since I last wrote Python in anger in 2017-ish. Writing typed python with the assistance of pyright and ruff has been a dream. I’ve fallen for python all over again (Go and Rust, I still love you guys too).
- Streamlit. Streamlit makes building the interface of a chat app trivial. From what I can tell, it started as a way to make it easy to build internal apps based on data, and was later acquired by Snowflake (a smart move for an analytical database maker). At some point chatbot apps came into its remit; I’ve seen several prototypes now that are clearly Streamlit. I’ve enjoyed learning it. Would use again (and again).
I’ve been super-excited to have found so much awesome in python in 2025 🐍🌟.
As noted, types are way more mature, and the tooling built around them just makes life better.
The last thing I wrote about Python was Python Packaging in 2020, when things were in a bit of a sorry state. The ecosystem has really sorted itself out since then. It’s not perfect, but all-in-one tools like uv, clearly inspired by cargo and rustup, are great.
I’d be very happy to be doing my day-to-day work in 2025’s python.
Graduating to the frontier
I originally made Rapport to make it easy to chat with multiple models, to help me get a deeper feel for what today’s models were capable of.
I learned a lot about what a small model (1B to 3B parameters) could do. What they were able to do, they could do quickly, even on my laptop. I found that 8B models were clearly smarter, and 14B models clearly another step ahead again. As an example, the code suggested by phi4, a 14B model, was vastly better than that from codegemma, a 7B model. But codegemma was way faster at generating it on my machine.
For a while I settled on phi4 running locally. It is about the most powerful model I can run on my MacBook Air with its 16GB of RAM. Deepseek’s 14B distillation of Qwen worked okay too. But in the end they only whetted my appetite for more: once I got a feeling for what they could do, I tried out more and more things, and hit the limitations of a local model. As an example, my experiments showed that local models appear to find it hard to do function calling while also holding a conversation.
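Part of what makes local experimentation so approachable is how little plumbing it needs. A hedged sketch of calling a local Ollama server (this assumes Ollama is running on its default port and that you’ve already pulled the model; `build_request` and `chat` are my own illustrative helpers, not Rapport’s code):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"


def build_request(model: str, messages: list[dict[str, str]]) -> dict:
    # Payload shape for Ollama's /api/chat endpoint; stream=False asks
    # for the whole reply in one response rather than chunks.
    return {"model": model, "messages": messages, "stream": False}


def chat(model: str, messages: list[dict[str, str]]) -> str:
    payload = json.dumps(build_request(model, messages)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["message"]["content"]


if __name__ == "__main__":
    print(chat("phi4", [{"role": "user", "content": "Why is the sky blue?"}]))
```

No SDK required, though Ollama does ship an official Python client if you’d rather not hand-roll the HTTP.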
I’d used Claude to help a little with Rapport: reviewing my code as I was learning Python typing, for example. It was clear that Claude was much smarter and much faster than the local models. So Rapport grew features:
- Talking to Anthropic’s APIs and IBM’s watsonx inference APIs (watsonx because I work at IBM, which makes it easier to connect to larger models while at work).
- Uploading files to the chat’s context. That was super-useful when learning the ins and outs of python’s types and asking for feedback on code.
- Chat history. Currently chats are dumped to JSON, but I want to use SQLite to add history search (and perhaps “summarise conversations on X”).
- Downloading a chat as Markdown. I have a vague idea of saving/summarising into files within my obsidian vault.
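To make the SQLite idea concrete, here’s a sketch of how history storage, naive search, and Markdown export might look (a hypothetical schema and helpers, not Rapport’s actual implementation, which currently just dumps chats to JSON):

```python
import json
import sqlite3


def init_db(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS chats "
        "(id INTEGER PRIMARY KEY, title TEXT, messages TEXT)"
    )


def save_chat(conn: sqlite3.Connection, title: str, messages: list[dict]) -> None:
    # Store the message list as a JSON blob, one row per chat.
    conn.execute(
        "INSERT INTO chats (title, messages) VALUES (?, ?)",
        (title, json.dumps(messages)),
    )


def search_chats(conn: sqlite3.Connection, term: str) -> list[str]:
    # Naive substring search over the JSON; FTS5 is the obvious upgrade.
    rows = conn.execute(
        "SELECT title FROM chats WHERE messages LIKE ?", (f"%{term}%",)
    )
    return [r[0] for r in rows]


def to_markdown(title: str, messages: list[dict]) -> str:
    # Render a chat as a Markdown document, one bolded turn per message.
    lines = [f"# {title}", ""]
    for m in messages:
        lines.append(f"**{m['role']}**: {m['content']}")
        lines.append("")
    return "\n".join(lines)
```

The nice thing about SQLite here is that search and “summarise conversations on X” both become simple queries over the same file on disk.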
Overall, this month of using my free time to work on Rapport has given me much more of a gut feel for ways to use LLMs than reading alone ever would have. I was right that building a chatbot would allow me to more easily get up to speed with working with LLMs and their APIs. It’s already allowed me to create a small internal tool at work to help the team with our switch to ClickHouse SQL-based log querying; something I just wouldn’t have thought of before as a possible solution, because I didn’t know it was so easy.
And that’s the thing that I’ve learned through this work: it really is easy to build small things around language models. There’s so much work already done for you now.
Python and Streamlit are excellent ways to build tiny, focussed chatbots like my “help with SQL” bot in days rather than weeks or months. I can imagine extending that bot to more use cases by working a prompt library into the UX, and allowing prompts to be added via GitHub pull requests.
And, of course, larger models like Llama 3.3 and Claude have so much existing knowledge of the world — particularly the coding world — that small use cases are often achievable with small efforts in prompt engineering.
As always, digging in and writing my own stuff pushes me forward. Pleasing.