Link: A 175-Billion-Parameter Goldfish

A GPT language model can only use about 4,000 tokens (roughly 3,000 words) of context when generating its next word. The resources required for generation grow steeply as this “context window” expands. Unlike a normal Google search, which takes milliseconds to run, generating a response in ChatGPT, for example, takes on the order of seconds. That’s expensive.

I hadn’t thought through the implications of the limited number of tokens a GPT large language model can use when generating its output. As Allen Pike explains in A 175-Billion-Parameter Goldfish, this has deep effects beyond the dollar cost.

So let’s say, oh I dunno, you want GPT to behave as a nice friendly search assistant called Bing. You might fill its initial context with 3,000 words’ worth of stuff like “Your responses should avoid being vague, controversial or off-topic.” Now, when a user types a question, you’ll tack it on after the default prompt, pass the whole thing in as context to GPT, and likely generate a reasonable answer.

However. As the user engages with the chatbot, generating answers and responding in kind, their conversation history also needs to be added to the context. But with the 3,000-word limit, the oldest part of the context can eventually no longer be fed in for successive prompts, falling out of the “context window”. In a naive implementation, this leads to GPT slowly forgetting everything that it was prompted with as your conversation gets longer. By the time you engage GPT in 3,000 words worth of arguing and passive aggression, it’ll want to join in, insisting that you are a bad user.
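To make the failure mode concrete, here’s a minimal Python sketch of that naive implementation. Everything in it is illustrative rather than a real API: actual models budget in tokens, not words, and the prompt, names and limit below are stand-ins.

```python
CONTEXT_LIMIT = 3000  # stand-in word budget; real models count tokens

SYSTEM_PROMPT = (
    "You are Bing, a friendly search assistant. "
    "Your responses should avoid being vague, controversial or off-topic."
)

def build_context(system_prompt: str, history: list[str]) -> str:
    """Naively concatenate the prompt and the conversation, newest
    last, then truncate from the front once the budget is blown."""
    words = (system_prompt + "\n" + "\n".join(history)).split()
    # The oldest words fall out first, eventually including the
    # system prompt itself: that's where the trouble starts.
    return " ".join(words[-CONTEXT_LIMIT:])

history: list[str] = []
for user_turn in ["Hi, I'm Alex.", "What's my name?"]:
    history.append("User: " + user_turn)
    context = build_context(SYSTEM_PROMPT, history)
    # ...send `context` to the model, append its reply to history...
```

Once the history alone exceeds the budget, every new turn pushes the front of the window forward, and the instructions (and, eventually, your name) silently disappear.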

This extremely truncated memory also screws up other obvious use-cases, such as wanting GPT to act as an assistant, because it’ll forget everything you tell it pretty quickly. An assistant needs a library of stored facts, but right now these AIs have nowhere to put them. Even your name will soon fall out of the AI’s world view.

There are a lot more use-cases where repeated memory loss causes problems, but Allen goes on to describe one where the loss of context — and the tendency to confidently hallucinate its own facts — may not matter. It’s a good read.
