Magic Seven and Code Understanding

Two concepts, one from computing lore and the other from psychology, should be discussed together: code readability (how easily code can be understood) and the magic number seven. This article is about both, and about how almost every coding practice you have ever read about can be seen through the lens of the magic number seven.

The magic number seven refers to the amount of information which a person can store in short-term memory, their immediately available working-set of data. This can be thought of as the number of slots in short-term memory into which things can be temporarily stored.

There are a couple of notes about this. Firstly, it is not absolute; some people seem to be able to store more items, some fewer. It is often written as 7±2 to denote the usual range people can store. Secondly, seven slots do not mean only seven items, but seven “chunks”: if one can group things together in some way (into a chunk), then this chunk may occupy only a single slot even though it contains multiple entities.

As I’ve mentioned before, psychological research can provide hard data to back up anecdotal programming lore about better ways to write code. The magic number seven can be very useful when deciding whether to apply a given piece of advice.

For programming, I think of each slot in short-term memory as containing a chunk or item of context, each of which is required to understand what I am working on at that time.

An item of context can be many different things:

	- What the purpose of the class or method I am writing is.
	- What each of the methods I am using does, and how they interact.
	- How objects around this one expect it to work, and their likely usage patterns.
	- Each piece of data I am working with.
	- A level of nesting. For example, a conditional if-statement adds some context: "Only do these things when X is true".
	- and so on: anything required to understand what a given piece of code does.

The fewer items of context required to understand what a piece of code does, the easier it will be to understand. If it is possible to fit all the context into short-term memory, it is easy to conceptualise and model the problem internally. If not, you must swap bits of context in and out of short-term memory, meaning you can never see the problem from all sides at once.

At first it may seem hard to see, but this rule of thumb applies at all levels of comprehension of a program, from inside a method to the overall program architecture.

To see this, take the problem of understanding a sizable piece of code, perhaps a whole program or sub-system. If you think for a short while, it should become clear that the main problem is working out how to chunk the code so that it will all fit into short-term memory at once.

Say you have some code to alter, perhaps to add a small feature or fix a bug, and this code is new to you. Rather than just hack in a solution, you want to understand a little of what is going on around the location of your change so you don’t inadvertently break something.

At first the code appears to be a disjointed set of methods, and it is impossible to fit all these in mind at once. As you read and think about the code, patterns begin to emerge which help in chunking it: these methods work together, this method does this task and so on.

Each pattern helps to produce an abstraction of a portion of code, and each abstraction can be used as a chunk, thereby being a single item of context in other places. After a short while, the abstractions allow you to fit enough into each chunk that you can hold in short-term memory all the code you need to bear in mind while implementing your fix. Colloquially, you have a feel for the code; you can hold some representation of it in your short-term memory, close at hand.

Abstractions can be built into higher level abstractions recursively, eventually meaning the entire working of the code can be represented by few enough abstractions that you can hold a view of the entire piece of code in short-term memory and reason about what it does. As we have seen, these abstractions build up from the lowest levels of code.

As you move around the code and dive deeper into certain portions, lower-level abstractions are drawn into short-term memory to give you the context you need at that point in the code. Higher-level abstractions can be discarded for a time as you concentrate on the detail.

This is why well-chosen abstractions help at all levels of the code. They reduce the amount of context that has to be held at any time, and so increase the likelihood that it can all fit into short-term memory at once.
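
As a minimal sketch of this, assume a hypothetical order-processing task (every name below is invented for illustration). The top-level function reads as three chunks, each of which can be held as a single item of context until you need to dive into it:

```python
from dataclasses import dataclass


@dataclass
class Order:
    # Hypothetical record, used only for this illustration.
    customer: str
    quantity: int
    unit_price: float


def validate(order: Order) -> None:
    """Chunk 1: reject orders that make no sense."""
    if order.quantity <= 0:
        raise ValueError("quantity must be positive")
    if order.unit_price < 0:
        raise ValueError("unit price must not be negative")


def total_price(order: Order) -> float:
    """Chunk 2: work out what the order costs."""
    return order.quantity * order.unit_price


def format_receipt(order: Order, total: float) -> str:
    """Chunk 3: present the result to the customer."""
    return f"{order.customer}: {order.quantity} item(s), total {total:.2f}"


def process(order: Order) -> str:
    # The reader holds three named chunks here, not every line of detail.
    validate(order)
    total = total_price(order)
    return format_receipt(order, total)
```

When reading `process`, the bodies of the three helpers can stay out of mind; when fixing a bug in `validate`, the other two chunks can be dropped for a while.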

Each use of a piece of advice about managing complexity should be evaluated with this in mind: does it reduce the amount of context needed to understand a given piece of code? If so, it is useful; if not, discard it.

Here are a few examples:

	- Use few levels of nested statements: fewer conditions to keep track of for a piece of code means less context.
	- Create consistent levels of abstraction in object interfaces (or: an object should have only one job): you need only keep one level of context in mind, rather than juggling several, when reasoning about the object.
	- Hide implementation details: implementation is another item of context over and above the job of the object. Make sure this information isn't needed to use the object, which reduces the context needed to interact with it.
	- Introduce componentisation: another example of a higher level of abstraction. In an ideal world, an entire component is just one item of context, no matter how much code it hides.
	- Avoid threads; they are hard: you add several new dimensions of context when using threads because any aspect of the object's state could be changed whilst your code is executing; your code is no longer alone.
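
To illustrate the first point in the list above, here is a small sketch using a hypothetical shipping-cost rule (the function names and the pricing are assumptions made up for this example). Both functions behave the same way, but the second uses early returns so that each condition can be dealt with and then forgotten:

```python
def shipping_cost_nested(weight_kg, destination, is_member):
    # Three levels of nesting: by the innermost lines the reader is
    # holding three conditions in mind at once.
    if weight_kg > 0:
        if destination is not None:
            if is_member:
                return 0.0
            else:
                return 5.0 + 1.5 * weight_kg
        else:
            raise ValueError("destination is required")
    else:
        raise ValueError("weight must be positive")


def shipping_cost_flat(weight_kg, destination, is_member):
    # Guard clauses: each check is finished before the next one starts,
    # so it no longer occupies a slot of context further down.
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    if destination is None:
        raise ValueError("destination is required")
    if is_member:
        return 0.0
    return 5.0 + 1.5 * weight_kg
```

Whether the flat version is better in a given case should still be judged by the same test: does it reduce the context a reader must hold?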

Everything you can do to make your code more understandable boils down to reducing the amount someone needs to hold in mind to understand it. Thinking from this perspective gives a more holistic approach than trying to implement many separate pieces of advice. A holistic approach has the further advantage of making a code base more consistent, which reduces a little further the number of different sets of context needed when working with it.

Most coding advice should be seen as a specific example of pandering to the magic seven items, and each use of it should be evaluated in terms of the context it removes or introduces. Keep each level of code as free from context as possible, and your code will feel leaner, fresher and easier to modify.