When and how to use a builder in Java

A few weeks ago, we switched an API in our synchronizing database for Android to use the builder pattern. You can see the implementation. The long and short is that our external API went from:

PullReplication pull = new PullReplication();
pull.source = /* remote database URL */;
pull.target = /* local database */;
Replicator pullReplicator = ReplicatorFactory.oneway(pull);

to the cleaner:

Replicator pullReplicator = ReplicatorBuilder.pull()
    .from(/* remote database URL */)
    .to(/* local database */)
    .build();

The primary gain was that we reduced the API’s “surface area” significantly. We did this by going from having three classes whose names vaguely suggested their combined usage – PullReplication, ReplicatorFactory and Replicator – to two classes whose names spell out a clear relationship: ReplicatorBuilder and Replicator.

While builder is an established pattern, it was new to our codebase; the change reminded us that it existed and could make some things much easier to use. After this success, the pattern started popping up in a number of new pull requests as we experimented with it.

After a while, it became clear that things were often improved, but that in other places we’d really been using the pattern for its novelty alone. As we debated our use of the pattern, we found that a few rules of thumb let us quickly decide when to use a builder and when not to. We read the Gang of Four Design Patterns book, and it was helpful, but we found its rules a little abstruse. So we looked at the examples in our PRs and tried to draw out some patterns from the empirical evidence.

Our first observation was to recall that the builder pattern is designed to avoid long, unreadable method calls. For example, let’s take a client object, used to access CouchDB. There are a number of options you can set beyond just access details, to do with the underlying connection and so on. So a constructor might end up looking like this:

Client c = new Client("http://localhost:5984", "mike", "secret", 
    10, true, 6, false);

To apply the builder pattern to our Client class, first we create a builder() method which takes the required options. For optional settings, we apply the builder pattern of setting the option and returning the builder itself, ready for chaining. Finally the user calls build() to indicate they’re complete.

Having the builder() method take the required options is a nice way of enforcing them within the API itself, rather than throwing an exception at the time build() is called. However, this only works for one or two settings; otherwise we end up with another huge method! In that case, throwing an exception within build() is the best approach. We felt a RuntimeException is appropriate here.

At first glance, applying the builder pattern does improve our API a lot.

Client c = Client.builder("http://localhost:5984")
    .credentials("mike", "secret")
    .socketTimeout(10)
    .login(true)
    .maxConnectionsPerHost(6)
    .dogsAreBetterThanCats(false)
    .build();
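To make the mechanics concrete, here is a minimal sketch of how such a builder might be put together. It is not our actual implementation: the field names, the default values and the validation in build() are assumptions chosen for illustration, and only a few of the optional settings are shown.

```java
// Hypothetical sketch; defaults and validation rules are assumptions.
final class Client {
    private final String url;
    private final String username;
    private final String password;
    private final int socketTimeout;
    private final int maxConnectionsPerHost;

    private Client(Builder b) {
        this.url = b.url;
        this.username = b.username;
        this.password = b.password;
        this.socketTimeout = b.socketTimeout;
        this.maxConnectionsPerHost = b.maxConnectionsPerHost;
    }

    // The single required setting is a parameter, so callers cannot omit it.
    static Builder builder(String url) {
        return new Builder(url);
    }

    String getUrl() { return url; }
    String getUsername() { return username; }
    int getSocketTimeout() { return socketTimeout; }
    int getMaxConnectionsPerHost() { return maxConnectionsPerHost; }

    static final class Builder {
        private final String url;
        private String username;
        private String password;
        private int socketTimeout = 30;        // assumed default
        private int maxConnectionsPerHost = 4; // assumed default

        private Builder(String url) {
            this.url = url;
        }

        // Each optional setter records the value and returns the builder for chaining.
        Builder credentials(String username, String password) {
            this.username = username;
            this.password = password;
            return this;
        }

        Builder socketTimeout(int seconds) {
            this.socketTimeout = seconds;
            return this;
        }

        Builder maxConnectionsPerHost(int max) {
            this.maxConnectionsPerHost = max;
            return this;
        }

        // Checks that a method signature cannot express are made here,
        // using an unchecked exception.
        Client build() {
            if (url == null || url.isEmpty()) {
                throw new IllegalStateException("url is required");
            }
            if (socketTimeout <= 0) {
                throw new IllegalStateException("socketTimeout must be positive");
            }
            return new Client(this);
        }
    }
}
```

IllegalStateException is used here as one possible RuntimeException; any unchecked exception would serve the same purpose.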

Before saying we’re done here, however, we remembered that there are other, more standard, ways to reduce the need for long and puzzling method calls. This brought us to our second realisation, which was that some objects require many settings to be set before they can be used, while for others most are optional.

For our client, most of the settings are optional; the client object can supply good defaults for most. Therefore the best way to get rid of the long constructor is simply to remove it: keep only the required settings in the constructor and rely on setters for the optional configuration. This means that both we and the developer using the API have only a single class to write, maintain or use, without compromising code readability.

The common case is now simpler and more guessable than the builder can make it. We provided two constructors: one with just the URL, and a second convenience constructor that also takes a username and password.

Client c = new Client("http://localhost:5984");
Client authenticated = new Client("http://localhost:5984", "mike", "secret");

And the complicated case is still very easy to follow:

Client c = new Client("http://localhost:5984", "mike", "secret");
c.setSocketTimeout(10);
c.doLogin(true);
c.setMaxConnectionsPerHost(6);
c.setDogsAreBetterThanCats(false);
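For comparison, the class behind this constructor-plus-setters approach might look something like the sketch below. The default values are assumptions, and only two of the optional settings are shown.

```java
// Hypothetical sketch; default values are assumptions.
final class Client {
    // Required or identity-like settings are fixed at construction time.
    private final String url;
    private final String username;
    private final String password;

    // Optional settings start with sensible defaults and can be changed later.
    private int socketTimeout = 30;
    private int maxConnectionsPerHost = 4;

    // Common case: only the URL is needed.
    Client(String url) {
        this(url, null, null);
    }

    // Convenience constructor for the authenticated case.
    Client(String url, String username, String password) {
        if (url == null || url.isEmpty()) {
            throw new IllegalArgumentException("url is required");
        }
        this.url = url;
        this.username = username;
        this.password = password;
    }

    void setSocketTimeout(int seconds) { this.socketTimeout = seconds; }
    void setMaxConnectionsPerHost(int max) { this.maxConnectionsPerHost = max; }

    String getUrl() { return url; }
    String getUsername() { return username; }
    int getSocketTimeout() { return socketTimeout; }
    int getMaxConnectionsPerHost() { return maxConnectionsPerHost; }
}
```

There is one class to maintain, and the constructors only ever grow when a new required setting appears.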

There is an obvious caveat to the use of setters: immutable objects don’t have setters, and instead require all settings – required or optional – to be set when the object is constructed. While we can provide convenience methods for the common cases which provide the sensible defaults, a huge method must be lurking somewhere in the constructor chain. This also implies that we, as designers of the API, must be able to know in advance what convenience methods should be provided; often this is difficult or impossible.

Therefore, we figured that for immutable objects with more than two or three options, the builder pattern should be used in our codebases. While the builder may have to call into that cryptic constructor, our users can be spared it.
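As a rough illustration of that last point, the sketch below keeps the long positional constructor but makes it private, so that only the builder ever calls it. The class name ImmutableClient and the default values are hypothetical; the settings mirror the long Client constructor shown earlier.

```java
// Hypothetical sketch; names and default values are assumptions.
final class ImmutableClient {
    private final String url;
    private final String username;
    private final String password;
    private final int socketTimeout;
    private final boolean login;
    private final int maxConnectionsPerHost;

    // The long positional constructor still exists, but it is private:
    // only the builder ever has to call it.
    private ImmutableClient(String url, String username, String password,
                            int socketTimeout, boolean login,
                            int maxConnectionsPerHost) {
        this.url = url;
        this.username = username;
        this.password = password;
        this.socketTimeout = socketTimeout;
        this.login = login;
        this.maxConnectionsPerHost = maxConnectionsPerHost;
    }

    static Builder builder(String url) {
        return new Builder(url);
    }

    // Getters only: an instance cannot change after build().
    String getUrl() { return url; }
    int getSocketTimeout() { return socketTimeout; }
    boolean isLogin() { return login; }
    int getMaxConnectionsPerHost() { return maxConnectionsPerHost; }

    static final class Builder {
        private final String url;
        private String username;
        private String password;
        private int socketTimeout = 30;        // assumed default
        private boolean login = false;         // assumed default
        private int maxConnectionsPerHost = 4; // assumed default

        private Builder(String url) { this.url = url; }

        Builder credentials(String username, String password) {
            this.username = username;
            this.password = password;
            return this;
        }

        Builder socketTimeout(int seconds) { this.socketTimeout = seconds; return this; }
        Builder login(boolean login) { this.login = login; return this; }
        Builder maxConnectionsPerHost(int max) { this.maxConnectionsPerHost = max; return this; }

        // The cryptic call lives in exactly one place.
        ImmutableClient build() {
            return new ImmutableClient(url, username, password,
                    socketTimeout, login, maxConnectionsPerHost);
        }
    }
}
```

Because the builder copies its values into final fields, later changes to the builder do not affect instances that have already been built.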

We ended up using this in our View API. A view request requires two options in the simplest case, but there are more than ten optional settings. Again, this example shows the pattern of providing the required settings in the newRequest() method to help the developer using the API set up a valid object. It could be argued that the Class arguments to newRequest() are a bit opaque, but we considered it worthwhile for the benefits that generics brought when strongly typing responses.

ViewRequest<String, String> viewRequest =
    viewBuilder.newRequest(String.class, String.class)
        .skip(1L)
        .limit(25L)
        .startKey("dog")
        .startKeyDocId("fido")
        .endKey("elephant")
        .endKeyDocId("dumbo")
        .descending(false)
        .group(true)
        .groupLevel(3L)
        .includeDocs(true)
        .inclusiveEnd(false)
        .reduce(true)
        .buildRequest();
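To show what the Class arguments buy us, here is a heavily simplified sketch of a generically typed request builder. It is not the library’s actual code: the class bodies are hypothetical, a static newRequest() stands in for the viewBuilder instance used above, only a handful of options are included, and castValue() is an invented helper that stands in for response decoding.

```java
// Hypothetical sketch, not the library's actual classes.
final class ViewRequest<K, V> {
    final Class<K> keyType;
    final Class<V> valueType;
    final K startKey;
    final K endKey;
    final long limit; // -1 means "no limit" (assumed convention)
    final boolean descending;

    ViewRequest(Class<K> keyType, Class<V> valueType,
                K startKey, K endKey, long limit, boolean descending) {
        this.keyType = keyType;
        this.valueType = valueType;
        this.startKey = startKey;
        this.endKey = endKey;
        this.limit = limit;
        this.descending = descending;
    }

    // A raw value from a response can be checked against the declared type.
    V castValue(Object raw) {
        return valueType.cast(raw);
    }
}

final class ViewRequestBuilder<K, V> {
    private final Class<K> keyType;
    private final Class<V> valueType;
    private K startKey;
    private K endKey;
    private long limit = -1;
    private boolean descending;

    private ViewRequestBuilder(Class<K> keyType, Class<V> valueType) {
        this.keyType = keyType;
        this.valueType = valueType;
    }

    // The two required settings fix K and V for the rest of the chain.
    static <K, V> ViewRequestBuilder<K, V> newRequest(Class<K> keyType, Class<V> valueType) {
        return new ViewRequestBuilder<>(keyType, valueType);
    }

    // Keys are typed as K, so passing the wrong key type fails at compile time.
    ViewRequestBuilder<K, V> startKey(K key) { this.startKey = key; return this; }
    ViewRequestBuilder<K, V> endKey(K key) { this.endKey = key; return this; }
    ViewRequestBuilder<K, V> limit(long limit) { this.limit = limit; return this; }
    ViewRequestBuilder<K, V> descending(boolean d) { this.descending = d; return this; }

    ViewRequest<K, V> buildRequest() {
        return new ViewRequest<>(keyType, valueType, startKey, endKey, limit, descending);
    }
}
```

With newRequest(String.class, Integer.class), a call such as startKey(42) would be rejected by the compiler, which is the kind of checking that made the extra arguments worthwhile for us.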

Our rules of thumb therefore ended up being split between mutable and immutable objects, then by the number of possible options:

If the object is mutable:
1. If only a few values must be set before the object is usable, use a constructor + setters approach.
2. If many options must be set up before using the object, use a builder approach to avoid long, cryptic constructors.
If the object is immutable:
1. If it only has a few options in total, use a constructor.
2. If it has many options – regardless of the number of options which must be set before the object is used – use a builder.

We didn’t find much guidance online, so hopefully these might help someone. Do file bugs if you disagree with our choices, or drop me an email directly.