More numbers on adverts vs. content

The New York Times has more information on the often awful balance between editorial content and adverts in The Cost of Mobile Ads on 50 News Websites.

The difference was easy to spot: many websites loaded faster and felt easier to use. Data is also expensive. We estimated that on an average American cell data plan, each megabyte downloaded over a cell network costs about a penny. Visiting the home page of every day for a month would cost the equivalent of about $9.50 in data usage just for the ads.

Content blocking

About three weeks ago I gave in: I turned on Firefox's Tracking Protection feature. Last week I installed 1Blocker on my iPhone. Until now, I'd avoided ad- and tracker-blocking software. I felt uncomfortable hiding that which provided sites' revenues. Looking under the hood at sites I regularly visit, however, I realise now that I've been a fool to hold out for so long.

I have two aims with both Tracking Protection and 1Blocker:

  1. Avoid being tracked online. Almost always this involves blocking adverts. In addition, it covers Facebook/Twitter/Google Like-type buttons.
  2. Make the web faster.

The main argument put forward against using blocking software, and the reason I've held out for so long, is that as a by-product of improving your web experience, you remove revenue from the sites you visit when you block the trackers and the adverts which, purportedly, fund them.

The story goes: by reading a webpage, you've agreed to the means the publisher has decided to use to derive money from that page. I've come to see that this is specious: the publisher's advertising and tracking software has executed long before I've read the article or even considered whether it's worth providing my data in order to enjoy it. If I have no choice, there is no agreement. Implied consent by visiting a URL isn't valid: it's not legal to make me agree to a contract whose content I have not been shown.

The first reason leads directly and obviously to the second. The images, tracking pixels, JavaScript files, movies and other cruft employed to advertise and track have swelled to epic proportions.

The savings speak for themselves. Here is the number of requests and the amount of data downloaded for two popular news sites, with Tracking Protection (TP) turned on and off:

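| Site      | HTML size | TP On               | TP Off              | Requests saved | Data saved |
|-----------|-----------|---------------------|---------------------|----------------|------------|
| Economist | 84kB      | 150 requests; 6MB   | 580 requests; 13MB  | 74%            | 53%        |
| Guardian  | 81kB      | 29 requests; 1.6MB  | 191 requests; 6MB   | 85%            | 73%        |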

In addition, with Tracking Protection turned off, both sites send ten or so requests per minute to various tracking services, presumably to gauge "engagement with content".

On a wifi connection on a laptop, the extra work doesn't matter too much. On a phone, however, the extra data and energy used are significant. It's worth fighting against to gain longer battery life and more useful data from my tiny 500MB data plan.
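
As a rough check, the percentages in the table can be recomputed from the rounded request counts and sizes quoted there, and the same figures show how far a 500MB plan stretches. The sketch below is in Java; the class and method names are made up for illustration, and it assumes only the initial page load counts, ignoring the ongoing tracking requests.

```java
// Rough check of the savings figures, using the rounded numbers quoted in the text.
// BlockingSavings and its methods are illustrative names, not part of any tool.
final class BlockingSavings {

    /** Percentage reduction going from the unblocked figure to the blocked one, rounded. */
    static long percentSaved(double unblocked, double blocked) {
        return Math.round((unblocked - blocked) / unblocked * 100.0);
    }

    /** Whole page loads that fit in a data plan, counting only the initial load. */
    static long loadsPerPlan(double planMb, double pageMb) {
        return (long) Math.floor(planMb / pageMb);
    }

    public static void main(String[] args) {
        System.out.println("Economist requests saved: " + percentSaved(580, 150) + "%");
        System.out.println("Economist data saved:     " + percentSaved(13.0, 6.0) + "%");
        System.out.println("Guardian requests saved:  " + percentSaved(191, 29) + "%");
        System.out.println("Guardian data saved:      " + percentSaved(6.0, 1.6) + "%");

        // How many Guardian home page loads fit in a 500MB plan?
        System.out.println("Guardian loads, TP off:   " + loadsPerPlan(500, 6.0));
        System.out.println("Guardian loads, TP on:    " + loadsPerPlan(500, 1.6));
    }
}
```

With these inputs the Guardian comes out at 85% fewer requests and 73% less data, matching the table, and a 500MB plan covers 312 home page loads with blocking against 83 without. The Economist data figure comes out at 54% rather than 53%, presumably because the table was worked out from unrounded sizes.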

Many pundits have suggested that Apple has some ulterior motive to "kill Google" or other such rubbish. To me it's clear that content blocking is aimed at improving battery life and increasing user satisfaction with iPhones and iPads. I suspect Apple would have shipped blocking by default but for the need to avoid being pilloried by publishers and advertisers.

Long ago publishers made a Faustian bargain with advertisers. Unlike Faust, the rewards ended up being fairly scant. Now both advertisers and publishers are reaping their rewards: this has gone too far and we readers fight back the only way we can.

I used to avoid ad-blocking software. I have changed my mind. I now realise the implied agreement readers made with publishers was torn up by publishers and advertisers long ago, if it ever existed at all. For readers, there are few cons to blocking tracking. Let the publishers and their advertisers figure their own way out of this tarpit of their own making.

I liken this to the publishers' thoughts of agreement above. If in their view I imply consent to advertising when I visit a page, I will view their sending data to me as implied consent that they are happy for me to block their tracking, advertising and other ne'er-do-wells.

I suggest you do too.

When and how to use a builder in Java

A few weeks ago, we switched an API in our synchronizing database for Android to use the builder pattern. You can see the implementation. The long and short is that our external API went from:

PullReplication pull = new PullReplication();
pull.source = /* remote database URL */;
pull.target = /* local database */;
Replicator pullReplicator = ReplicatorFactory.oneway(pull);

to the cleaner:

Replicator pullReplicator = ReplicatorBuilder.pull()
    .from(/* remote database URL */)
    .to(/* local database */)
    .build();

The primary gain was that we reduced the API's "surface area" significantly. We did this by going from having three classes whose names vaguely suggested their combined usage -- PullReplication, ReplicatorFactory and Replicator -- to two classes whose names spell out a clear relationship: ReplicatorBuilder and Replicator.

While builder is an established pattern, it's one that's novel in our codebase; we were reminded it existed and could make some things much easier to use. After this success, the pattern started popping up in a number of new pull requests as we experimented with it.

After a while, it became clear that often things were improved, but in other places we'd really been using the pattern for its own novelty. As we debated our use of the pattern, we found that we could use a few rules of thumb to quickly decide when to use it. We read the Gang of Four Design Patterns book, and it was helpful, but we found its rules a little abstruse. So we looked at the examples in our PRs and tried to draw out some patterns from the empirical evidence.

Our first observation was to recall that the builder pattern is designed to avoid long, unreadable method calls. For example, let's take a client object, used to access CouchDB. There are a number of options you can set beyond just access details, to do with the underlying connection and so on. So a constructor might end up looking like this:

Client c = new Client("http://localhost:5984", "mike", "secret", 
    10, true, 6, false);

To apply the builder pattern to our Client class, first we create a builder() method which takes the required options. For optional settings, we apply the builder pattern of setting the option and returning the builder itself, ready for chaining. Finally the user calls build() to indicate they're complete.

Having the builder() method take the required options is a nice way of enforcing required options within the API itself rather than throwing an exception at the time build() is called. However, obviously this is only usable for one or two settings, otherwise we end up with another huge method! In that case, throwing an exception within build() is the best approach. A RuntimeException is appropriate here, we felt.
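
A minimal sketch of how such a builder might be put together is below. Only builder(url), credentials() and build() come from the description above; the timeout option, its default and the checks inside build() are assumptions added for illustration, and this is not the real client class.

```java
import java.util.Objects;

// Hypothetical sketch of a builder-based Client. The timeout option and its
// default value are assumptions made for this example.
final class Client {
    private final String url;
    private final String username; // null means anonymous access
    private final String password;
    private final int timeoutSeconds;

    private Client(Builder b) {
        this.url = b.url;
        this.username = b.username;
        this.password = b.password;
        this.timeoutSeconds = b.timeoutSeconds;
    }

    /** Required settings are parameters here, so they cannot be forgotten. */
    static Builder builder(String url) {
        return new Builder(Objects.requireNonNull(url, "url"));
    }

    String url() { return url; }
    String username() { return username; }
    int timeoutSeconds() { return timeoutSeconds; }

    static final class Builder {
        private final String url;
        private String username;
        private String password;
        private int timeoutSeconds = 30; // assumed default

        private Builder(String url) {
            this.url = url;
        }

        /** Optional setting: store the values and return the builder for chaining. */
        Builder credentials(String username, String password) {
            this.username = username;
            this.password = password;
            return this;
        }

        Builder timeoutSeconds(int seconds) {
            this.timeoutSeconds = seconds;
            return this;
        }

        /** Validation that cannot be expressed in builder() happens here. */
        Client build() {
            if ((username == null) != (password == null)) {
                throw new IllegalStateException("username and password must be set together");
            }
            if (timeoutSeconds <= 0) {
                throw new IllegalStateException("timeout must be positive");
            }
            return new Client(this);
        }
    }
}
```

IllegalStateException is a RuntimeException, so callers are not forced to wrap build() in a try block, which keeps the chained call readable.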

At first glance, applying the builder pattern does improve our API a lot.

Client c = Client.builder("http://localhost:5984")
    .credentials("mike", "secret")
    .build();

Before saying we're done here, however, we remembered that there are other, more standard, ways to reduce the need for long and puzzling method calls. This brought us to our second realisation, which was that some objects require many settings to be set before they can be used, while for others most are optional.

For our client, most of the settings are optional; the client object can supply good defaults for most. Therefore the best way to get rid of the long constructors is simply to remove them and rely on setters for the optional configuration. This means that both we and the developers using the API only have a single class to write, maintain or use, without compromising code readability.

The common case is now simpler and more guessable than the builder can make it. We provided two constructors: one with just the URL and a second, convenience constructor with a username and password.

Client c = new Client("http://localhost:5984");
Client c = new Client("http://localhost:5984", "mike", "secret");

And the complicated case is still very easy to follow:

Client c = new Client("http://localhost:5984", "mike", "secret");
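
A minimal sketch of the constructor plus setters shape might look like the following. The two constructors match the ones described above; the setter names, the two optional settings and their defaults are assumptions for illustration only.

```java
// Hypothetical sketch of the constructor + setters shape. The two optional
// settings and their defaults are assumptions, not the post's real API.
final class Client {
    private final String url;
    private final String username; // null means anonymous access
    private final String password;
    private int connectionTimeoutSeconds = 30;  // assumed default
    private boolean compressionEnabled = false; // assumed default

    Client(String url) {
        this(url, null, null);
    }

    Client(String url, String username, String password) {
        if (url == null) {
            throw new IllegalArgumentException("url is required");
        }
        this.url = url;
        this.username = username;
        this.password = password;
    }

    void setConnectionTimeoutSeconds(int seconds) {
        if (seconds <= 0) {
            throw new IllegalArgumentException("timeout must be positive");
        }
        this.connectionTimeoutSeconds = seconds;
    }

    void setCompressionEnabled(boolean enabled) {
        this.compressionEnabled = enabled;
    }

    String getUrl() { return url; }
    int getConnectionTimeoutSeconds() { return connectionTimeoutSeconds; }
    boolean isCompressionEnabled() { return compressionEnabled; }

    public static void main(String[] args) {
        // Required values go in the constructor; everything else is opt-in.
        Client c = new Client("http://localhost:5984", "mike", "secret");
        c.setConnectionTimeoutSeconds(10);
        c.setCompressionEnabled(true);
        System.out.println(c.getUrl() + " timeout=" + c.getConnectionTimeoutSeconds());
    }
}
```

Each setter names the option it changes, so the call site stays readable without a second class to maintain.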

There is an obvious caveat to the use of setters: immutable objects don't have setters, and instead require all settings -- required or optional -- to be set when the object is constructed. While we can provide convenience methods for the common cases which provide the sensible defaults, a huge method must be lurking somewhere in the constructor chain. This also implies that we, as designers of the API, must be able to know in advance what convenience methods should be provided; often this is difficult or impossible.

Therefore, we figured that for immutable objects with more than two or three options, the builder pattern should be used in our codebases. While the builder may have to call into that cryptic constructor, our users can be spared it.

We ended up using this in our View API. A view request requires two options in the simplest case, but there are more than ten optional settings. Again, this example shows the pattern of providing the required settings in the newRequest() method to help the developer using the API set up a valid object. It could be argued that the arguments are a bit opaque, but we considered it worthwhile for the benefits that generics brought when strongly typing responses.

ViewRequest<String, String> viewRequest = 
        viewBuilder.newRequest(String.class, String.class)
        .build();
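
To make the immutable case concrete, here is a minimal sketch of an immutable, generically typed request whose long constructor is hidden behind a builder. SimpleViewRequest, its options and their defaults are hypothetical stand-ins and not the classes of our View API; for brevity newRequest() is a static method here rather than a method on a separate view builder object.

```java
// Hypothetical sketch: SimpleViewRequest is an illustrative name, and the
// optional settings (startKey, endKey, limit, descending) are assumptions.
final class SimpleViewRequest<K, V> {
    private final Class<K> keyType;
    private final Class<V> valueType;
    private final K startKey;
    private final K endKey;
    private final int limit; // -1 means no limit
    private final boolean descending;

    // The long, cryptic constructor still exists, but only the builder calls it.
    private SimpleViewRequest(Class<K> keyType, Class<V> valueType,
                              K startKey, K endKey, int limit, boolean descending) {
        this.keyType = keyType;
        this.valueType = valueType;
        this.startKey = startKey;
        this.endKey = endKey;
        this.limit = limit;
        this.descending = descending;
    }

    /** The two required settings, the key and value types, are supplied up front. */
    static <K, V> Builder<K, V> newRequest(Class<K> keyType, Class<V> valueType) {
        return new Builder<>(keyType, valueType);
    }

    Class<K> keyType() { return keyType; }
    Class<V> valueType() { return valueType; }
    K startKey() { return startKey; }
    K endKey() { return endKey; }
    int limit() { return limit; }
    boolean descending() { return descending; }

    static final class Builder<K, V> {
        private final Class<K> keyType;
        private final Class<V> valueType;
        private K startKey;
        private K endKey;
        private int limit = -1;
        private boolean descending = false;

        private Builder(Class<K> keyType, Class<V> valueType) {
            this.keyType = keyType;
            this.valueType = valueType;
        }

        // Because the builder is typed on K, keys of the wrong type fail to compile.
        Builder<K, V> startKey(K key) { this.startKey = key; return this; }
        Builder<K, V> endKey(K key) { this.endKey = key; return this; }
        Builder<K, V> limit(int limit) { this.limit = limit; return this; }
        Builder<K, V> descending(boolean descending) { this.descending = descending; return this; }

        SimpleViewRequest<K, V> build() {
            return new SimpleViewRequest<>(keyType, valueType, startKey, endKey, limit, descending);
        }
    }
}
```

The request has no setters and all its fields are final, so once build() returns it cannot change; the six-argument constructor is private and never appears in calling code.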

Our rules of thumb therefore ended up being split between mutable and immutable objects, then by the number of possible options:

  1. If the object is mutable:
    1. If only a few values must be set before the object is usable, use a constructor + setters approach.
    2. If many options must be set up before using the object, use a builder approach to avoid long, cryptic constructors.
  2. If the object is immutable:
    1. If it only has a few options in total, use a constructor.
    2. If it has many options -- regardless of the number of options which must be set before the object is used -- use a builder.
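
The rules above can be written down as a small decision function. This is only a sketch: the names are invented for the example, and the cut-off for "a few" is assumed to be three, following the "two or three options" mentioned earlier.

```java
// Sketch of the rules of thumb as code. ConstructionStyle, Style and FEW are
// illustrative names; the threshold of three is an assumption.
final class ConstructionStyle {

    enum Style { CONSTRUCTOR, CONSTRUCTOR_AND_SETTERS, BUILDER }

    /** Assumed meaning of "a few" options. */
    static final int FEW = 3;

    static Style choose(boolean mutable, int requiredOptions, int totalOptions) {
        if (mutable) {
            // Mutable: only the options that must be set up front matter.
            return requiredOptions <= FEW ? Style.CONSTRUCTOR_AND_SETTERS : Style.BUILDER;
        }
        // Immutable: every option has to pass through construction.
        return totalOptions <= FEW ? Style.CONSTRUCTOR : Style.BUILDER;
    }

    public static void main(String[] args) {
        // A mutable client: one required URL out of seven settings.
        System.out.println(choose(true, 1, 7));
        // An immutable view request: two required settings, more than ten optional.
        System.out.println(choose(false, 2, 12));
    }
}
```

For the two examples in this post, the mutable client lands on constructor plus setters and the immutable view request lands on a builder.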

We didn't find much guidance online, so hopefully these might help someone. Do file bugs if you disagree with our choices, or drop me an email directly.

Face-to-face communication as a crutch

A lot of times people talk about how face-to-face communication is high bandwidth, but let’s just say that in a lot of cases, that face-to-face communication can be a crutch. You can just throw bandwidth at the problem as opposed to actually using the bandwidth you have efficiently.

A thought-provoking point from an interview with Joe Mastey on the FogBugz blog.

Wi-Fi Sense seems intrusive

Microsoft's Wi-Fi Sense appears a bit scary for anyone running a wi-fi network. Once a user has joined your network and not opted out of sharing it, the network and its access details are sent to Microsoft for use by everyone in that user's contact list:

For networks you choose to share access [to all your Outlook, Skype or Facebook contact list] to, the password is sent over an encrypted connection and stored in an encrypted file on a Microsoft server, and then sent over a secure connection to your contacts' phone if they use Wi-Fi Sense and they're in range of the Wi-Fi network you shared. Your contacts don't get to see your password, and you don't get to see theirs.

To opt out of this, you either:

Am I over-reacting, or is this feature pretty odd? Even for users, it seems connecting to potentially hostile wi-fi networks automatically is a dangerous thing.