Interacting via the Keyboard, and why it may be Firefox’s Biggest Differentiator

There are many ways Firefox could improve how you interact with it using the keyboard. This is a major way Firefox should differentiate itself from the competition and at the same time introduce a significant improvement to the web browsing experience. Combining a GUI with text-driven interaction is an incredibly powerful way to help people complete their day-to-day tasks more quickly and more easily.

Over thousands of years we have evolved language, an amazing way to describe abstractly both objects and the actions we wish to perform on them. Engineers have come up with the keyboard, an amazing way of entering language into a computer. Why can’t we use the keyboard to tell the computer, in language, what we wish it to do?

The general case is hard, but there are many simple things we can improve by applying this principle. The command line is the obvious initial, and successful, application. The dividing line between GUI and command line needs to be blurred, in order to create the next generation of user interfaces.

As an experiment, I am trying to get as much done as possible without the mouse on my new work-provided laptop. Given everything on a computer has a name, my first goal is to be able to access all programs and files using the keyboard alone — surely typing a name is faster than pushing a mouse all over my desk?

On my MacBook, I use a program called QuickSilver almost exclusively to open programs. QuickSilver has far more power than a simple program launcher, but for now this is mostly how I use it. On Windows, I have found a program called Launchy to accomplish the same task. Soon I should be able to achieve my first sub-goal of never using the mouse to navigate the Start Menu to find a program.

If you don’t know where you put a file, navigating the folder hierarchy with the mouse is painful: you are hunting for a needle in a haystack of folders and files. The keyboard lets you use the full semantic content of a file far more directly than a mouse ever can (for files with textual information, obviously). A folder hierarchy forces you to put a file in one place, but with the keyboard you are allowed as many places as you could ever need: access the same file using “budget”, “2007 accounts”, “customer records” or any other information the file contains.
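To make the “as many places as you could ever need” idea concrete, here is a minimal sketch of a keyword index, which is roughly the trick desktop search relies on. The file names and contents are invented for illustration, and a real tool would do far more (ranking, stemming, metadata); this only shows how one file becomes reachable through any of the words it contains.

```python
from collections import defaultdict
import re


def tokenize(text):
    """Split text into lowercase words and numbers."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(files):
    """Map each word to the set of file paths whose text contains it."""
    index = defaultdict(set)
    for path, content in files.items():
        for word in tokenize(content):
            index[word].add(path)
    return index


def search(index, query):
    """Return the paths that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical file names and contents, for illustration only.
files = {
    "docs/q3.txt": "Budget summary: 2007 accounts and customer records",
    "docs/notes.txt": "Meeting notes about the office move",
}
index = build_index(files)
```

With this index, search(index, "budget"), search(index, "2007 accounts") and search(index, "customer records") all lead to the same file, wherever it happens to live in the folder hierarchy.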

As an example of where this paradigm is taking hold, the makers of desktop search know keyboard interaction is key. All flavours of desktop search by default provide hotkeys to access the search. They recognise that you should not have to use the mouse to access the search interface. Search is a keyboard interaction, and you shouldn’t need to leave the keyboard to complete your search. Providing effective ways to narrow down and select results using the keyboard is a significant way some desktop search tools could improve.

As a second example, consider a full, QuickSilver-like interface for Word. This would be incredible. Imagine being able to type “insert table” or “set style heading 2” rather than reaching for the mouse or breaking your flow to remember an awkward hotkey. Retain the mouse-based interface (if you don’t know something’s name, the mouse offers great ways to find a function), but provide this method for anyone who uses Word more than occasionally. Forcing repeated mouse use in an application based around entering text is fundamentally flawed.
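As a rough sketch of what such a typed-command interface might do underneath, the snippet below matches whatever you have typed against a table of command names, so that “ins tab” is enough to reach “insert table”. The command table and its actions are placeholders I have made up; nothing here reflects how Word, QuickSilver or Launchy actually work.

```python
def matches(query, name):
    """True if every typed word is a prefix of some word in the command name."""
    name_words = name.lower().split()
    return all(
        any(word.startswith(part) for word in name_words)
        for part in query.lower().split()
    )


def find_commands(query, commands):
    """Return the matching command names, shortest name first."""
    if not query.strip():
        return []
    hits = [name for name in commands if matches(query, name)]
    return sorted(hits, key=lambda name: (len(name), name))


def run(query, commands):
    """Run the best match, or return None when nothing matches."""
    hits = find_commands(query, commands)
    return commands[hits[0]]() if hits else None


# Hypothetical command table: names mapped to placeholder actions.
commands = {
    "insert table": lambda: "table inserted",
    "insert picture": lambda: "picture inserted",
    "set style heading 2": lambda: "style set to heading 2",
    "new tab": lambda: "tab opened",
}
```

Typing “ins” offers both insert commands, while run("head 2", commands) goes straight to the heading style. Ranking by shortest name is an arbitrary choice for the sketch; ranking by how often you use each command would probably feel better in practice.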

Back to Firefox. It should be clear why the keyboard-based interactions mentioned in the linked article interest me. They matter more generally, beyond my own quest, because they try to meld the output capabilities of a GUI with the input capabilities of the keyboard and command line. Firefox already provides a set of ways to interact with the web using only the keyboard, but they are not terribly pleasant, intuitive or discoverable to the casual user.

Both QuickSilver and Launchy can tie into your web browser to allow you to type either a URL or the name of a bookmark to get to a web site. This is a far more direct way to achieve the objective of visiting a website. The computer is taking on the task of knowing how to get to the information wanted, rather than the user having to do that work. Concretely, I can type “gmail” into QuickSilver, rather than thinking I need to find and load a web browser before navigating to the site.

Firefox itself, however, can provide a much richer interface than any external application could. It can tie the keyboard into the core interaction model of the application. Current keyboard navigation within pages and the application is serviceable but rudimentary. If you use the keyboard, you get very little visual feedback to help you, and often you have to fall back on hotkeys to do things. A simple improvement would be much better feedback when using the keyboard to navigate through links on a page: make the currently selected link obvious, and provide a list of other possible matches to help the user get to the one they want.
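The link-navigation idea can be sketched in a few lines. Assume, purely for illustration, that the links on a page are available as a list of text and address pairs; this is not Firefox’s real internal API. As you type, the matches narrow, one of them is the current selection, and the rest are what the browser could show as alternatives.

```python
def filter_links(links, typed):
    """Return links whose visible text contains the typed text, ignoring case."""
    needle = typed.lower()
    return [link for link in links if needle in link["text"].lower()]


def describe_selection(links, typed, position=0):
    """Split the matches into the selected link and the remaining candidates.

    `position` counts how many times the user has asked for the next match;
    it wraps around so repeated presses cycle through the list.
    """
    matches = filter_links(links, typed)
    if not matches:
        return None, []
    selected = matches[position % len(matches)]
    others = [link for link in matches if link is not selected]
    return selected, others


# Hypothetical page links, for illustration only.
links = [
    {"text": "Download Firefox", "href": "/download"},
    {"text": "Firefox add-ons", "href": "/addons"},
    {"text": "Contact", "href": "/contact"},
]
```

Typing “fire” selects the download link and leaves the add-ons link as the other candidate; asking for the next match swaps them. The interesting work is in how the browser presents this, not in the matching itself.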

To interject briefly, hotkeys give the keyboard a bad name. Whilst hotkeys are undoubtedly efficient, they require investment to learn, and are somewhat opaque to many users. Ctrl-t? Why not allow the user to type “new tab”? Allowing interaction via the keyboard in a natural way is powerful. We should put technology currently going into speech control of computers into keyboard control. This would allow us to provide far superior models of keyboard interaction than programs currently have.

So, Firefox should use deep integration and advanced technologies to offer a superior and qualitatively different feel through rich keyboard interaction. This would surely provide a great incentive to switch to Firefox, especially for power users, as it would be a visible and obvious differentiator. The web is text based, so why don’t our web browsers provide ways for us to use this text for a more efficient and pleasurable experience?

Applications like QuickSilver and Launchy are showing how the power and efficiency of the command line can be combined with the intuitive but slow interaction of a graphical user interface. Further development of this interaction pattern across the desktop will introduce a significant new, less painful way to use applications. Firefox should lead this charge, and its developers obviously have good ideas as to how they could.
