by David Travis

4 forgotten principles of usability testing

Over the years, I’ve sat through dozens of usability tests run by design agencies. Clients have asked me to oversee the tests to make sure that the agency really puts their design through its paces. This is a good thing as it shows that usability testing is now becoming a mainstream activity in the design community. But many of the usability tests I’ve sat through have been so poorly designed that it’s difficult to draw any meaningful conclusions from them. No wonder that Fast Company mistakenly believe that user centred design doesn’t work.

Picture a usability test

If I ask you to picture one of these usability tests, you’ll probably conjure an image of a participant behind a one-way mirror, with video cameras and screen recording software. Although your picture would be accurate, there’s something missing: the hand of an experimental psychologist (or an experienced user researcher) checking that other factors are in place behind the scenes. You need this guiding hand because the technology can often obscure the key goals and principles of usability testing. If I wear a white coat and dangle a stethoscope around my neck, it doesn’t make me a medic. Similarly, if I record a picture-in-picture video of someone using a web site, it doesn’t mean I’m running a usability test.

Here are 4 principles of usability testing that have been absent in many of the tests I’ve observed.

Screen for behaviours not demographics.
Test the red routes.
Focus on what people do, not what they say.
Don’t ask users to redesign the interface.

Screen for behaviours not demographics

Participants are often recruited using a demographic screener that focuses on aspects such as gender and age. To understand why this is a mistake, next time you come across a Norman Door (a door whose handle screams ‘Pull!’ even though the door has to be pushed), take a seat nearby and watch a handful of people use it.

You won’t need to observe many people before you see that there’s a problem with the design of the door. It won’t matter if the people you observe are men or women, young or old, tall or short: virtually everyone will experience this problem. The ones who don’t have probably used the door before and know what to expect.

This short observation tells us a lot about the criteria you should use when writing the perfect participant screener. Trying to balance gender, age and other demographic factors is not just impossible, because of the small sample sizes used in a usability test, but pointless. These factors have a negligible impact on the usability of a system (unless it’s aimed exclusively at a particular group, such as women or seniors).

Instead, recruit users based on their behaviour: their previous experience with the domain that you’re testing. If you screen participants based on their demographic characteristics, you’ll end up compromising on the kinds of domain knowledge you need. You’ll end up with a demographically representative, but behaviourally biased, sample.
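As a rough illustration of what a behavioural screener might look like, here is a minimal sketch in Python. Everything in it is hypothetical: it assumes you are testing a travel-booking site, and the field names and the threshold of two bookings a year are invented for the example. The point is only that the pass/fail rule looks at what candidates have done, and never at their age or gender.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    # Demographic fields: recorded, but never used to screen.
    name: str
    age: int
    gender: str
    # Behavioural fields (hypothetical questions for a travel-booking site).
    booked_travel_online_last_6_months: bool
    online_bookings_per_year: int


def passes_screener(candidate: Candidate) -> bool:
    """Accept candidates on behavioural criteria only.

    The threshold of 2 bookings per year is an assumption for this sketch.
    """
    return (
        candidate.booked_travel_online_last_6_months
        and candidate.online_bookings_per_year >= 2
    )


candidates = [
    Candidate("A", 24, "female", True, 5),
    Candidate("B", 61, "male", True, 2),
    Candidate("C", 35, "female", False, 0),
]

recruited = [c.name for c in candidates if passes_screener(c)]
print(recruited)  # ['A', 'B']
```

Candidates A and B differ in age and gender but share the behaviour you care about, so both are recruited. Candidate C is screened out on behaviour alone.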

Test the red routes

Every kind of usability test, whether moderated or unmoderated, remote or lab-based, shares one feature: participants carry out tasks with the system. There are 6 main categories of usability test task, but whichever you choose, you need to make sure your tasks focus on the red routes: the tasks that are critical both to the user and to the organisation.

In contrast, a scenario like ‘Have a look around the home page and tell me what you think’ — one that I’ve seen used more than once in recent usability tests — is one of a series of usability test tasks you should avoid. The only people who arrive at a home page, look around at the design, and pass judgement on it are other designers. Real people don’t behave that way because real people have specific objectives in mind when visiting a site and it’s those objectives you need to test your site against.

Focus on what people do, not what they say

Lab-based usability tests use a small number of participants: 5 is a common sample size and more than 20 is rare, except in unmoderated, remote usability tests. These small sample sizes work because usability tests focus on cognitive, problem-solving behaviours, and when it comes to how the brain works, people really aren’t that different from each other.

Problems occur when usability testers shoehorn other research objectives into their usability tests. For example, questions like, ‘How much would you pay for this service?’, ‘What kind of brand values does this site elicit?’ or ‘What are your feelings about this design?’ are totally meaningless when asked in the context of a usability test. You might as well ask people who they will vote for at the next election: you’ll get an answer but the small sample size means it won’t have any predictive value. A more subtle kind of bias occurs when you ask participants to introspect on why they behaved in a particular way. There’s no point asking ‘Why?’ in a usability test because people can’t reliably introspect on their motivations.
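To put a rough number on the election comparison above, the sketch below uses the standard formula for the 95% margin of error of a proportion, z * sqrt(p * (1 - p) / n), with the worst case p = 0.5. This is the normal approximation, which is itself unreliable at very small sample sizes, so treat the figures as indicative rather than exact. The function name and the sample sizes chosen are assumptions made for the illustration.

```python
import math


def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a proportion estimated from n people.

    Uses the normal approximation; p = 0.5 gives the widest (worst-case) margin.
    """
    return z * math.sqrt(p * (1 - p) / n)


for n in (5, 20, 1000):
    print(f"n={n}: +/- {margin_of_error(n) * 100:.0f} percentage points")

# n=5: +/- 44 percentage points
# n=20: +/- 22 percentage points
# n=1000: +/- 3 percentage points
```

With 5 participants, an opinion question such as ‘How much would you pay?’ carries an uncertainty of roughly plus or minus 44 percentage points, which is why the answers tell you almost nothing about the wider population. A poll-sized sample of 1,000 brings that down to about 3 points.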

Don’t ask users to redesign the interface

When participants struggle with some aspect of the user interface, it’s very tempting to ask them how they would like it designed. But if design were that easy, we could all go home and leave it to participants to develop the next generation of user interfaces. So, after the participant has failed to choose the appropriate button, it’s frustrating to hear the moderator ask, ‘What would have made that button more obvious to you?’ The inevitable answer from the participant is, ‘Make it bigger,’ or even that old chestnut, ‘Make it a different colour.’

Asking people to redesign things introduces solutions into a process that’s designed to find problems. Just when you had them in ‘problem mode’, this line of questioning shifts their thinking. It tends to close down promising investigative threads and opportunities and you then risk not properly understanding the problems. Once users are shown a solution, or are asked to think of one for themselves, they get fixated on it and they struggle to think of anything else. The truth is that participants don’t know what’s possible, and asking them to generate design solutions will make them focus on the blindingly obvious or get them to focus on solutions that may work in one context but may not work in yours.

Conclusion: Avoid cargo cult usability testing

The physicist Richard Feynman once wrote about cargo cult science, where researchers adopt the paraphernalia of scientific activity but forget its core principles of empiricism, integrity and avoidance of bias. In the same way, people sometimes adopt the paraphernalia of usability testing, such as the one-way mirror and the video cameras, but forget the core principles of doing user research. Get those core principles right and you can run a great usability test with just a pencil and paper.

Originally published at www.userfocus.co.uk.