How to Run an Unmoderated Remote Usability Test (URUT)

Chris Gray

10 years ago

As UXers we practice in exciting times.

Design is in demand, and the tech sector is at the forefront of business innovation. It is also a time where we have access to a huge number of tools and techniques that enable us to innovate and adapt our practice for a broad range of scenarios.

Usability testing is a cornerstone of UX practice. It is perfect for evaluating the designs we create, flexible enough to collect a range of information about customers, and easy to combine with other techniques. In a usability test, representative participants undertake tasks on an interface or product. The tasks typically reflect the most common and important activities, and participants’ behavior is observed to identify any issues that inhibit task completion.

Usability testing is a super flexible technique that allows for the assessment of a variety of aspects of an interface including the broad product concept, interaction design, visual design, content, labels, calls-to-action, search and information architecture. It is a proven technique for evaluating products, and in some organisations is used as a pre-launch requirement.

In-person usability testing does, however, have some downsides:

It’s relatively time consuming. A lab-based study is typically completed with between 5 and 12 participants. Assuming each session takes 1 hour, with one facilitator running the sessions this would take between 1 and 3 days.
Recruiting participants to attend the sessions takes time and effort. Via a recruitment agency it takes a week at minimum to locate people for a round of testing.
Due to the time-intensive nature and cost of in-person usability testing, most studies are conducted with relatively small samples (i.e. fewer than 10). While a small sample is often adequate for exploring usability and iterating a product, some stakeholders have less confidence in these small sample sizes. This is often due to exposure to quantitative market research, where samples in excess of 500 people are common.
Sessions are conducted in an artificial environment. In-person tests are often lab-based or held in a corporate setting that may not reflect real-world use of the product.

One of the ways these downsides can be overcome is the use of unmoderated remote usability testing (URUT).

Let’s take a look at some of the basics of running URUTs.

What is URUT?

URUT is a technique that evaluates the usability of an interface or product; that is, the ease of use, efficiency and satisfaction customers have with the interface. It is similar to in-person usability testing; however, participants complete tasks in their own environment without a facilitator present. The tasks are pre-determined and are presented to the participant via an online testing platform.

There are two broad methods for URUT with varying ways for collecting participant behaviour and these are dictated by the technology platforms.

URUT utilising video recordings of participants interacting with interfaces. These studies are more qualitative in nature with participants thinking aloud during the recording to provide insight.

URUT where the behavior is captured via click-stream data and is run more like a survey. These studies are more quantitative in nature because larger sample sizes are practical and the systems automate tracking of user behaviour.

Both methods are designed to evaluate the usability of a product and both have strengths and weaknesses. Video-based sessions require more time to identify the findings and lend themselves to smaller samples; however, by listening to participants and observing their behavior, more information can be collected regarding the design. Click-stream methods allow for larger sample sizes and tend to be faster to complete due to the automation of data collection.

Note that some tools support both methods: click-stream data is collected for the large sample, and video is collected for a subset of the sample so that specific aspects of the design can be explored in more detail. More on the tools below.

When to use URUT?

Common scenarios where URUT is valuable include:

Where a large sample and/or a high degree of confidence is required: A small sample of in-person usability tests may be all that is required from a design perspective, but if your stakeholders are used to seeing large samples and buy-in with a small sample is difficult, then using big numbers may be simpler than trying to convince them of the value of the small sample. Further, where a new design is critical for an organisation or will have a substantial impact, the confidence gained from a large sample study can be valuable.
Where the audience is geographically dispersed or hard to access: The audience for some products is geographically spread and can be hard to access without travelling great distances. Imagine a health case management system for remote communities in the Kimberley. Also consider trying to access time-poor senior executives: they may be able to complete a 15 minute online study late at night at a time convenient for them, but not during the day or in a specific location.
Where speed is critical: Everyone working in the digital industry will have worked on a project that has tight timelines or is running behind schedule. Also, in today’s Agile workplaces, getting usability testing conducted quickly may be the only option. An URUT study can be run in its entirety in a couple of days, whereas a typical in-person study would take a week, if not longer.
Where a specific environment is critical: Some products will be used in environments that cannot be replicated in a lab, or their context of use is critical. For example, an app used outdoors in snowbound towns.
Where budgets are tight: Running 6 usability tests with a video recording technique, especially where the sample is fairly generic, can be inexpensive.
Where you need to compare 2 or more products or interfaces: URUT is perfect for benchmarking studies comparing either competitor products or different iterations of your product. The ability to capture large sample sizes means that statistically significant differences between interfaces can be identified.
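
To make the benchmarking scenario concrete, here is a minimal sketch of how a difference in task completion rates between two interfaces could be checked for statistical significance. It uses a standard two-proportion z-test, which is one common choice rather than the method any particular URUT tool applies. The function name and the sample figures are hypothetical.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for a difference between two task completion rates.

    Returns (z, p_value). Assumes independent samples that are large enough
    for the normal approximation to be reasonable.
    """
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled completion rate under the null hypothesis of no difference.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        # Everyone (or no one) completed the task on both interfaces.
        return 0.0, 1.0
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical figures: 140 of 200 participants completed the task on the
# current site, 165 of 200 on the new design.
z, p_value = two_proportion_z_test(140, 200, 165, 200)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

With the same completion rates but only 10 participants per interface, the test would not come close to significance, which is why the large samples that URUT makes practical matter for this kind of comparison.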

URUT tends to be less appropriate for more exploratory usability testing because it is not possible to change tasks mid-stream or ask impromptu questions. Click-stream tools tend to provide lots of data on what is happening but less insight into why the behavior is occurring. Video-based studies can be frustrating when there is a core question that you would love to ask but hadn’t planned for. For early-stage, low-fidelity prototypes, in-person usability testing tends to be preferable because the facilitator can provide more context for participants regarding the intended functionality of the interface.

How to run an URUT

Before you start testing: You need to fully understand why the research is being conducted. Like all UX research techniques this comes back to defining the objectives of the study. All good research requires a clear understanding of:

The objectives of the project.
The research questions, which spell out how we will explore the objectives.

Research Objectives	Research Questions
Evaluate the effectiveness of the booking process	Do participants understand the field labels?
	Do error messages support participants to progress?

Exploring these objectives and research questions with stakeholders at the outset will help with designing the study and provide a reference point for subsequent discussions. Spending the time up front to get this right will save time down the track and help ensure a successful study.

Audience

In order to run an URUT it is important to identify who will complete the study. Ideally the sample would be representative of the product audience. There are a number of options for sourcing participants:

Emailing the study to a database of existing customers. This assumes that you have customers.
An intercept can be run on a website with existing customers. That is, a pop-up on your site invites people to participate in the study. An advantage of this approach is that the sample is likely to be representative.
A panel is another option, especially when you don’t have an existing customer base. A panel is a database of people who have indicated that they would like to participate in research. Usually panel databases can be segmented to target a specific audience; however, you typically pay for the convenience. Some URUT tools have integrated participant filtering, which can be used to improve the representativeness of the sample.
Social media can be another means to locate a sample, especially for organisations that have an engaged following. With social media it is important to ensure that the sample is representative of your audience.

Offering some form of incentive, such as a gift voucher prize, may be required to motivate participants to complete the study. Audiences that are more engaged with the organization tend to require smaller incentives, and those that are less engaged a greater incentive.

Tasks

It is crucial to get the tasks right for URUT. It needs to be very clear to the participants what is required of them. Provide enough detail for the participant to complete the task on their own and try to include any information they would require to complete the task. For example, if a task requires credit card details, providing fictitious card details will be necessary.

Avoid adding extraneous information to a task, as it may confuse participants. Also avoid clues that tell the participant what to do. For example, avoid including the wording of a call-to-action in the task, which will give the task away.

And finally, ensure that the interface supports participants to actually complete the task and for them to be aware that they have done so. In a prototype this may require adding specific content. An example task: Imagine you have decided to stay in Cairns for the first week in September. Use this site to reserve accommodation and pay.

Include questions

It is recommended that survey questions be provided as part of a study.

Include closed questions after each individual task to measure ease of task completion. This will provide insight on which tasks are harder to complete than others. Also including open-ended questions will allow participants to describe their experience and any issues they encounter.
Questions can also be provided after the test as a whole, to allow an overall assessment of the experience. This could include metrics such as customer satisfaction with the product, Net Promoter Score and System Usability Scale, which can be used to benchmark the product over time and against competitors. Again open-ended questions should be used to allow participants to provide feedback and to understand why issues are occurring.
Questions can also be included with the intention of profiling participants. These can be helpful to understand the audience and/or to check that the sample matches a known audience.
Finally, questions can be used to understand whether participants have understood a task. This can be especially valuable on content sites. For example if you were testing the Australian Tax Office website, the task could be to find the tax rate for a given salary and then follow up with a question to ask what the rate is.
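
As a rough illustration of the post-test metrics mentioned above, the sketch below scores the System Usability Scale and Net Promoter Score using their standard published scoring rules. The function names and sample answers are hypothetical, and many URUT tools may calculate these for you.

```python
def sus_score(responses):
    """Score one participant's SUS questionnaire.

    `responses` holds the ten answers in questionnaire order, each on the
    usual 1 (strongly disagree) to 5 (strongly agree) scale.
    Returns a score between 0 and 100.
    """
    if len(responses) != 10 or any(r < 1 or r > 5 for r in responses):
        raise ValueError("SUS needs ten responses, each between 1 and 5")
    total = 0
    for index, answer in enumerate(responses):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9 are positively worded: contribute answer - 1.
            total += answer - 1
        else:
            # Items 2, 4, 6, 8, 10 are negatively worded: contribute 5 - answer.
            total += 5 - answer
    return total * 2.5


def net_promoter_score(ratings):
    """NPS from 0-10 'likelihood to recommend' ratings.

    Promoters rate 9-10 and detractors rate 0-6; the score is the
    percentage of promoters minus the percentage of detractors.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100 * (promoters - detractors) / len(ratings)


# Hypothetical answers from one participant and ratings from four.
print(sus_score([4, 2, 5, 1, 4, 2, 4, 2, 5, 1]))
print(net_promoter_score([10, 10, 9, 3]))
```

An individual SUS score is mostly useful when averaged across participants and tracked over time or against competitors, as described above.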

Test assets

What are you actually testing and how will the URUT tool and participants access the interface? Consider how you are going to set up the URUT tool and the prototype or interface being tested. The responsiveness of the interface you are testing can impact participants’ experience of the product. It is important to make sure that participants don’t need to do any set-up at their end; barriers to people completing the study will reduce completion rates. Try to ensure that the interface can be accessed from any computer or device the participant may be using.

Piloting

Testing the study with either a subset of participants or in a preview mode will allow issues with the prototype, technology, tasks or questions to be ironed out. Piloting the study will protect against wasting sample you are paying for or using up a small limited sample.

Tools

There are a number of different tools out there and more coming onto the market all the time. It is recommended that before running a study you explore some of the different options. Tools that support video recordings of participants include:

Tools that track click stream data include:

A tool like UserZoom collects both video and click-stream data.

Field-work

While the survey is being conducted it is important to monitor the data and be available for offering help to participants. Monitoring the data will ensure you see everything is working as planned and that you are receiving the data you need to meet your study objectives. Being available via email or phone helps manage the relationship with customers and to provide help where it is required.

Analysis

Once you have collected your results it is analysis time. To begin with look at some overarching metrics such as overall task completion and customer satisfaction. These can be automatically calculated in tools that measure click-stream like UserZoom. This will provide an overall feel for the effectiveness of the product. For video based tools you will need to watch the sessions and note whether the participants have been able to complete each of the tasks.

With an overall feel for the product, look into the individual tasks and identify those that are causing issues. Next you need to find out why. With video-based tools, watch video of specific tasks and observe behavior to identify the elements of the interface that are causing the issues. For click-stream services, focus on the pages visited during the task to identify participants’ behavior and where the issues have occurred (i.e. which screens). Also view open-ended feedback.
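
The first pass of this analysis can be sketched in a few lines, assuming the tool can export one record per participant per task. The record layout, field names and figures below are hypothetical; a real export will look different.

```python
from collections import defaultdict

# Hypothetical export: one record per participant per task, with a
# completion flag and a 1-7 ease-of-use rating from the post-task question.
results = [
    {"participant": "p1", "task": "book_room", "completed": True, "ease": 6},
    {"participant": "p2", "task": "book_room", "completed": True, "ease": 5},
    {"participant": "p3", "task": "book_room", "completed": False, "ease": 2},
    {"participant": "p1", "task": "find_price", "completed": True, "ease": 7},
    {"participant": "p2", "task": "find_price", "completed": True, "ease": 6},
    {"participant": "p3", "task": "find_price", "completed": True, "ease": 6},
    {"participant": "p1", "task": "cancel_booking", "completed": False, "ease": 2},
    {"participant": "p2", "task": "cancel_booking", "completed": False, "ease": 3},
    {"participant": "p3", "task": "cancel_booking", "completed": True, "ease": 4},
]


def summarise_tasks(records):
    """Return per-task completion rate and mean ease rating, hardest task first."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["task"]].append(record)

    summary = []
    for task, rows in grouped.items():
        completion_rate = sum(1 for r in rows if r["completed"]) / len(rows)
        mean_ease = sum(r["ease"] for r in rows) / len(rows)
        summary.append({
            "task": task,
            "n": len(rows),
            "completion_rate": completion_rate,
            "mean_ease": mean_ease,
        })
    # Lowest completion rate first, so problem tasks surface at the top.
    summary.sort(key=lambda row: row["completion_rate"])
    return summary


for row in summarise_tasks(results):
    print(f"{row['task']}: {row['completion_rate']:.0%} completed, "
          f"mean ease {row['mean_ease']:.1f} (n={row['n']})")
```

The tasks at the top of this list are the ones worth investigating further through the pages visited, the videos and the open-ended feedback.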

Tips for running URUT

Choose the testing platform after you have identified the objectives of the study. It is crucial to select a tool that is fit for purpose and will support your study objectives. Some platforms do not support specific technologies such as Flash and have limitations in the way they measure user behavior. As an example, I worked on a study recently that was evaluating a single-page app. In order to measure user interaction we needed to get our developers to insert additional code, because the tool tracked the URL, which did not change when users navigated between different content.

Set clear expectations for participants. Obtaining useful data is dependent on participants understanding what is expected of them. Setting clear expectations up front (during recruitment and at the start of the survey) about what participants are required to do and why the study is being conducted will help ensure success.

Remember that participants won’t receive any assistance during the study. It is crucial to ensure that tasks are clear, user friendly and that help is available. Consider how much assistance is available within the URUT tools for participants during the study.

Avoid bias. While all bias cannot be avoided, it is important to remove as much as possible. Randomise the order of tasks so that learning the interface during the study does not consistently favour performance on the later tasks. Task wording can also introduce bias. As discussed, pay attention to the wording of tasks to ensure that they effectively test the product.
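
Where a tool does not handle task randomisation for you, the idea can be sketched as follows. The function name, seed string and task names are hypothetical. Seeding the shuffle with the participant ID is one possible design choice: it means a participant who reloads the study sees the same order again.

```python
import random


def task_order_for(participant_id, tasks, study_seed="urut-study"):
    """Return a shuffled copy of the task list that is stable per participant."""
    # A dedicated generator seeded from the study and participant keeps the
    # order reproducible without touching the global random state.
    rng = random.Random(f"{study_seed}:{participant_id}")
    order = list(tasks)
    rng.shuffle(order)
    return order


tasks = ["book_room", "find_price", "cancel_booking", "update_details"]
for participant_id in ["p1", "p2", "p3"]:
    print(participant_id, task_order_for(participant_id, tasks))
```

With a large sample, each task then appears in each position roughly equally often, so any learning effect is spread across tasks rather than always benefiting the same ones.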

Keep participants engaged: Avoid participants quitting your study. Participants are more likely to complete the study if they feel like their feedback is valuable, if the tasks are interesting and the study isn’t too long.

Case study

A large corporate was about to implement a significant change to their site. Multiple rounds of in-person usability testing had been conducted and indicated that the new design would be a success. Due to the scale of the change the organisation wanted a high degree of confidence that the new design would enhance the experience. We ran a study which involved benchmarking the task completion rates, perceived ease of use and advocacy on the live site. We then repeated these on a prototype of the new design. By utilizing larger sample sizes, we had tight confidence intervals on core metrics that provided an accurate picture of the performance of the new design in comparison to the old.
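
To illustrate why larger samples give tighter confidence intervals on a metric such as task completion rate, here is a minimal sketch using the Wilson score interval. This is one common way to put an interval around a proportion, not necessarily the calculation used in the study above, and the figures are hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a completion rate.

    The default z of 1.96 gives an approximate 95% interval.
    Returns (lower, upper) as proportions between 0 and 1.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


# The same 80% completion rate observed with a small and a large sample.
for successes, n in [(8, 10), (400, 500)]:
    low, high = wilson_interval(successes, n)
    print(f"{successes}/{n} completed: {low:.1%} to {high:.1%}")
```

The observed rate is identical in both cases, but the interval around it is far narrower for the larger sample, which is the kind of confidence the organisation was looking for.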

Wrap-up

URUT is a technique that can offer quick, inexpensive and robust usability testing. Of particular value is the ability to use the technique for benchmarking and context-sensitive studies. It is a great tool to have in your bag of research techniques and can be a great complement to in-person methods. Exploring the different tools on offer and experimenting with the technique is the best way to learn and develop expertise.

Make it clear what is expected of participants, keep your research objectives in mind, and avoid bias. Good luck!!