Get people’s feedback on how the justice system works

User testing is one group of methods that can help us understand what people want from services, documents, & tech.

Practically, user testing can guide teams to understand what people can use and how best to support them. Strategically, user testing can help teams to put diverse people’s needs first — and imagine new offerings that better work for them.

A Menu of Early-Stage User Testing

Use these to determine if your new product or service in the justice system has value for its intended users.

You can also explore other evaluation methods — once you have gone from high-level, sketchy ideas to refined, developed implementations of the thing.

Before your organization invests in building a new technology or service, user testing can help you verify if it will be valuable enough to develop. There are multiple methods to use.

Early Stage User Testing, when you are defining your agenda:

  • Priority Sort: have people look at a wide variety of high-level ideas and judge their relative value. They will put them in buckets, spending pretend money on the ideas.
  • Idea book: make concept posters or other high-level presentations of your various ideas or features. Put them in a single book, like a catalog. Have the testers look through and rank which of the ideas they’d like and why.

Development User Testing, when you’re refining a new proposal

  • Feedback Interview: show your new design prototype to a stakeholder and interview them about how usable, useful, and engaging it is.
  • Over-the-shoulder observation: give them the prototype and watch as they try to use it. Note down breakdowns, confusion, and payoffs. You can possibly give them a persona card to help them understand what POV they are using it from.
  • Survey instruments: have the tester fill out a short survey, usually with Likert scale responses of 1-7 (levels of agreement). The questions can draw from surveys around Usability, Design for Dignity, and Procedural Justice.

Pre-Pilot User Testing, before you launch

  • Capability-Improvement Testing: have people use the design prototype, and after they are done, give them a quiz to measure how much of the important content they have understood and retained, and whether their strategy-making and confidence have improved.
  • Burden Testing, to see how costly or time-consuming it is to use the innovation.
  • Procedural Justice Testing, to see if it makes people feel the justice system is more fair, efficient, and transparent.

User Testing Guides

Civic User Testing Guide

The Smart Chicago Collaborative has a guide to running usability tests for civic (or public interest) websites, apps, and other technology tools.

It presents a very practical overview of how to run User Experience and Usability testing, especially with diverse community members in public places.

Use this guide to structure recruitment, research protocols, compensation, analysis, and other detailed tasks.

18F Remote Usability Testing Guide

The federal government design & tech group 18F has a series of extensive articles that walk through how civic teams can run usability tests of digital services, especially in a virtual situation.

These articles from 18F include examples, research protocols, ethical considerations, and more guidance on how to run remote user-centered tests well.

Early-Stage User Testing Guide

The Legal Design Lab presents this short book to walk other teams through practical design and testing methods.

Our team wrote this short book, User Testing New Ideas, to walk through exactly how we ran user testing for new traffic court-oriented redesigns.  We captured the steps, tools, and ethical considerations we took when doing early-stage testing of new prototypes.

Ethical Guidance for User Testing

This short book from 2017 encapsulates some of the design training that we give to our students before they go into the field to conduct interviews or testing with members of the community.

Early-Stage User-Testing

Games to Rank Ideas & Set Agenda

Use an Idea-Ranking game called a ‘Priority Sort’ to get users’ feedback on early-stage ideas for innovation. You can present brainstormed ideas, to decide which ones should move forward to the next design stage.

You can run this kind of early-stage user research at courts.

In this 2018 illustrated article, “Doing User Research in the Courts on the Future of Access to Justice,” we profile in detail how we ran this early-stage user-testing of ideas that had been proposed to improve court for litigants. We did this in court self-help centers, and detail our methodology, tools, and results. 

Aldunate, G. et al., 2018. Legal Design and Innovation. Available at: 
Also read more at Hagan, Margaret. “Participatory Design for Innovation in Access to Justice.” Daedalus 148, no. 1 (2019): 120–27.

More on Idea Ranking, Priority Sorting, & Agenda-Setting

Use this game-like evaluation method to sort through lots of high-level ideas. This kind of evaluation can let your team gather a large number of stakeholders’ feedback about which ideas should move forward to the agenda.

You can put the ideas on cards or post-its, and the users can individually rank them — or a working group can together come to a consensus about which of the High-Medium-Low-No value categories each idea should be placed into.

Your team can run ‘Priority Sorts’ or ‘Idea Book Voting’ to have people rank ideas, and sort which should be abandoned and which should be forwarded.

You can give early, sketchy prototypes to target users, and observe if and how people use them, give people usability surveys to rank how user-friendly they are, or run co-design sessions to improve them.

Often we give fictional money to participants, to distribute among different ideas.

An Idea Ranking exercise, with fictional money allocated by users to different prospective innovations.
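As a rough illustration of how a team might tally a fictional-money exercise afterwards, here is a minimal Python sketch. The idea names, the dollar amounts, and the function name `tally_priority_sort` are all hypothetical, made up for this example. It simply adds up what each participant allocated to each idea and lists the ideas from most-funded to least-funded.

```python
from collections import Counter


def tally_priority_sort(allocations):
    """Sum the fictional money each idea received across all participants.

    `allocations` is a list with one dict per participant, mapping an
    idea name to the amount of pretend money that participant gave it.
    Returns (idea, total) pairs, highest total first.
    """
    totals = Counter()
    for participant in allocations:
        for idea, amount in participant.items():
            totals[idea] += amount
    return totals.most_common()


# Hypothetical results from three participants (not real data).
allocations = [
    {"Text reminders": 50, "Court kiosk": 30, "Online forms": 20},
    {"Text reminders": 40, "Online forms": 60},
    {"Court kiosk": 70, "Text reminders": 30},
]

ranking = tally_priority_sort(allocations)
for idea, total in ranking:
    print(f"{idea}: {total}")
```

The totals are only a starting point for discussion. The reasons participants give for their allocations usually matter as much as the sums.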

Quick Review Sheet for Early Ideas

The Idea Review is an initial survey-like tool that stakeholders can use to quickly evaluate a concept that has been proposed. We tend to use it when there are a handful of ideas in contention for development. This review can be done when groups are presenting their idea at a ‘demo’ or ‘idea presentation’ session. Usually, it’s filled out by different experts and users. The team then collects the different ratings and comments to decide which ideas should move forward.

Idea Review sheet

We use the Idea Review at the Legal Design Lab in order to judge ideas as they have emerged out of brainstorms, and after they’ve been sketched out in rough prototypes.

We present this review tool both to subject matter experts and target users, for them to give feedback that we can easily process. Often people have it when listening to a presentation or a demo of a new thing. They can fill in the Idea Review sheet to help us rank whether and how to move forward with it.

It’s a quick tool to get a mix of quantitative and qualitative feedback — and it’s best used when more than one idea is being reviewed.

User Testing During Development

Prototype Review

Once one idea is chosen as the likely new pilot, there are methods to evaluate the prototypes of this new intervention. This is still the early, pre-pilot stage, but it helps your group make the new service much more likely to be engaging, usable, and user-friendly.

As your team makes medium- or high-fidelity mockups of new documents, products, tech, services, or policy, you can invite stakeholders in to review these prototypes’ details to give you critical feedback on them.

Are they worth continuing on? How should they be edited? What must be improved before we invest more resources in them?

Have participants draw on your project plans, rank the different features and options, and give user experience and usability assessments of them.

Usability Protocols

Design and technology researchers have established protocols for assessing websites, documents, and applications. These usability protocols can help justice teams figure out how usable (or difficult) interventions are. We go over these scales and questions here.

Net Promoter Score (NPS) for an overall UX metric

Justice teams can use the Net Promoter Score (NPS) metric as a very simple, straightforward way to assess the quality of their intervention, from the user’s point of view. How do you use the NPS? You ask the user, after they have used the intervention, the question:

How likely are you to recommend this [intervention type, like website/form] to your friend?

Answer on a scale of 0 (low) to 10 (high).

Then the team can count how many Promoters (ratings of 9-10), Passives (ratings of 7-8), and Detractors (ratings of 0-6) they have. Then get the overall NPS by taking the percentage of Promoters and subtracting the percentage of Detractors.

The team will then have the NPS score for their intervention. They can use this to understand, overall, what people’s user experience of the intervention is.
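The NPS arithmetic can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `net_promoter_score` and the sample ratings are made up for the example.

```python
def net_promoter_score(ratings):
    """Return the NPS (from -100 to 100) for a list of 0-10 ratings."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 0 or r > 10 for r in ratings):
        raise ValueError("ratings must be between 0 and 10")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    # Passives (7-8) count toward the total but toward neither group.
    return 100 * (promoters - detractors) / len(ratings)


# Hypothetical ratings from ten testers (not real data):
# 5 Promoters, 2 Passives, 3 Detractors.
sample_ratings = [10, 9, 9, 8, 7, 6, 4, 10, 3, 9]
nps = net_promoter_score(sample_ratings)
print(nps)  # 50% Promoters minus 30% Detractors gives 20.0
```

With the small groups typical of early-stage testing, a single tester can swing the score by many points, so it is safest to read the NPS alongside the qualitative feedback.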

Read more about the NPS from the Nielsen Norman Group.

System Usability Scale 10-question survey

Justice teams can use the popular System Usability Scale (SUS) to ask user-testers about how friendly or difficult their intervention is. The SUS is a set of questions to ask a tester after they have gone through the document, website, tool, or other ‘thing’.

The team would have given the tester their scenario & then given them the intervention to try to use. Then the team should ask them out loud, or have them fill in a paper or online survey with these questions. The answers will be on a Likert scale of 1-5, from Strongly Disagree to Strongly Agree.

1. I think that I would like to use this system frequently.

2. I found the system unnecessarily complex.

3. I thought the system was easy to use.

4. I think that I would need the support of a technical person to be able to use this system.

5. I found the various functions in this system were well integrated.

6. I thought there was too much inconsistency in this system.

7. I would imagine that most people would learn to use this system very quickly.

8. I found the system very cumbersome to use.

9. I felt very confident using the system.

10. I needed to learn a lot of things before I could get going with this system.

Then the team can combine the ratings across the ten questions into a single usability score, from 0 to 100, for their intervention.
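Because the odd-numbered SUS statements are worded positively and the even-numbered ones negatively, the ratings are not simply added together. In the standard SUS scoring rule, each odd-numbered item contributes its rating minus 1, each even-numbered item contributes 5 minus its rating, and the sum is multiplied by 2.5 to give a score from 0 to 100. A minimal Python sketch follows. The function name `sus_score` and the sample responses are assumptions for illustration.

```python
def sus_score(responses):
    """Score one tester's ten SUS ratings (each 1-5), in question order.

    Returns a value from 0 (worst) to 100 (best).
    """
    if len(responses) != 10:
        raise ValueError("SUS needs exactly 10 responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")
    total = 0
    for index, rating in enumerate(responses):
        if index % 2 == 0:
            # Questions 1, 3, 5, 7, 9 are positively worded.
            total += rating - 1
        else:
            # Questions 2, 4, 6, 8, 10 are negatively worded.
            total += 5 - rating
    return total * 2.5


# Hypothetical responses from one tester (not real data).
one_tester = [4, 2, 4, 1, 5, 2, 4, 2, 4, 1]
score = sus_score(one_tester)
print(score)  # 82.5
```

A team would typically compute this per tester and then average across testers. The result is a score on a 0-100 scale, not a percentage.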

Single Ease Question task analysis

This protocol focuses on each part of an intervention (rather than the intervention as a whole). The team would ask this question regularly as a person is going through the intervention. It can help the team spot which parts of their workflow are problematic, and which are usable.

The team would ask the user, after different tasks within the intervention, this 1-7 Likert scale question:

Overall, this task was?

On a scale of 1-7, Very Difficult to Very Easy.

The task will still be fresh in the user’s mind, and the team can record the different difficulty levels regularly to compare the tasks’ difficulty.
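To compare tasks, a team could average the Single Ease Question ratings per task and list the hardest tasks first. The following is a minimal Python sketch under that assumption. The task names, the ratings, and the function name `seq_by_task` are hypothetical.

```python
from statistics import mean


def seq_by_task(responses):
    """Average 1-7 ease ratings per task, hardest (lowest) task first.

    `responses` is a list of (task_name, rating) pairs collected from
    all testers.
    """
    by_task = {}
    for task, rating in responses:
        if not 1 <= rating <= 7:
            raise ValueError("ratings must be between 1 and 7")
        by_task.setdefault(task, []).append(rating)
    averages = {task: mean(ratings) for task, ratings in by_task.items()}
    return sorted(averages.items(), key=lambda pair: pair[1])


# Hypothetical ratings from two testers on three tasks (not real data).
responses = [
    ("Find the form", 6), ("Find the form", 7),
    ("Fill in the form", 3), ("Fill in the form", 4),
    ("File the form", 5), ("File the form", 5),
]

task_ranking = seq_by_task(responses)
for task, average in task_ranking:
    print(f"{task}: {average}")
```

Here the lowest average points to the task that most needs redesign, in this made-up case filling in the form.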

NASA-TLX: Post-Task Workload

Another task-by-task analysis is from the world of high-stakes, complex products (like in aerospace or the military). This can assess how demanding and difficult individual tasks are in an overall workflow.

The NASA-TLX instrument has 6 questions that are on a 21-point scale from Very Low to Very High. The team asks these questions after each task. Then they also ask the user to weight which of these questions/categories matter most to them. NASA has an application to use for this complex set of questions and weighting.

Mental Demand: How mentally demanding was this task?
Physical Demand: How physically demanding was the task?
Temporal Demand: How hurried or rushed was the pace of the task?
Performance: How successful were you in accomplishing what you were asked to do?
Effort: How hard did you have to work to accomplish your level of performance?
Frustration: How insecure, discouraged, irritated, stressed, and annoyed were you?
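As a sketch of how the ratings and weights combine: in the standard NASA-TLX procedure, the user compares the six categories in pairs (15 comparisons), each category's weight is the number of times it was chosen (0 to 5, summing to 15), and the overall workload is the weighted average of the six ratings. The Python below assumes the 21-point ratings have already been converted to a 0-100 scale in steps of 5. The function names and sample numbers are hypothetical.

```python
from statistics import mean

DIMENSIONS = (
    "mental", "physical", "temporal",
    "performance", "effort", "frustration",
)


def tlx_weighted(ratings, weights):
    """Weighted NASA-TLX workload (0-100) for one task and one user.

    `ratings` maps each dimension to a 0-100 rating.
    `weights` maps each dimension to how often the user picked it in
    the 15 pairwise comparisons, so the weights must sum to 15.
    """
    if sum(weights[d] for d in DIMENSIONS) != 15:
        raise ValueError("pairwise-comparison weights must sum to 15")
    return sum(ratings[d] * weights[d] for d in DIMENSIONS) / 15


def tlx_raw(ratings):
    """Unweighted ("raw") TLX: the plain average of the six ratings."""
    return mean(ratings[d] for d in DIMENSIONS)


# Hypothetical ratings and weights for one task (not real data).
ratings = {"mental": 70, "physical": 10, "temporal": 40,
           "performance": 30, "effort": 60, "frustration": 50}
weights = {"mental": 5, "physical": 0, "temporal": 2,
           "performance": 1, "effort": 4, "frustration": 3}

weighted = tlx_weighted(ratings, weights)
raw = tlx_raw(ratings)
print(round(weighted, 1), round(raw, 1))
```

Some teams skip the weighting step and report only the raw average, which is quicker to administer. Whichever version is used, it should be applied consistently across tasks so the scores stay comparable.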

Usability and Dignity Evaluation Instrument

This evaluation protocol combines usability-oriented assessments with changes to someone’s sense of dignity.

When we ask people for short feedback on our new technology offerings, service designs, or information design, we use an evaluation instrument that we’ve created. It’s a short survey evaluation that incorporates assessments from established survey instruments to evaluate software’s usability, to get citizens’ feedback on government services, and to assess people’s sense of procedural justice and dignity while using an offering.

For each of these questions, we use a Likert scale, of 0 (Disagree Strongly) to 7 (Agree Strongly).

I think that I would like to use this system often to help me [insert objective: communicate with the court, navigate court process, etc.]

I thought the [design name] was easy to use.

I felt very confident using the [design name].

This will help me to get through court more efficiently.

This gave me clear, helpful information.

I felt that I was understood using the [design name].  

I wish I could take [design name] around [place/system name] with me.

 I felt the [design name] provided most of the information I was looking for. 

I felt that the [design name] could be improved.

Legal Capability Improvement measurement

Another method to evaluate a prototype is Capability Improvement. This technique can help you determine what effect your intervention (like a new form, website, or app) might have on a person’s ability to navigate their legal problem and solution.

This technique can help you determine if your intervention can make people more likely to engage with the legal tasks, more informed about the correct info, and more strategic in making choices that are in their best interest.

Most early-stage Capability Improvement tests focus on measuring usability, user experience, and knowledge testing. This means having a small number of people, representative of the target population, use your new intervention (and possibly some other versions). Your testing team will gather both qualitative and quantitative feedback.

Qualitative Information on engagement and capability:

Ask: What did you like?

Ask: What did you find confusing?

Observe what they skipped or ignored. Ask them why they skipped it.

Observe what they complain about, or where they express frustration. Ask them, why was this frustrating?

Ask: What made you feel a sense of dignity?

Ask: What made you feel more knowledgeable?

Ask: What about this would make you recommend it to someone else going through this problem?

In addition, you gather Quantitative Information on changes to the testers’ legal capabilities:

Do they fully engage with all of the tasks?

Do they complete the process?

Do they pay attention to all that is being communicated to them (measured by eye-tracking or page-recording)?

After they use the intervention, do they answer key Knowledge Questions correctly (in a quiz)?

After they use the intervention, do they have a concrete, and expert-approved strategy of next steps?

How long do they have to spend to understand the information, and answer questions/form strategies correctly?

Quiz-based capability testing

Often Capability Testing is done by giving participants scenarios, and having them try to answer ‘quiz’ questions that test their knowledge and their strategy-making. For example, this is a Legal Capability evaluation from Catrina Denvir in a study of legal education online. She gave recruited participants a fictional scenario, with a ‘persona’ to play. Then she asked them legal knowledge & strategy questions, to determine if a new website intervention improved their ability to correctly answer the questions. The quiz questions can help more directly measure the impact of an intervention in improving a person’s legal knowledge, and thus their capability to deal with their justice problem.
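One simple way to summarize quiz results is to compare the average share of correct answers between testers who used the intervention and testers who did not. The Python below is a minimal sketch of that comparison only. It is not the method used in the study mentioned above, and the answer key, the groups, and the function names are all hypothetical.

```python
from statistics import mean


def share_correct(answers, answer_key):
    """Fraction of quiz questions one participant answered correctly."""
    correct = sum(
        1 for question, right in answer_key.items()
        if answers.get(question) == right
    )
    return correct / len(answer_key)


def group_average(group, answer_key):
    """Average share of correct answers across a group of participants."""
    return mean(share_correct(answers, answer_key) for answers in group)


# Hypothetical answer key and responses (not real data).
answer_key = {"q1": "b", "q2": "a", "q3": "c", "q4": "d"}

without_intervention = [
    {"q1": "b", "q2": "c", "q3": "c", "q4": "a"},  # 2 of 4 correct
    {"q1": "a", "q2": "a", "q3": "b", "q4": "a"},  # 1 of 4 correct
]
with_intervention = [
    {"q1": "b", "q2": "a", "q3": "c", "q4": "a"},  # 3 of 4 correct
    {"q1": "b", "q2": "a", "q3": "c", "q4": "d"},  # 4 of 4 correct
]

baseline = group_average(without_intervention, answer_key)
treated = group_average(with_intervention, answer_key)
improvement = treated - baseline
print(baseline, treated, improvement)
```

With only a handful of testers, a difference like this is a signal for further testing rather than proof that the intervention works.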

User Testing with Persona Task Cards

While testing out a new website, guide, hotline, app, or service, you can use persona task cards to help different team members experience this innovation from a different perspective. You can give your tester a ‘persona card’, so they know whose point of view they are evaluating the innovation from.

Often in very early-stage testing we have people test from a different person’s perspective. We give them ‘personas’ to play, so that they scrutinize the design from these various points of view. We know that this is not as good as testing with a wide range of people from these different backgrounds, but it is a test-run that lets us see what issues we can spot with a design before investing in wider testing.

Here are some example personas that we give to people:

  • Persona 1: 22-year-old digital native, very confident in technology, prefers to text over phone calls and sometimes even over in-person communication, feels higher confidence in their ability to figure things out, especially using Google and looking through social media, but feels relatively out of their depth in the legal system.
  • Persona 2: 65-year-old who is a first-time user of a legal system, but has dealt with lots of other complex social systems like health insurance, social security, taxes, etc. They are definitely not very confident with technology, but do email a lot, still use AOL, and just moved to the most basic smartphone this year upon the insistence of their kids.
  • Persona 3: 42-year-old who has been to court several times to deal with divorce, custody, and parenting plans. They have had enough repeat visits to feel confident about how to navigate the system and the relationships. They feel literate, but still want support to get things right.
  • Persona 4: 31-year-old who has very limited English proficiency. They have been through immigration proceedings with the help of family and friends before, but they definitely don’t feel confident in going to court by themselves because of the language and because of the unfamiliarity of the system.
  • Persona 5: 18-year-old who is coming with their older family member to help translate for them in court. They are literate in English, and feel confident with technology. But they are not familiar with the legal system at all. They grew up in the US, and feel they can also help with the cultural translation for their family members.

More Readings on User Testing

Civic User Testing book

O’Neil, Daniel X, and Smart Chicago Collaborative. Civic User Testing Group as a New Model for UX Testing, Digital Skills Development, and Community Engagement in Civic Tech. Chicago: The CUT Group, 2019.

User Testing 4 Different Prototypes

Hagan, Margaret. “Community Testing 4 Innovations for Traffic Court Justice.” Legal Design and Innovation, 2017. 

Pop-Up User Research in the Courts

Aldunate, Guillermo, Margaret Hagan, Jorge Gabriel Jimenez, Janet Martinez, and Jane Wong. “Doing User Research in the Courts on the Future of Access to Justice.” Legal Design and Innovation. Stanford, CA, July 2018. 

Remote Usability Testing

Maier, Andrew, and Sarah Eckert. “Introduction to Remote Moderated Usability Testing, Part 2: How.” 18F, US General Services Administration agency, November 20, 2018.

18F Toolkit for User-Centered Design

18F. “18F Methods: A Collection of Tools to Bring Human-Centered Design into Your Project.” US General Services Administration, 2020.

Participatory Design of Justice Innovations

Hagan, Margaret. “Participatory Design for Innovation in Access to Justice.” Daedalus 148, no. 1 (2019): 120–27.

Abstract: Most access-to-justice technologies are designed by lawyers and reflect lawyers’ perspectives on what people need. Most of these technologies do not fulfill their promise because the people they are designed to serve do not use them. Participatory design, which was developed in Scandinavia as a process for creating better software, brings end-users and other stakeholders into the design process to help decide what problems need to be solved and how. Work at the Stanford Legal Design Lab highlights new insights about what tools can provide the assistance that people actually need, and about where and how they are likely to access and use those tools. These participatory design models lead to more effective innovation and greater community engagement with courts and the legal system.

Doing Human-Centered Design with Courts

Hagan, M.D., 2018. “A Human-Centered Design Approach to Access to Justice: Generating New Prototypes and Hypotheses for Intervention to Make Courts User-Friendly.” Indiana Journal of Law and Social Equality, 6(2), pp.199–239. 

Abstract: How can the court system be made more navigable and comprehensible to unrepresented laypeople trying to use it to solve their family, housing, debt, employment, or other life problems? This Article chronicles human-centered design work to generate solutions to this fundamental challenge of access to justice. It presents a new methodology: human-centered design research that can identify key opportunity areas for interventions, user requirements for interventions, and a shortlist of vetted ideas for interventions. This research presents both the methodology and these “design deliverables” based on work with California state courts’ Self Help Centers. It identifies seven key areas for courts to improve their usability, and, in each area, proposes a range of new interventions that emerged from the class’s design work. This research lays the groundwork for pilots and randomized control trials, with its proposed hypotheses and prototypes for new interventions, that can be piloted, evaluated, and — ideally — have a practical effect on how comprehensible, navigable, and efficient the civil court system is.

Legal Help Design Testing & Strategies

The Michigan Advocacy Project & Graphic Advocacy Project have a guide to using legal design strategies to improve your legal technology efforts.

These include guides for user testing & principles about how to make legal help that works for many different types of people.
