Evaluation Methods of Justice Innovations

What works in increasing access to justice?

Evaluation methods can help us create more effective legal help interventions, and they can confirm that an intervention is working as intended.

On this page, you can find resources, experiments, and case studies on user testing, pilot evaluation, and other outcomes research on justice innovations.

Four Areas for Justice Evaluation

Use these evaluation methods in the design process & then for services as they are piloted and scaled.

  1. Early-Stage Evaluation during the Design of a Legal Help Service: How do we know what kind of justice innovation is needed? Does our new idea for legal help work? How can we make the strongest version of a new idea?
  2. Pilot Evaluation during the early Deployment of a Legal Help Service: Is this thing working as intended? What wrong assumptions or choices were made in the design that need to be fixed? What bugs or performance issues must be improved? Does it increase people’s access to justice, on key indicators?
  3. Evaluation of an Established Service/Tool: Even if a clinic, website, policy, or other ‘intervention’ is well-established, it is still worth gathering feedback about whether it is making an impact & what ideas there are for improving it.
  4. Ongoing Feedback: Apart from particular interventions, justice institutions can have regular feedback and data-gathering from their clients and peers. This ongoing data can give organizations information about what is changing, what is needed, and ideas for improvement.

Please explore these resources to find better ways to develop promising justice interventions, and to gather constant feedback on impact in order to improve the system.

Early Stage Evaluation

Methods for Early Stage Evaluation

What are the methods we can use to understand if our new ideas, rough prototypes, or proposals are feasible, viable, and desirable for stakeholders and the system?

Here is a quick overview of early-stage User Feedback tools. You can run these activities with your team, or in controlled ‘lab’ situations. These methods allow you to rank ideas, choose which have the most potential, and decide which ideas you take to pilot.

These early stage evaluation methods allow for quick, affordable ways to screen ideas, and to refine them to be more likely to work with the target users.

For example, you can run ‘Priority Sorts’ or ‘Idea Book Voting’ to have people rank ideas, and sort which should be abandoned and which should be forwarded. You can give rough prototypes to target users and observe if and how people use them, give people usability surveys to rank how user-friendly the prototypes are, or run co-design sessions to improve them.

Please see a fuller write-up on our User Testing methods page.



[Image: UX Heuristic Review - Idea Review sheet for wise design]


There is real value in costing out and comparing the (Administrative) Burden Costs of a given form, task, or other process step in a person’s legal journey.

Many federal agencies are required to do this Burden Cost evaluation whenever they change procedures involved in users’ interactions with social security, taxes, disability processes, or food stamp access. It’s required at the federal level by the Paperwork Reduction Act, which requires an agency to measure the effect of a procedural change by looking at:

  • The time it takes for an average person to fill in the given form or do the required task
  • The time it takes to prepare the documents or get the information to correctly fill in the form/do the task
  • An assumed cost of the person’s time (e.g., $15/hour)
  • The number of people who will, on average, have to go through this

This basic calculation will allow you to produce a numeric estimate of how much this form or step costs ‘the public’:

((Time to fill + prep) * $15/hour) * # of people doing this = Burden Cost

Having Burden Costs — or comparing them across different proposed forms — can be a very influential way to pressure policy-makers or support an argument around process simplification. In the access to justice space, you could do Burden Cost calculations that include:

  • Time to search for and find the correct form
  • Time to look up and understand the words being used in the form
    • (Optional: time to get help at Self Help Center, including waiting in line and being seen)
    • (Optional: time to call legal aid, get screened, see if they can help you, be helped)
  • Time to read the form and fill in the questions
  • Time to prep/make copies/ get ready for filing
  • Time to file it in the courthouse
  • Time to deal with any problems with filing

You can calculate these time costs by doing these steps yourselves, or by having research participants do some or all of them. It could also be done by gathering informed estimates of these timings from experts.
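The Burden Cost formula above can be sketched as a small calculation. This is a minimal sketch: the $15/hour rate comes from the text, but the timing figures and number of people below are illustrative placeholders, not measured values.

```python
# Hypothetical Burden Cost calculation, following the Paperwork
# Reduction Act-style formula described above.

HOURLY_RATE = 15.00  # assumed cost of a person's time, in $/hour


def burden_cost(fill_minutes, prep_minutes, num_people, rate=HOURLY_RATE):
    """Total estimated cost to 'the public' of one form or process step."""
    hours_per_person = (fill_minutes + prep_minutes) / 60
    return hours_per_person * rate * num_people


# Illustrative example: a form that takes 45 minutes to fill in and
# 30 minutes of preparation, completed by 10,000 people per year.
total = burden_cost(fill_minutes=45, prep_minutes=30, num_people=10_000)
print(f"Estimated annual burden cost: ${total:,.0f}")  # → $187,500
```

The same function can be run twice — once with the current form’s timings and once with a proposed simplified version — to put a dollar figure on the savings from simplification.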


Benchmarking is a key evaluation and research technique, especially for a discrete work product like a new Form or Info Sheet your team is creating. In benchmarking, you compare your intervention to established principles and criteria from past design work (or, if benchmark standards do not yet exist, you invest in establishing them yourselves by looking at others’ practices).
For those working on legal communications, for example, there are some Benchmark Criteria from other groups working on parallel efforts to improve forms.
The first is from the Simplification Centre, which works on simplifying government documents generally. The second is from the UK’s Behavioural Insights Team, from their massive study of improving privacy policy documents.
You can compare your new legal communication design (like your FAQ, Info Sheet, Letter, or Form) to these established principles, make tweaks as needed, and then communicate to the Judicial Council that you are following established benchmark standards.
(See full 16 criteria from Simplification Centre and explanation here, and see more details on the Behavioral Insights Team’s experiments/goals at https://www.bi.team/blogs/terms-conditions-apply/)


Another method to evaluate an early prototype is Capability Improvement. This technique can help you determine what effect your intervention (like a new form, website, or app) might have on a person’s ability to navigate their legal problem and solution.

This technique can help you determine if your intervention makes people more likely to engage with the legal tasks, better informed about the correct information, and more strategic in making choices that are in their best interest.

Most early-stage Capability Improvement tests focus on measuring usability, user experience, and knowledge. This means having a small number of people, representative of the target population, use your new intervention (and possibly some other versions). Your testing team will gather:

Qualitative Information on engagement and capability:

  • What they like
  • What they find confusing
  • What they skip or ignore
  • What they complain about
  • What they say improves their sense of dignity, knowledge, or likelihood to use the tool or recommend it to friends

In addition, you gather Quantitative Information on changes to the testers’ legal capabilities:

  • Do they fully engage with all of the tasks?
  • Do they complete the process?
  • Do they pay attention to all that is being communicated to them (measured by eye-tracking or page-recording)?
  • After they use the intervention, do they answer key Knowledge Questions correctly (in a quiz)?
  • After they use the intervention, do they have a concrete, and expert-approved strategy of next steps?
  • How long do they have to spend to understand the information, and answer questions/form strategies correctly?

Often Capability Testing is done by giving participants scenarios and having them answer ‘quiz’ questions that test their knowledge and their strategy-making. For example, there is a Legal Capability evaluation from Catrina Denvir, in a study of legal education online. She gave recruited participants a fictional scenario, with a ‘persona’ to play. Then she asked them legal knowledge & strategy questions, to determine if a new website intervention improved their ability to correctly answer the questions. The quiz questions can help more directly measure the impact of an intervention in improving a person’s legal knowledge — and thus their capability to deal with their justice problem.
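The quantitative side of this kind of test can be sketched as a simple pre/post quiz comparison. Everything below — the question IDs, answer key, and participant responses — is invented for illustration; a real study would use validated questions reviewed by legal experts.

```python
# Hypothetical scoring of a pre/post knowledge quiz in a Capability
# Improvement test. Question IDs and answers are made-up placeholders.

ANSWER_KEY = {"q1": "b", "q2": "a", "q3": "d"}


def quiz_score(responses):
    """Fraction of knowledge questions a participant answered correctly."""
    correct = sum(1 for q, a in ANSWER_KEY.items() if responses.get(q) == a)
    return correct / len(ANSWER_KEY)


def capability_change(pre, post):
    """Change in quiz score after using the intervention."""
    return quiz_score(post) - quiz_score(pre)


pre = {"q1": "a", "q2": "a", "q3": "c"}   # answers before using the tool
post = {"q1": "b", "q2": "a", "q3": "d"}  # answers after using the tool
print(f"Score improved by {capability_change(pre, post):.0%}")
```

Averaging this change across participants — and comparing it against a control group who did not use the intervention — gives a rough quantitative signal of whether the tool improves legal capability.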



Evaluations of the Intervention in the Field

This free training book from the World Bank presents strategies and tools to evaluate programs in the field.

In addition, this set of resources from the UK Government, called the ‘Magenta Book’ and connected slide-decks and appendices, goes through how to evaluate policies in practice.


[Image: User feedback machine from an airport]
