Categories
Current Projects Eviction Innovation

Data to Advance Access to Justice Efforts Around the Country

Lessons from the Eviction Diversion Initiative for Coordinated Data & Outcomes Measurement

Eviction diversion programs aim to prevent unnecessary displacement by connecting tenants and landlords with referrals, resources, and resolution options such as mediation. As these programs expand, evaluating their effectiveness is crucial for improving services, influencing policy, and ensuring they meet the needs of vulnerable communities.

At a recent working group session of the Access to Justice Network/Self-Represented Litigation Network on research for access to justice, Neil Steinkamp of Stout led a discussion on strategies for measuring program impact, refining data collection, and translating insights into policy action. Neil has led the evaluation & impact assessment of the Eviction Diversion Initiative (EDI). The National Center for State Courts is leading a cohort of state courts in the EDI to build, refine, and evaluate new eviction diversion efforts around the country and recently released an interim evaluation report on the EDI.

Below are key takeaways from the conversation about Neil’s learnings in building a multi-jurisdiction, standardized (but customizable) evaluation framework: one that gathers comparable data, collects it consistently, and makes it useful to stakeholders across many different jurisdictions.

Building a Framework for Evaluation

The primary goal of evaluating eviction diversion programs is often to understand who is being served, what their experiences are, and how well programs link them to resources. Instead of starting with rigid hypotheses, evaluators should approach this work with curiosity and open-ended questions:

  • What do we need to learn about the impact of eviction diversion?
  • What data is both useful and feasible to collect?
  • How can data collection evolve over time to stay relevant?

A flexible, iterative approach ensures that evaluation remains meaningful as conditions change. Some data points may become less useful, while new insights emerge that reshape how success is measured.

Balancing Consistency & Flexibility in Data Collection

Data consistency across jurisdictions is essential for comparisons, trend analysis, and deeper conversations on outcomes. However, differences in court structures, available resources, and local policies mean a one-size-fits-all approach won’t work.

A practical balance found in the EDI was 80% standardized questions for cross-jurisdictional alignment, with 20% tailored to local needs. This allows for:

  • Identifying national trends in eviction prevention.
  • Accounting for regional differences in housing, financial aid, and social service access.
  • Enabling courts and service providers to track their unique challenges and successes.

The key is avoiding overburdening staff or participants with excessive data collection while still capturing essential insights that can help demonstrate impact, identify opportunities for improvement, and assist in advocating for sustainable funding.
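
To make the 80/20 split concrete, here is a minimal sketch (in Python, with entirely hypothetical field names, not the EDI's actual instrument) of how a jurisdiction's intake record could pair a shared core schema with a small set of locally defined questions.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class CoreIntakeRecord:
    """Hypothetical core questions asked identically in every jurisdiction (~80%)."""
    case_id: str
    intake_date: str                                      # ISO date, e.g. "2024-03-15"
    rental_arrears: Optional[float] = None                # amount owed, if disclosed
    received_rental_assistance: Optional[bool] = None
    received_legal_aid: Optional[bool] = None
    case_resolved_without_eviction: Optional[bool] = None
    household_size: Optional[int] = None                  # demographics stay voluntary,
    zip_code: Optional[str] = None                        # so every field allows None

@dataclass
class JurisdictionIntakeRecord(CoreIntakeRecord):
    """Each court adds its own questions (~20%) without breaking the shared core."""
    local_fields: dict = field(default_factory=dict)

record = JurisdictionIntakeRecord(
    case_id="2024-000123",
    intake_date="2024-03-15",
    rental_arrears=850.00,
    received_rental_assistance=True,
    local_fields={"referred_to_utility_assistance": True},  # one court's custom question
)
```

Because every jurisdiction shares the core fields, records can be pooled for national trend analysis while the local fields still capture each court's particular questions.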

Tools & Methods: Making Data Collection Work for Courts

Many courts lack the flexibility to modify their existing case management systems. To address this, evaluators used separate, simple tools like Microsoft Forms and Microsoft Excel Online to streamline intake and reporting, reducing staff workload.

To make sense of the collected data, they developed visual dashboards that:

  • Simplify complex datasets.
  • Enable courts and policymakers to track progress in real time.
  • Highlight gaps in services or emerging trends in housing instability.

By using easy-to-implement tools, courts were able to enhance efficiency without major technology overhauls.
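
As an illustration of what such a dashboard feed might look like behind the scenes, the short sketch below (using pandas, with made-up column names and numbers) rolls exported intake rows up into the kinds of monthly metrics a court dashboard could chart.

```python
import pandas as pd

# Hypothetical export from a Microsoft Forms / Excel Online intake workbook.
df = pd.DataFrame([
    {"month": "2024-01", "arrears": 750,  "got_rental_assistance": True,  "eviction_avoided": True},
    {"month": "2024-01", "arrears": 2400, "got_rental_assistance": False, "eviction_avoided": False},
    {"month": "2024-02", "arrears": 900,  "got_rental_assistance": True,  "eviction_avoided": True},
])

# Monthly summary for the dashboard: caseload, typical arrears, and resolution rate.
summary = df.groupby("month").agg(
    cases=("arrears", "size"),
    median_arrears=("arrears", "median"),
    pct_eviction_avoided=("eviction_avoided", "mean"),
)
print(summary)
```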

Voluntary Data Collection & Trauma-Informed Approaches

All demographic and personal data collected in the evaluation was voluntary, and most participants were willing to share information. However, a trauma-informed approach recognizes that some individuals may feel uncomfortable disclosing details, especially in high-stress legal situations.

Evaluators emphasized the importance of creating a safe, respectful data collection process — one that builds trust and ensures that participation does not feel coercive or invasive.

Key Insights from the Evaluation for Eviction Prevention Policymaking

Data collected through the eviction diversion pilot programs revealed critical insights into eviction patterns and tenant needs:

  • Eviction is often preventable with modest financial support — Small rental arrears, often less than $1,000, are a key driver of many evictions.
  • Affordability crises go beyond rent payments — Many tenants face financial instability due to job loss, medical issues, or family obligations.
  • A mix of services is essential — Rental assistance alone is not always enough; tenants benefit significantly from legal aid, mediation, and access to other social services.

One major takeaway is that data should be treated as a living resource — each year, evaluators should reassess what they are tracking to ensure they are capturing the most relevant and impactful information.

From Data to Policy: The Power of Evidence-Driven Decisions

Coordinated eviction diversion data plays a powerful role in shaping policy and influencing resource allocation. Some key ways data has been used in practice include:

  • Informing state and local housing policies — Some jurisdictions used data insights to refine funding strategies for eviction prevention programs.
  • Stakeholder engagement — Some jurisdictions used data to inform effective dialogue among community stakeholders including the landlord community, tenant advocates, agencies in the continuum of care, and other local stakeholders.
  • Strengthening legal and mediation services — Data demonstrated that legal aid and mediation can be as crucial as rental assistance, leading to investments in expanded legal support.
  • Improving landlord-tenant relationships — Greater transparency about rental arrears and eviction patterns has helped courts and service providers create more effective intervention strategies.

While policymakers often focus on one intervention type, such as rental assistance, evaluators advocate for a holistic “AND” approach — combining legal support, financial aid, and mediation to achieve the best outcomes.

What’s Next? Refining the Future of Eviction Diversion

Looking ahead, the focus will be on:

  • Refining data collection practices — Enhancing the consistency and efficiency of data gathering tools.
  • Maintaining adaptability — Regularly reassessing what data matters most.
  • Encouraging a comprehensive approach to eviction prevention — Strengthening connections between legal and social support services.
  • Using data to inform and advocate for policy changes — Ensuring decision-makers understand that eviction diversion is more than just financial aid — it’s a system of interventions that must work together.

By grounding policy and program design in real-world data, eviction diversion efforts can be more effective, equitable, and responsive to community needs.

As this work continues, stakeholders across courts, legal aid, housing advocacy, and the landlord community must keep asking:

What do we need to learn next? And how can we use that knowledge to prevent unnecessary evictions?

Expanding the Impact: How Federated Data Collection Can Benefit Legal Services

Beyond eviction diversion programs, federated, standardized data collection has the potential to transform other areas of legal services, such as court help centers, brief advice clinics, and nonprofit legal aid organizations. These groups often work in silos, collecting data in ways that are inconsistent across jurisdictions, making it difficult to compare outcomes, identify systemic issues, or advocate for policy changes.

By adopting a shared framework for data collection — where core metrics are standardized but allow for local customization — legal service providers could gain richer insights into who they are serving, what legal issues are most pressing, and which interventions lead to the best outcomes. For example, help centers could track common questions and barriers to accessing services, while brief advice clinics could measure the impact of legal guidance on case outcomes.

This type of data-driven coordination could help funders, policymakers, and service providers make smarter investments, target resources more effectively, and ultimately improve access to justice at scale.

Join the Access to Justice Network to learn more about projects like this one. The Network is a community of justice professionals who share best practices & spread innovations across jurisdictions.

Watch a webinar in which Neil and his colleagues present more on evaluating the Eviction Diversion Initiative here.

Categories
AI + Access to Justice Current Projects

Measuring What Matters: A Quality Rubric for Legal AI Answers

by Margaret Hagan, Executive Director of the Legal Design Lab

As more people turn to AI for legal advice, a pressing issue emerges: How do we know whether AI-generated legal answers are actually helpful? While legal professionals and regulators may have instincts about good and bad answers, there has been no clear, standardized way to evaluate AI’s performance in this space — until now.

What makes a good answer on a chatbot, clinic, livechat, or LLM site?

My paper for the JURIX 2024 conference, Measuring What Matters: Developing Human-Centered Legal Q-and-A Quality Standards through Multi-Stakeholder Research, tackles this challenge head-on. Through a series of empirical studies, the paper develops a human-centered framework for evaluating AI-generated legal answers, ensuring that quality benchmarks align with what actually helps people facing legal problems. The findings provide valuable guidance for legal aid organizations, product developers, and policymakers who are shaping the future of AI-driven legal assistance.

Why Quality Standards for AI Legal Help Matter

When people receive a legal notice — like an eviction warning or a debt collection letter — they often turn to the internet for guidance. Platforms such as Reddit’s r/legaladvice, free legal aid websites, and now AI chatbots have become common sources of legal information. However, the reliability and usefulness of these answers vary widely.

AI’s increasing role in legal Q&A raises serious questions:

  • Are AI-generated answers accurate and actionable?
  • Do they actually help users solve legal problems?
  • Could they mislead people, causing harm rather than good?

My research addresses these concerns by involving multiple stakeholders — end users, legal experts, and technologists — to define what makes a legal answer “good.”

The paper reveals several surprising insights about what actually matters when evaluating AI’s performance in legal Q&A. Here are some key takeaways that challenge conventional assumptions:

1. Accuracy Alone Isn’t Enough — Actionability Matters More

One of the biggest surprises is that accuracy is necessary but not sufficient. While many evaluations of legal AI focus on whether an answer is legally correct, the study finds that what really helps people is whether the answer provides clear, actionable steps. A technically accurate response that doesn’t tell someone what to do next is not as valuable as a slightly less precise but highly actionable answer.

Example of an accurate answer that is not helpful to the user’s outcome:

  • AI says: “Your landlord is violating tenant laws in your state.” (Accurate but vague)
  • AI says: “You should file a response within a short time period, often 7 days (though this deadline may differ depending on your exact situation). Here’s a link to your county’s tenant protection forms and a local legal aid service.” (Actionable and useful)

2. Accurate Information Is Not Always Good for the User

The study highlights that some legal rights exist on paper but can be risky to exercise in practice — especially without proper guidance. For example, withholding rent is a legal remedy in many states if a landlord fails to make necessary repairs. However, in reality, exercising this right can backfire:

  • Many landlords retaliate by starting eviction proceedings.
  • The tenant may misapply the law, thinking they qualify when they don’t.
  • Even when legally justified, withholding rent can lead to court battles that tenants often lose if they don’t follow strict procedural steps.

This is a case where AI-generated legal advice could be technically accurate but still harmful if it doesn’t include risk disclosures. The study suggests that high-risk legal actions should always come with clear warnings about potential consequences. Instead of simply stating, “You have the right to withhold rent,” a high-quality AI response should add:

  • “Withholding rent is legally allowed in some cases, but it carries huge risks, including eviction. It’s very hard to withhold rent correctly. Reach out to this tenants’ rights organization before trying to do it on your own.”

This principle applies to other “paper rights” too — such as recording police interactions, filing complaints against employers, or disputing debts — where following the law technically might expose a person to serious retaliation or legal consequences.

Legal answers should not just state rights but also warn about practical risks — helping users make informed, strategic decisions rather than leading them into legal traps.

3. Legal Citations Aren’t That Valuable for Users

Legal experts often assume that providing citations to statutes and case law is crucial for credibility. However, both users and experts in the study ranked citations as a lower-priority feature. Most users don’t actually read or use legal citations — instead, they prefer practical, easy-to-understand guidance.

However, citations do help in one way: they allow users to verify information and use it as leverage in disputes (e.g., showing a landlord they know their rights). The best AI responses include citations sparingly and with context, rather than overwhelming users with legal references.

4. Overly Cautious Warnings Can Be Harmful

Many AI systems include disclaimers like “Consult a lawyer before taking any action.” While this seems responsible, the study found that excessive warnings can discourage people from acting at all.

Since most people seeking legal help online don’t have access to a lawyer, AI responses should avoid paralyzing users with fear and instead guide them toward steps they can take on their own — such as contacting free legal aid or filing paperwork themselves.

5. Misleading Answers Are More Dangerous Than Completely Wrong Ones

AI-generated legal answers that contain partial truths or misrepresentations are actually more dangerous than completely wrong ones. Users tend to trust AI responses by default, so if an answer sounds authoritative but gets key details wrong (like deadlines or filing procedures), it can lead to serious harm (e.g., missing a legal deadline).

The study found that the most harmful AI errors were related to procedural law — things like incorrect filing deadlines, court names, or legal steps. Even small errors in these areas can cause major problems for users.

6. The Best AI Answers Function Like a “Legal GPS”

Rather than replacing lawyers, users want AI to act like a smart navigation system — helping them spot legal issues, identify paths forward, and get to the right help. The most helpful answers do this by:

  • Quickly diagnosing the problem (understanding what the user is asking about).
  • Giving step-by-step guidance (telling the user exactly what to do next).
  • Providing links to relevant forms and local services (so users can act on the advice).

Instead of just stating the law, AI should orient users, give them confidence, and point them toward useful actions — even if that means simplifying some details to keep them engaged.

AI’s Role in Legal Help Is About Empowerment, Not Just Information

The research challenges the idea that AI legal help should be measured only by how well it mimics a lawyer’s expertise. Instead, the most effective AI legal Q&A focuses on empowering users with clear, actionable, and localized guidance — helping them take meaningful steps rather than just providing abstract legal knowledge.

Key Takeaways for Legal Aid, AI Developers, and Policymakers

The paper’s findings offer important lessons for different stakeholders in the legal AI ecosystem.

1. Legal Aid Organizations: Ensuring AI Helps, Not Hurts

Legal aid groups may increasingly rely on AI to extend their reach, but they must be cautious about its limitations. The research highlights that users want AI tools that:

  • Provide clear, step-by-step guidance on what to do next.
  • Offer jurisdiction-specific advice rather than generic legal principles.
  • Refer users to real-world resources, such as legal aid offices or court forms.
  • Are easy to read and understand, avoiding legal jargon.

Legal aid groups should ensure that the AI tools they deploy adhere to these quality benchmarks. Otherwise, users may receive vague, confusing, or even misleading responses that could worsen their legal situations.

2. AI Product Developers: Building Legal AI Responsibly & Knowing Justice Use Cases

AI developers must recognize that accuracy alone is not enough. The paper identifies four key criteria for evaluating the quality of AI legal answers:

  1. Accuracy — Does the answer provide correct legal information? And when legal information is accurate but high-risk, does it tell people about rights and options with sufficient context?
  2. Actionability — Does it offer concrete steps that the user can take?
  3. Empowerment — Does it help users feel capable of handling their problem?
  4. Strategic Caution — Does it avoid causing unnecessary fear or discouraging action?

One surprising insight is that legal citations — often seen as a hallmark of credibility — are not as critical as actionability. Users care less about legal precedents and more about what they can do next. Developers should focus on designing AI responses that prioritize usability over technical legal accuracy alone.

3. Policymakers: Regulating AI for Consumer Protection & Outcomes

For regulators, the study underscores the need for clear, enforceable quality standards for AI-generated legal guidance. Without such standards, AI-generated legal help may range from extremely useful to dangerously misleading.

Key regulatory considerations include:

  • Transparency: AI platforms should disclose how they generate answers and whether they have been reviewed by legal experts.
  • Accuracy Audits: Regulators should develop auditing protocols to ensure AI legal help is not systematically providing incorrect or harmful advice.
  • Consumer Protections: Policies should prevent AI tools from deterring users from seeking legal aid when needed.

Policymakers ideally will be in conversation with frontline practitioners, product/model developers, and community members to understand what is important to measure, how to measure it, and how to increase the quality and safety of performance. Evaluation based on concepts like Unauthorized Practice of Law does not necessarily correspond to consumers’ outcomes, needs, and priorities. Rather, figuring out what is beneficial to consumers should be based on what matters to the community and frontline providers.

The Research Approach: A Human-Centered Framework

How did we identify these insights and standards? The study used a three-part research process to hear from community members, frontline legal help providers, and access to justice experts. (Thanks to the Legal Design Lab team for helping me with interviews and study mechanics!)

  1. User Interviews: 46 community members tested AI legal help tools and shared feedback on their usefulness and trustworthiness.
  2. Expert Evaluations: 21 legal professionals ranked the importance of various quality criteria for AI-generated legal answers.
  3. AI Response Ratings: Legal experts assessed real AI-generated answers to legal questions, identifying common pitfalls and best practices.

This participatory, multi-stakeholder approach ensures that quality metrics reflect the real-world needs of legal aid seekers, not just theoretical legal standards.

The Legal Q-and-A Quality Rubric

What’s Next? Implementing the Quality Rubric

The research concludes with a proposed Quality Rubric that can serve as a blueprint for AI developers, researchers, and regulators. This rubric provides a scoring system that evaluates legal AI answers based on their strengths and weaknesses across key quality dimensions.

Potential next steps include:

  • Regular AI audits using the Quality Rubric to track performance.
  • Collaboration between legal aid groups and AI developers to refine AI-generated responses.
  • Policy frameworks that hold AI platforms accountable for misleading or harmful legal information.

Others might be developing internal quality reviews of the RAG bots and AI systems on their websites and tools. They can use the rubric above as they conduct safety and quality checks, or as they train human labelers or automated AI judges to conduct these checks.
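
As one possible starting point for that kind of internal review, here is a minimal sketch of a rubric scorecard in Python. The dimension names follow the four criteria described above; the 1-to-5 scale, weights, and review threshold are illustrative assumptions, not values from the paper.

```python
from dataclasses import dataclass

# The four quality dimensions discussed above; scale and weights are illustrative.
RUBRIC_DIMENSIONS = ["accuracy", "actionability", "empowerment", "strategic_caution"]

@dataclass
class RubricScore:
    answer_id: str
    scores: dict  # dimension -> 1-5 rating from a human labeler or an automated judge

    def weighted_total(self, weights=None) -> float:
        weights = weights or {d: 1.0 for d in RUBRIC_DIMENSIONS}
        return sum(self.scores[d] * weights[d] for d in RUBRIC_DIMENSIONS) / sum(weights.values())

    def flags(self) -> list:
        # Flag any dimension at or below a (hypothetical) review threshold of 2.
        return [d for d in RUBRIC_DIMENSIONS if self.scores[d] <= 2]

review = RubricScore(
    answer_id="eviction-faq-042",
    scores={"accuracy": 4, "actionability": 5, "empowerment": 4, "strategic_caution": 2},
)
print(review.weighted_total())  # 3.75
print(review.flags())           # ['strategic_caution'] -> route to human review
```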

Conclusion: Measuring AI for Better Access to Justice

AI holds great promise for expanding access to legal help, but it must be measured and managed effectively. My research provides a concrete roadmap for ensuring that AI legal assistance is not just technically impressive but genuinely useful to people in need.

For legal aid organizations, the priority should be integrating AI tools that align with the study’s quality criteria. For AI developers, the challenge is to design products that go beyond accuracy and focus on usability, actionability, and strategic guidance. And for policymakers, the responsibility lies in crafting regulations that ensure AI-driven legal help does more good than harm.

As AI continues to transform how people access legal information, establishing clear, human-centered quality standards will be essential in shaping a fair and effective legal tech landscape.

Need for More Benchmarks of More Legal Tasks

In addition to this current focus on Legal Q-and-A, the justice community also needs to create similar evaluation standards and protocols for other tasks. Beyond answering brief legal questions, there are many other tasks whose quality matters to people’s outcomes, rights, and justice. This is the first part of a much bigger effort to make justice interventions measurable and meaningful.

This focus on delineated tasks, each with its own quality measures, will be essential for building quality products and models that serve the public, and for unlocking greater scale and support for innovation.

Categories
AI + Access to Justice Current Projects

AI, Machine Translation, and Access to Justice

Lessons from Cristina Llop’s Work on Language Access in the Legal System

Artificial intelligence (AI) and machine translation (MT) are often seen as tools with the potential to expand access to justice, especially for non-English speakers in the U.S. legal system. However, while AI-driven translation tools like Google Translate and AutoML offer impressive accuracy in general contexts, their effectiveness in legal settings remains questionable.

At the Stanford Legal Design Lab’s AI and Access to Justice research webinar on February 7, 2025, legal expert Cristina Llop shared her observations from reviewing live translations between legal providers’ staff and users. Her findings highlight both the potential and pitfalls of using AI for language access in legal settings. This article explores how AI performs in practice, where it can be useful, and why human oversight, national standards, and improved training datasets are critical.

How Machine Translation Performs in Legal Contexts

Many courts and legal service providers have turned to AI-powered Neural Machine Translation (NMT) models like Google Translate to help bridge language barriers. While AI is improving, Llop’s research suggests that accuracy in general language translation does not necessarily translate to legal language accuracy.

1. The Good: AI Can Be Useful in Certain Scenarios

Machine translation tools can provide immediate, cost-effective assistance in specific legal language tasks, such as:

  • Translating declarations and witness statements
  • Converting court forms and pleadings into different languages
  • Making legal guides and court websites more accessible
  • Supporting real-time interpretation in court help centers and clerk offices

This can be especially valuable in resource-strapped courts and legal aid groups that lack human interpreters for every case. However, Llop cautions that even when AI-generated translations sound fluent, they may not be legally precise or safe to rely on.

AI doesn’t pick up on legal context, and it mistranslates key information about trials, filings, courts, and options.

2. The Bad: Accuracy Breaks Down in Legal Contexts

Llop identified systematic mistranslations that could have serious consequences:

Common legal terms are mistranslated due to a lack of specialized training data. For example, “warrant” is often translated as “court order,” which downplays the severity of a legal document.

Contextual misunderstandings lead to serious errors:

  • “Due date” was mistranslated as “date to give birth.”
  • “Trial” was often translated as “test.”
  • “Charged with a battery case” turned into “loaded with a case of batteries.”

Pronoun confusion creates ambiguity:

  • Spanish’s use of “su” (your/his/her/their) is often mistranslated in legal documents, leading to uncertainty about property ownership, responsibility, or court filings.
  • In restraining order cases, it was unclear who was accusing whom, which could put victims at risk.

AI can introduce gender biases:

  • Words with no inherent gender (e.g., “politician”) are often translated as male.
  • The Spanish “me maltrata” can be translated as either “she mistreats me” or “he mistreats me,” since the gender is not specified. The machine would default to translating “me maltrata” as “he mistreats me,” potentially distorting evidence in domestic violence cases.

Without human review, these AI-driven errors can go unnoticed, leading to severe legal consequences.

The Dangers of Mistranslation in Legal Interactions

One of the most troubling findings from Llop’s work was the invisible breakdowns in communication between legal providers and non-English speakers.

1. Parallel Conversations Instead of Communication

In many cases, both parties believed they were exchanging information, but in reality:

  • Legal providers were missing key facts from litigants.
  • Users did not realize that their information was misunderstood or misrepresented.
  • Critical details — such as the nature of an abuse claim or financial disclosures — were being lost.

This failure to communicate accurately could result in:

  • People choosing the wrong legal recourse and misunderstanding what options are available to them.
  • Legal provider staff making decisions based on incomplete or distorted information, providing services and option menus based on misunderstandings about the person’s scenario or preferences.
  • Access to justice being compromised for vulnerable litigants.

2. Why a Glossary Isn’t Enough

Some legal institutions have tried to mitigate errors by adding legal glossaries to machine translation tools. However, Llop’s research found that glossary-based corrections do not always solve the problem:

  • Example 1: The word “address” was provided to the AI to ensure translation to “mailing address” (instead of “home address”) in one context — but then mistakenly applied when a clerk asked, “What issue do you want to address?”
  • Example 2: “Will” (as in a legal document) was mistranslated when applied to the auxiliary verb “will” in regular interactions (“I will send you this form”).
  • Example 3: A glossary fix for “due date” worked.
  • Example 4: A glossary fix for “pleading” worked but failed to adjust grammatical structure or pronoun usage.

These patchwork fixes are not enough. More comprehensive training, oversight, and quality control are needed.
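
The failure mode is easy to reproduce. The deliberately naive sketch below (with hypothetical glossary entries, not any vendor's actual feature) shows how a context-blind glossary substitution produces exactly the “address” and “will” errors described above.

```python
# Hypothetical glossary mapping English terms to preferred Spanish legal translations.
GLOSSARY = {
    "address": "dirección postal",   # intended only for "mailing address"
    "will": "testamento",            # intended only for the legal document
}

def naive_glossary_translate(sentence: str) -> str:
    """Context-blind substitution: replaces glossary terms wherever they appear."""
    out = sentence
    for english, spanish in GLOSSARY.items():
        out = out.replace(english, spanish)
    return out

# The glossary entry fires even when the word is being used in a different sense.
print(naive_glossary_translate("What issue do you want to address?"))
# -> "What issue do you want to dirección postal?"   ('address' here is a verb)
print(naive_glossary_translate("I will send you this form."))
# -> "I testamento send you this form."              ('will' here is an auxiliary verb)
```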

Advancing Legal Language AI: AutoML and Human Review

One promising improvement is AutoML, which allows legal organizations to train machine translation models with their own specialized legal data.

AutoML: A Step Forward, But Still Flawed

Llop’s team worked on an AutoML project by:

  1. Collecting 8,000+ legal translation pairs from official legal sources that had been translated by experts.
  2. Correcting AI-generated translations manually.
  3. Feeding improved translations back into the model.
  4. Iterating until translations were more accurate.

Results showed that AutoML improved translation quality, but major issues remained:

  • AI struggled with conversational context. If a prior sentence referenced “my wife,” but the next message about the wife didn’t specify gender, AI might mistakenly switch the pronoun to “he”.
  • AI overfit to common legal phrases, inserting “petition” even when the correct translation should have been “form.”

These challenges highlight why human review is essential.
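
To make the workflow above easier to picture, here is a toy sketch of the correct-and-retrain loop. The three helper functions are stand-ins for the real translation engine, expert reviewers, and retraining step, not actual AutoML API calls.

```python
# Toy stand-ins: a dictionary plays the role of the "model."
def machine_translate(model: dict, source: str) -> str:
    return model.get(source, f"[machine draft of: {source}]")

def expert_correction(source: str, draft: str) -> str:
    # Stand-in for a bilingual legal expert reviewing and fixing the draft.
    expert_memory = {"due date": "fecha límite", "pleading": "escrito judicial"}
    return expert_memory.get(source, draft)

def retrain(model: dict, corrected_pairs: list) -> dict:
    model.update(dict(corrected_pairs))         # absorb the corrected pairs
    return model

def improvement_round(model: dict, sources: list) -> dict:
    corrected = []
    for src in sources:
        draft = machine_translate(model, src)
        fixed = expert_correction(src, draft)   # human review stays in the loop
        corrected.append((src, fixed))
    return retrain(model, corrected)            # feed improvements back into the model

model = improvement_round({}, ["due date", "pleading"])
print(model)  # {'due date': 'fecha límite', 'pleading': 'escrito judicial'}
```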

Real-Time Machine Translation

While text-based AI translation can be refined over time, real-time translation — such as voice-to-text systems in legal offices — presents even greater challenges.

Voice-to-Text Lacks Punctuation Awareness

People do not dictate punctuation, but pauses and commas can change legal meaning. For example:

  • “I’m guilty” vs. “I’m not guilty” (missing comma error).
  • Minor misspellings or poor grammar can dramatically change a translation.

AI Struggles with Speech Patterns

Legal system users come from diverse linguistic backgrounds, making real-time translation even more difficult. AI performs poorly when users:

  • Speak quickly or use filler words (“um,” “huh,” “oh”).
  • Have soft speech or heavy accents.
  • Use sentence structures influenced by indigenous or regional dialects.

These issues make it clear that AI faces major challenges in performing accurately in high-stakes, real-time legal interactions.

The Need for National Standards and Training Datasets

Llop’s research underscores a critical gap: there are no national standards, training datasets, or quality benchmark datasets for legal translation AI.

A National Legal Translation Project

Llop saw an opportunity for improvement if there were to be:

  • A centralized effort to collect high-quality legal translation pairs.
  • State-specific localization of legal terms.
  • Guidelines for AI usage in courts, legal aid orgs, and other institutions.

Such a standardized dataset could train AI more effectively while ensuring legal accuracy.

Training for English-Only Speakers

English-speaking legal provider staff need training on how to structure their speech for better AI translation:

  • Using plain language and short sentences.
  • Avoiding vague pronouns (“his, her, their”).
  • Confirming meaning before finalizing translations.

AI, Human Oversight, and National Infrastructure in Legal Translation

Machine translation and AI can be useful, but they are far from perfect. Without human review, legal expertise, and national standards, AI-generated translations could compromise access to justice.

Llop’s work highlights the urgent need for:

  1. Human-in-the-loop AI translation.
  2. Better training data tailored for legal contexts.
  3. National standards for AI language access.

As AI continues to evolve, it must be designed with legal precision and human oversight — because in law, a mistranslation can change lives.

Get in touch with Cristina Llop to learn more about her work & vision for better language access: https://www.linkedin.com/in/cristina-llop-75749915/

Thanks to her for a terrific, detailed presentation at the AI+A2J Research series. Sign up to come to future Zoom webinars in our series. Find out more about the Stanford Legal Design Lab’s work on AI & Access to Justice here.

Categories
AI + Access to Justice Current Projects

Jurix ’24 AI + A2J Schedule

On December 11, 2024, in Brno, Czechia & online, we held our second annual AI for Access to Justice Workshop at the JURIX Conference.

The academic workshop is organized by Quinten Steenhuis, Suffolk University Law School/LIT Lab, Margaret Hagan, Stanford Law School/ Legal Design Lab, and Hannes Westermann, Maastricht University Faculty of Law.

In Autumn 2024, there was a very competitive application process, and 22 papers and 5 demos were selected.

The following presentations all come with a 10-page research paper, or a shorter paper for the demos. The accepted paper drafts are available at this Google Drive folder.

Thank you to all of the contributors and participants in the workshop!

Session 1: AI for A2J Planning – Risks, Limits, Strategies

  • LLMs & Legal Aid: Understanding Legal Needs Exhibited Through User Queries, Michal Kuk and Jakub Harašta
  • Spreading the Risk of Scalable Legal Services: The Role of Insurance in Expanding Access to Justice, David Chriki, Harel Omer and Roee Amir
  • Exploring the potential and limitations of AI to enhance children’s access to justice, Boglárka Jánoskúti Dr. and Dóra Kiss Dr.
  • Health Insurance Coverage Rules Interpretation Corpus: Law, Policy, and Medical Guidance for Health Insurance Coverage Understanding, Mike Gartner

Session 2: AI for Legal Aid Services – Part A

  • Utilizing Large Language Models for Legal Aid Triage, Amit Haim and Christoph Engel
  • Measuring What Matters: Developing Human-Centered Legal Q-and-A Quality Standards through Multi-Stakeholder Research, Margaret Hagan
  • Demo: Digital Transformation in Child and Youth Welfare: A Concept for Implementing a Web-based Counseling Assistant, Florian Gerlach

Session 3: AI for Legal Aid Services – Part B

  • Demo: Green Advice: Using RAG for Actionable Legal Information, Repairosaurus Rex, Nicholas Burka, Ali Cook, Sam Flynn, Sateesh Nori
  • Demo: Inclusive AI design for justice in low-literacy environments, Avanti Durani and Shivani Sathe
  • Managing Administrative Law Cases using an Adaptable Model-driven Norm-enforcing Tool, Marten Steketee, Nina Verheijen and L. Thomas van Binsbergen
  • A Legal Advisor Bot Towards Access to Justice, Adam Kaczmarczyk, Tomer Libal and Aleksander Smywiński-Pohl
  • Electrified Apprenticeship: An AI Learning Platform for Law Clinics and Beyond, Brian Rhindress and Matt Samach

Session 4: NLP for access to justice

  • Demo: LIA: An AI-Powered Legal Information Assistant to Close the Access to Justice Gap, Scheree Gilchrist and Helen Hobson
  • Using Chat-GPT to Extract Principles of Law for the Sake of Prediction: an Exploration conducted on Italian Judgments concerning LGBT(QIA+) Rights, Marianna Molinari, Marinella Quaranta, Ilaria Angela Amantea and Guido Governatori
  • Legal Education and Knowledge Accessibility by Legal LLM, Sieh-Chuen Huang, Wei-Hsin Wang, Chih-Chuan Fan and Hsuan-Lei Shao
  • Evaluating Generative Language Models with Argument Attack Chains, Cor Steging, Silja Renooij and Bart Verheij

Session 5: Data quality, narratives, and safety issues

  • Potential Risks of Using Justice Tech within the Colombian Judicial System in a Rural Landscape, Maria Gamboa
  • Decoding the Docket: Machine Learning Approaches to Party Name Standardization, Logan Pratico
  • Demo: CLEO’s narrative generator prototype: Using GenAI to help unrepresented litigants tell their stories, Erik Bornmann
  • Analyzing Images of Legal Documents: Toward Multi-Modal LLMs for Access to Justice, Hannes Westermann and Jaromir Savelka

Categories
AI + Access to Justice Class Blog Current Projects

Class Presentations for AI for Legal Help

Last week, the 5 student teams in Autumn Quarter’s AI for Legal Help class made their final presentations on whether and how generative AI could assist legal aid organizations, courts, and bar associations in providing legal help to the public.

The class’s 5 student groups have been working over the 9-week quarter with partners including the American Bar Association, Legal Aid Society of San Bernardino, Neighborhood Legal Services of LA, and LA Superior Court Help Center. The partners came to the class with some ideas, and the student teams worked with them to scope & prototype new AI agents to do legal tasks, including:

  • Demand letters for reasonable accommodations
  • Motions to set aside to stop an impending eviction/forcible set-out
  • Triaging court litigants to direct them to appropriate services
  • Analyzing eviction litigants’ case details to spot defenses
  • Improving lawyers’ responses to online brief advice clinic users’ questions

The AI agents are still in early stages. We’ll be continuing refinement, testing, and pilot-planning next quarter.

Categories
AI + Access to Justice Current Projects

AI + Access to Justice Summit 2024

On October 17 and 18, 2024, Stanford Legal Design Lab hosted the first-ever AI and Access to Justice Summit.

The Summit’s primary goal was to build strong relationships and a national, coordinated roadmap of how AI can responsibly be deployed and held accountable to close the justice gap.

AI + A2J Summit at Stanford Law School

Who was at the Summit?

Two law firm sponsors, K&L Gates and DLA Piper, supported the Summit through travel scholarships, program costs, and strategic guidance.

The main group of invitees were frontline legal help providers at legal aid groups, law help website teams, and the courts. We know they are key players in deciding what kinds of AI should and could be impactful for closing the justice gap. They’ll also be key partners in developing, piloting, and evaluating new AI solutions.

Key supporters and regional leaders from bar foundations, philanthropies, and pro bono groups were also invited. Their knowledge about funding, scaling, past initiatives, and spreading projects from one organization and region to others was key to the Summit.

Technology developers also came, both from big technology companies like Google and Microsoft and legal technology companies like Josef, Thomson Reuters, Briefpoint, and Paladin. Some of these groups already have AI tools for legal services, but not all of them have focused in on access to justice use cases.

In addition, we invited researchers who are developing responsible, privacy-forward, and efficient strategies for building specialized AI solutions that could help people in the justice sphere, and who can draw lessons from how AI is being deployed in parallel fields like medicine and mental health.

Finally, we had participants who work in regulation and policy-making at state bars, to talk about policy, ethics, and balancing innovation with consumer protection. The ‘rules of the road’ about what kinds of AI can be built and deployed, and what standards they need to follow, are essential for clarity and predictability among developers.

What Happened at the Summit?

The Summit was a 2-day event, split intentionally into 5 sections:

  • Hands-On AI Training: Examples and Research to upskill legal professionals. There were demos, explainers, and strategies about what AI solutions are already in use or possible for legal services. Big tech, legal tech, and computer science researchers presented participants with a hands-on, practical, detailed tour of AI tools, examples, and protocols that can be useful in developing new solutions to close the justice gap.
  • Big Vision: Margaret Hagan and Richard Susskind opened up the 2nd day with a challenge: where does the access to justice community want to be in 2030 when it comes to AI and the justice gap? How can individual organizations collaborate, build common infrastructure, and learn from each other to reach our big-picture goals?
  • AI+A2J as of 2024: In the morning of the second day, two panels presented on what is already happening in AI and Access to Justice — including an inventory of current pilots, demos of some early legal aid chatbots, regulators’ guidelines, and innovation sandboxes. This helps the whole group understand the early-stage developments and policies.
  • Design & Development of New Initiatives. In the afternoon of the second day, we led breakout design workshops on specific use cases: housing law, immigration law, legal aid intake, and document preparation. The diverse stakeholders worked together using our AI Legal Design workbook to scope out a proposal for a new solution — whether that might mean building new technology or adapting off-the-shelf tech to the needs.
  • Support & Collaboration. In the final session, we heard from a panel who could talk through support: financial support, pro bono partnership support, technology company licensing and architecture support, and other ways to build more new interdisciplinary relationships that could unlock the talent, strategy, momentum, and finances necessary to make AI innovation happen. We also discussed support around evaluation so that there could be more data and more feeling of safety in deploying these new tools.

Takeaways from the Summit

The Summit built strong relationships & common understanding among technologists, providers, researchers, and supporters. Our hope is that we can run the Summit annually, to track year-to-year progress in tackling the justice gap with AI and to watch these relationships, collaborations, and their impact develop and scale.

In addition, some key points emerged from the training, panels, workshops, and down-time discussions.

Common Infrastructure for AI Development

Though many AI pilots are going to have to be local to a specific organization in a specific region, the national (or international) justice community can be working on common resources that can serve as infrastructure to support AI for justice.

  • Common AI Trainings: Regional leaders, who are newly being hired by state bars and bar foundations to train and explore how AI can fit with legal services, should be working together to develop common training, common resources, and common best practices.
  • Project Repository: National organizations and networks should be thinking about a common repository of projects. This inventory could track what tech provider is being used, what benchmark is being used for evaluation, what AI model is being deployed, what data it was fine-tuned on, and if and how others could replicate it.
  • Rules of the Road Trainings. National organizations and local regulators could give more guidance to leadership like legal aid executive directors about what is allowed or not allowed, what is risky or safe, or other clarification that can help more leadership be brave and knowledgeable about how to deploy AI responsibly. When is an AI project sufficiently tested to be released to the public? How should the team be maintaining and tracking an AI project, to ensure it’s mitigating risk sufficiently?
  • Public Education. Technology companies, regulators, and frontline providers need to be talking more about how to make sure that the AI that is already out there (like ChatGPT, Gemini, and Claude) is reliable, has enough guardrails, and is consumer-safe. More research needs to be done on how to encourage strategic caution among the public, so they can use the AI safely and avoid user mistakes with it (like overreliance or misunderstanding).
  • Regulators<->Frontline Providers. More frontline legal help providers need to be in conversation with regulators (like bar associations, attorneys general, or other state/federal agencies) to talk about their perspective on if and how AI can be useful in closing the justice gap. Their perspective on risks, consumer harms, opportunities, and needs from regulators can ensure that rules are being set to maximize positive impact and minimize consumer harm & technology chilling.
  • Bar Foundation Collaboration. Statewide funders (especially bar foundations) can be talking to each other about their funding, scaling, and AI strategies. Well-resourced bar foundations can share how they are distributing money, what kinds of projects they’re incentivizing, how they are holding the projects accountable, and what local resources or protocols they could share with others.

AI for Justice Should be Going Upstream & Going Big

Richard Susskind charged the group with thinking big about AI for justice. His charges & insights inspired many of the participants throughout the Summit, particularly on two points.

Going Big. Susskind called on legal leaders and technologists not to do piecemeal AI innovation (which might well be the default pathway). Rather, he called on them to work in coordination across the country (if not the globe). The focus should be on reimagining how to use AI as a way to make a fundamental, beneficial shift in justice services. This means not just doing small optimizations or tweaks, but shifting the system to work better for users and providers.

Susskind charged us with thinking beyond augmentation, toward new models of serving the public with their justice needs.

Going Upstream. He also charged us with going upstream, figuring out more early ways to spot and get help to people. This means not just adding AI into the current legal aid or court workflow — but developing new service offerings, data links, or community partnerships. Can we prevent more legal problems by using AI before a small problem spirals into a court case or large conflict?

After Susskind’s remarks, I focused in on coordination among legal actors across the country for AI development. Compared to the last 20 years of legal technology development, are there ways to be more coordinated, and also more focused on impact and accountability?

There might be strategic leaders in different regions of the US and in different issue areas (housing, immigration, debt, family, etc.) that are spreading:

  • best practices,
  • evaluation protocols and benchmarks,
  • licensing arrangements with technology companies,
  • bridges with the technology companies, and
  • conversations with the regulators.

How can the Access to Justice community be more organized so that its voice can be heard as

  • the rules of the road are being defined?
  • technology companies are building and releasing models that the public is going to be using?
  • technology vendors decide if and how they are going to enter this market, and what their pricing and licensing are going to look like?

Ideally, legal aid groups, courts, and bars will be collaborating together to build AI models, agents, and evaluations that can get a significant number of people the legal help they need to resolve their problems — and to ensure that the general, popular AI tools are doing a good job at helping people with their legal problems.

Privacy Engineering & Confidentiality Concerns

One of the main barriers to AI R&D for justice is confidentiality. Legal aid and other help providers have a duty to keep their clients’ data confidential, which restricts their ability to use past data to train models or to use current data to execute tasks through AI. In practice, many legal leaders are nervous about any new technology that requires client data: will it lead to data leaks, client harms, regulatory actions, bad press, or other concerning outcomes?

Our technology developers and researchers had cutting-edge proposals for privacy-forward AI development that could address some of these concerns around confidentiality. Though these privacy engineering strategies are foreign to many lawyers, the technologists broke them down into step-by-step explanations with examples, to help more legal professionals think about data protection in a systematic, engineering way.

Synthetic Data. One of the privacy-forward strategies discussed was synthetic data. With this solution, a developer doesn’t use real, confidential data to train a system. Rather, they create a parallel but fictional set of data — like a doppelganger to the original client data. It’s structurally similar to confidential client data, but it contains no real people’s information. Synthetic data is a common strategy in healthcare technology, where there is a similar emphasis on patient confidentiality.

Neel Guha explained to the participants how synthetic data works, and how they might build a synthetic dataset that is free of identifiable data and does not violate ethical duties to confidentiality. He emphasized that the more legal aid and court groups can develop datasets that are share-able to researchers and the public, the more that researchers and technologists will be attracted to working on justice-tech challenges. More synthetic datasets will both be ethically safe & beneficial to collaboration, scaling, and innovation.
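
As a rough illustration of the concept (not Neel Guha's actual method), the sketch below uses only Python's standard library to generate records that mirror the structure of a client intake table while containing no real person's information. Production approaches typically fit the synthetic distributions to the statistics of the real data; the ranges here are invented.

```python
import random
import string

random.seed(0)  # reproducible fake data

ISSUE_TYPES = ["eviction", "debt collection", "custody", "benefits denial"]

def fake_case_id() -> str:
    return "SYN-" + "".join(random.choices(string.digits, k=6))  # no real docket numbers

def synthetic_intake_record() -> dict:
    """One fictional record that mirrors the *shape* of real intake data."""
    return {
        "case_id": fake_case_id(),
        "issue_type": random.choice(ISSUE_TYPES),
        "monthly_income": round(random.uniform(800, 4500), 2),
        "rental_arrears": round(random.uniform(0, 5000), 2),
        "has_counsel": random.random() < 0.2,   # invented prevalence, not measured
    }

synthetic_dataset = [synthetic_intake_record() for _ in range(1000)]
print(synthetic_dataset[0])
```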

Federated Model Training. Another privacy/confidentiality strategy was Federated Model Training. The Google DeepMind team presented on this strategy, taking examples from the health system.

Multiple hospitals all wanted to work on the same project: training an AI model to better spot tuberculosis or other issues on lung X-rays. Each hospital wanted to train the AI model on its existing X-ray data, but it did not want to let this confidential data leave its servers and go to a centralized server. Sharing the data would break their confidentiality requirements.

So instead, the hospitals decided to go with a federated model training protocol. Here, an initial version of the AI model was taken from the centralized server and put on each hospital’s local servers. Each local copy of the AI model trained on that hospital’s X-ray data. The locally trained models, not the data, were then sent back to the centralized server, which accumulated all of the learnings to make a smarter model in the center. The local hospital data was never shared.

In this way, legal aid groups or courts could explore building a centralized model while still keeping each of their confidential data sources on their own private, secure servers. Individual case data and confidential data stay local on the local servers, while the smart collective model lives in a centralized place and gradually improves. This technique can also work for training the model over time, so that the model continues to get smarter as the information and data continue to grow.
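
For readers who want to see the mechanics, here is a toy sketch of the federated averaging idea using NumPy. Each "site" (a hospital in the example above, or a legal aid office) updates a copy of the shared model on its own local data, and only the model weights, never the raw data, travel back to be averaged centrally. This is a simplification for illustration, not the DeepMind implementation discussed at the Summit.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_update(global_weights, X, y, lr=0.1, epochs=5):
    """One site's training pass on its own data; only the weights leave the site."""
    w = global_weights.copy()
    for _ in range(epochs):
        preds = X @ w
        grad = X.T @ (preds - y) / len(y)   # gradient for a simple linear model
        w -= lr * grad
    return w

# Three "sites," each holding private local data that never leaves its server.
true_w = np.array([2.0, -1.0])
sites = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    y = X @ true_w + rng.normal(scale=0.1, size=50)
    sites.append((X, y))

global_w = np.zeros(2)
for round_num in range(10):
    local_ws = [local_update(global_w, X, y) for X, y in sites]  # local training
    global_w = np.mean(local_ws, axis=0)                          # federated averaging

print(global_w)  # approaches [2, -1] without any raw data leaving a site
```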

Towards the Next Year of AI for Access to Justice

The Legal Design Lab team thanks all of our participants and sponsors for a tremendous event. We learned so much and built new relationships that we look forward to deepening with more collaborations & projects.

We were excited to hear frontline providers walk away with new ideas, concrete plans for how to borrow from others’ AI pilots, and an understanding of what might be feasible. We were also excited to see new pro bono and funding relationships develop that can unlock more resources in this space.

Stay tuned as we continue our work on AI R&D, evaluation, and community-building in the access to justice community. We look forward to working towards closing the justice gap, through technology and otherwise!

Categories
AI + Access to Justice Current Projects

Roadmap for AI and Access to Justice

Our Lab is continuing to host meetings & participate in others to scope out what kinds of work need to happen to make AI work for access to justice.

We will be making a comprehensive roadmap of tasks and goals.

Here is our initial draft, which divides the roadmap between Cross-Issue Tasks (that apply across specific legal problem/policy areas) and Issue-Specific Tasks (where we are still digging into specifics).

Each of these tasks might become its own branch of AI agents & evaluation.

Stay tuned for further refinement and testing of this roadmap!

Categories
AI + Access to Justice Current Projects

Share Your AI + Justice Idea

Our team at Legal Design Lab is building a national network of people working on AI projects to close the justice gap, through better legal services & information.

We’re looking to find more people working on innovative new ideas & pilots. Please share yours with us using the form below.

The idea could be for:

  • A new AI tool or agent, to help you do a specific legal task
  • A new or finetuned AI model for use in the legal domain
  • A benchmark or evaluation protocol to measure the performance of AI
  • A policy or regulation strategy to protect people from AI harms and encourage responsible innovation
  • A collaboration or network initiative to build a stronger ecosystem of people working on AI & justice
  • Another idea you have to improve the development, performance & consumer safety of AI in legal services.

Please be in touch!

Categories
AI + Access to Justice Current Projects

Summit schedule for AI + Access to Justice

This October, Stanford Legal Design Lab hosted the first AI + Access to Justice Summit. This invite-only event focused on building a national ecosystem of innovators, regulators, and supporters to guide AI innovation toward closing the justice gap, while also protecting the public.

The Summit’s flow aimed to teach frontline providers, regulators, and philanthropists about current projects, tools, and protocols to develop impactful justice AI. We did this with hands-on trainings on AI tools, platforms, and privacy/efficiency strategies. We layered on tours of what’s happening with legal aid and court help pilots, and what regulators and foundations are seeing with AI activity by lawyers and the public.

We then moved from review and learning to creative work. We workshopped how to launch new individual model & agent pilots, while weaving a coordinated network with shared infrastructure, models, benchmarks, and protocols. We closed the day with a discussion about support: how to mobilize the financial resources, interdisciplinary relationships, and affordable access to technology.

Our goal was to launch a coordinated, inspired, strategic cohort, working together across the country to set out a common, ambitious vision. We are so thankful that so many speakers, supporters, and participants joined us to launch this network & lay the groundwork for great work yet to come.

Categories
AI + Access to Justice Current Projects

Housing Law experts wanted for AI evaluation research

We are recruiting Housing Law experts to participate in a study of AI answers to landlord-tenant questions. Please sign up here if you are a housing law practitioner interested in this study.

Experts who participate in interviews and AI-ranking sessions will receive Amazon gift cards for their participation.