
The Stanford Legal Design Lab has launched the AI and Access to Justice Initiative to conduct user research, system evaluation, network coordination & innovative R&D in this exciting space.
As more justice professionals and community members become aware of AI, the Legal Design Lab is conducting cutting-edge R&D on how AI systems perform when people try to use them for legal problem-solving, and on how we can build smarter, more responsible AI for access to justice.
If you want to join our AI+A2J Interest List, please sign up here to stay in touch.
Explore the Legal Design Lab’s AI & Access to Justice Initiative
This page is the main hub for the AI-A2J Initiative. Choose from the categories here to find out about our specific initiatives on AI & Access to Justice.
If you are interested in AI & Access to Justice, sign up using this form to stay updated on work, opportunities, and events in this space.

Explore the Legal Design Lab’s current projects on AI & Access to Justice.

Sign up to join our network and see what events are coming up around AI & Access to Justice.

Find the latest academic courses the Lab is offering on AI and Access to Justice, as well as past class reports.

Explore our Lab’s latest research findings, other groups’ publications, and research opportunities.

Projects on AI & Access to Justice
Our team is exploring how AI models and agents might improve the justice system.
This includes doing research, community design, and design/development of new AI that can responsibly help empower people regarding their legal rights — and empower service providers to make the justice system more accessible and equitable.
The Legal Design Lab is working both on the infrastructure of responsible AI (gathering data, creating labeled datasets, establishing taxonomies, creating benchmarks, and finding pilot partners) and on the design & development of new AI tech/service pilots.
JusticeBench
An R&D platform for people working on AI & Access to Justice
The Legal Design Lab has launched the JusticeBench platform to share:
- A common agenda of the tasks that the community needs AI to be able to do, to support legal teams and users with key justice work
- An inventory of existing projects, pilots, and proposals on AI to improve justice outcomes
- Datasets to be used in benchmarking, evaluation, and training better systems
- Resources and guides to navigate this space, in terms of effective change management, professional ethics, and more.
Please explore these resources — whether you are a legal expert looking to understand how to develop AI solutions in coordination with others, or a technical expert interested in understanding the legal services domain.
Also, tell us about AI project ideas that you have — whether it’s new agents, models, benchmarks, policies, training, or more. We’d love to hear what you’re working on.

Justice AI Co-Pilot R&D for Eviction Defense and Reentry Services
The Legal Design Lab is working with legal aid and court groups around the country to scope, prototype, and pilot new AI tools to help legal professionals better serve more of the public.
One focus is on tools for Housing Law — particularly legal groups helping people with eviction defense, living condition problems, and other landlord-tenant issues.
We are also working on Reentry services and Debt Relief, to help providers assist people with legal financial obligations and debts they are struggling with after going through the criminal justice system.
Please be in touch with us if you are interested in these areas or would like to learn more.
Please also write if you are working on AI models, agents, or benchmarks that might be relevant.
Read about our newly launched Justice AI Co-Pilot project, in which we are working with legal aid groups on eviction defense and reentry debt.

Quality Evaluation for AI Legal Q-and-A
What makes an AI answer to a legal question good or bad?
Our Lab (including our amazing students in Winter Quarter's AI for Legal Help class) has been establishing a common set of practical, measurable criteria by which we can evaluate AI Legal Q&A.
We are recruiting Housing Law experts to participate in a study of AI answers to landlord-tenant questions. Please sign up here if you are a housing law practitioner interested in this study!
This work has included:
- Interviewing justice professionals (legal aid lawyers, law help website administrators, live chat operators, hotline staffers, court help center staff, law professors, bar officials, and more) about what criteria are important (or not) to prioritize when evaluating AI’s answers to normal people’s questions about their legal problems.
- Drafting a list of the top criteria by which justice professionals say we should measure AI models' Legal Q-and-A performance
- Building a 'rating game' with Q-and-A pairs of real people's legal questions (taken from our past user interviews) and different AI models' answers, with the expert-approved criteria built into the game. (All of this builds on Learned Hands.)
- Having justice professionals play this rating game and narrate aloud why they rate the Q-and-A pairs as they do. This gives us a labeled dataset of which AI answers are good or bad and how they perform on more specific criteria, and it also helps us refine our criteria and understand the logic behind the professionals' ratings.
- Exploring which of the criteria might be evaluated automatically, and exactly how we could safely automate the evaluation (see the rough sketch below).
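To make that last step concrete, here is a minimal sketch of how expert ratings on a Q-and-A quality rubric might be stored and aggregated, and how one simple criterion could be checked automatically. The criterion names, record fields, and keyword-based check are illustrative assumptions only, not the Lab's finalized criteria or evaluation protocol.

```python
# Minimal sketch: storing and aggregating expert ratings of AI answers.
# All criterion names and fields are hypothetical, for illustration only.
from dataclasses import dataclass, field

CRITERIA = [
    "accurate_statement_of_law",   # invented labels, not the Lab's
    "correct_jurisdiction",        # finalized criteria list
    "actionable_next_steps",
    "referral_to_human_help",
    "appropriate_cautions",
]

@dataclass
class RatedAnswer:
    question: str                  # a real person's legal question
    answer: str                    # one AI model's response
    model: str                     # which model produced the answer
    ratings: dict = field(default_factory=dict)  # criterion -> 1-5 score

def mean_scores(rated: list) -> dict:
    """Average the expert ratings per criterion across all Q-and-A pairs."""
    sums = {c: [] for c in CRITERIA}
    for r in rated:
        for criterion, score in r.ratings.items():
            sums.setdefault(criterion, []).append(score)
    return {c: sum(v) / len(v) for c, v in sums.items() if v}

def mentions_human_help(answer: str) -> bool:
    """Toy automated check for one criterion: does the answer point the
    user toward human legal help? A production evaluator would need to
    be far more robust, and validated against the expert-labeled data."""
    cues = ("legal aid", "lawyer", "attorney", "self-help center", "hotline")
    return any(cue in answer.lower() for cue in cues)
```

One design point worth noting: the labeled dataset that the rating game produces can double as a validation set for any automated evaluator, so automated scores can be checked against what the experts actually said.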
As we finalize this work, we will share the academic publications, white papers, and technical proposals that emerge from it.
See more details at our AI+A2J Metrics page here.
Read Margaret Hagan’s recent article & research summary on Measuring What Matters.


User Research into AI for Legal Help
The Legal Design Lab, including in its Policy Lab/d.school class AI for Legal Help, has been interviewing adults across America about if & how they would use AI for legal problem-solving.
In our online and in-court interviews, we present people with a fictional legal problem and ask whether they would use AI to address it. If they would, we watch over their shoulder as they try to use an AI platform on that problem.
This research aims to identify:
- whether people imagine, in the abstract, that AI will be helpful and trustworthy for legal problem-solving
- whether, once they actually interact with an AI platform, they find it helpful and trustworthy
- what people identify as making an AI's responses helpful or harmful, and how much warning or disclosure they want
- what the ideal AI tool might be, from different people's perspectives.
We will be publishing our research regularly, and continuing to run interviews as the AI platforms change and as people’s perceptions and use of AI change.
Read the Autumn 2023 report on user research around AI for legal help here.
See our interview data dashboard & raw data at our AI+A2J User Research page.
See our publications in the Research & Publications section.
Learned Hands game to label people’s legal issues
Learned Hands is an online game to build labeled datasets, machine learning models, and new applications that can connect people online to high-quality legal help. Our team at Legal Design Lab partnered with the team at Suffolk LIT Lab to build it, with the support of the Pew Charitable Trusts.
Playing the Learned Hands game lets you label people's stories with a standardized list of legal issue codes. It's a mobile-friendly web application that you're welcome to play, and you can earn pro bono credit for your time.
The game produces a labeled dataset of people’s stories, tagged with the legal issues that apply to their situation. This dataset can be used to develop AI tools like classifiers to automatically spot people’s issues.
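As a rough illustration of that last point, here is a minimal sketch, using invented example data, of how a Learned Hands-style labeled dataset could train an issue-spotting classifier. The stories and label names below are made up; real work would use the full game-produced dataset and its standardized issue taxonomy.

```python
# Minimal sketch: train a multi-label issue-spotting classifier on
# Learned Hands-style labeled stories. All data below is invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer

stories = [
    "My landlord is evicting me even though I paid rent.",
    "A debt collector keeps calling me at work about an old bill.",
    "My ex will not follow our custody schedule.",
]
labels = [["housing"], ["debt"], ["family"]]  # crowdsourced issue tags

mlb = MultiLabelBinarizer()
y = mlb.fit_transform(labels)  # a story can carry multiple issue tags

clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    OneVsRestClassifier(LogisticRegression(max_iter=1000)),
)
clf.fit(stories, y)

pred = clf.predict(["The sheriff posted an eviction notice on my door."])
print(mlb.inverse_transform(pred))  # e.g., [('housing',)]
```

The same labeled data can also serve as held-out evaluation data for newer LLM-based issue spotters, not only as training data.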
Read more about the Learned Hands project here.
AI/Legal Help Problem Incident Database
The Legal Design Lab is compiling a database of “AI & Legal Help problem incidents”. Please contribute by entering information on this form, which feeds into the database.
We will make this database available in the near future, as we collect and review more records. For this database, we're looking for specific examples of AI platforms (like ChatGPT, Bard, Bing Chat, etc.) providing problematic responses, such as:
- incorrect information about legal rights, rules, jurisdiction, forms, or organizations;
- hallucinations of cases, statutes, organizations, hotlines, or other important legal information;
- irrelevant, distracting, or off-topic information;
- misrepresentation of the law;
- overly simplified information that loses key nuance or cautions;
- otherwise doing something that might be harmful to a person trying to get legal help.
You can send in any incidents you've experienced using this form.
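For illustration, here is a minimal sketch of what one structured record in such an incident database might look like. Every field and category name below is a hypothetical stand-in, not the actual form's schema.

```python
# Hypothetical sketch of one record in an AI & Legal Help incident
# database. Field and category names are illustrative assumptions.
from dataclasses import dataclass
from enum import Enum

class IncidentType(Enum):
    INCORRECT_INFO = "incorrect info about rights, rules, forms, or orgs"
    HALLUCINATION = "invented cases, statutes, orgs, or hotlines"
    IRRELEVANT = "irrelevant, distracting, or off-topic information"
    MISREPRESENTATION = "misrepresentation of the law"
    OVERSIMPLIFIED = "lost key nuance or cautions"
    OTHER_HARM = "otherwise potentially harmful to a help-seeker"

@dataclass
class Incident:
    platform: str            # e.g., "ChatGPT", "Bard", "Bing Chat"
    prompt: str              # what the person asked
    response_excerpt: str    # the problematic portion of the answer
    incident_type: IncidentType
    jurisdiction: str        # where the asker is located
    date_observed: str       # ISO date string, e.g., "2024-03-15"
    notes: str = ""          # why this response is problematic
```

Structuring incidents this way would let researchers filter by platform, jurisdiction, or failure type as the collection grows.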
Recent Posts on AI & Access to Justice
AI + Legal Help 2026 class

We are happy to announce the launch of our fourth round of the class “AI for Legal Help”. It is cross-listed at Stanford Law School and Design School. Students will be working with real-world, public interest legal groups to develop AI solutions in a responsible, practical way — that can help scale out high-need legal services. Here is the class description: Want to build AI that actually…
The Legal Help Task Taxonomy at Jurix ’25

What Legal Help Actually Requires: Building a Task Taxonomy for AI, Research, and Access to Justice. In December 2025, I presented a new piece of research at the JURIX Conference in Turin, Italy, as part of the workshop on AI, Dispute Resolution, and Access to Justice. The workshop brought together legal scholars, technologists, and practitioners from around the world to examine how artificial intelligence is already shaping…
AI+A2J 2025 Summit Takeaways

Table of Contents: The Arc of the Summit · Our Key AI + A2J Ecosystem Moment · What will 2030 look like for A2J? · What is stopping great innovation impact? · 3 Levels of Strategic Work to Set Us Towards a Good Ecosystem · Strategy Level 1: Internal Org Strategy around AI · Strategy 2: Ecosystem Strategy · Strategy 3: Towards Big Tech & A2J · The Modern A2J Toolbox: A Growing Set of AI-Powered Solutions · Research & Case Management Assistants · AI on Case Management…
Jurix 2025 AIDA2J Workshop

The Stanford Legal Design Lab is so happy to be a sponsoring co-host of the third consecutive AI and Access to Justice workshop at the JURIX conference. This round, the conference takes place in Turin, Italy in December 2025. The theme is AI, Dispute Resolution, and Access to Justice. See the main workshop website here. The workshop will involve the collaboration of the Suffolk LIT Lab, the Stanford Legal Design…
AI+A2J Summit 2025

The Stanford Legal Design Lab hosted the second annual AI and Access to Justice Summit on November 20-21, 2025. Over 150 legal professionals, technologists, regulators, strategists, and funders came together to tackle one big question: how can we build a strong, sustainable national/international AI and Access to Justice Ecosystem? We will be synthesizing all of the presentations, feedback, proposals and discussions into a report that lays out:…
Legal Aid Intake & Screening AI

A Report on an AI-Powered Intake & Screening Workflow for Legal Aid Teams. AI for Legal Help, Legal Design Lab, 2025. This report provides a write-up of the AI for Housing Legal Aid Intake & Screening class project, which was one track of the “AI for Legal Help” Policy Lab during the Autumn 2024 and Winter 2025 quarters. The AI for Legal Help course involved work with legal…
Demand Letter AI

A prototype report on AI-Powered Drafting of Reasonable Accommodation Demand Letters. AI for Legal Help, Legal Design Lab, 2025. This report provides a write-up of the AI for Housing Accommodation Demand Letters class project, which was one track of the “AI for Legal Help” Policy Lab during the Autumn 2024 and Winter 2025 quarters. This class involved work with legal and court groups that provide legal help…
A Call for Statewide Legal Help AI Stewards

Shaping the Future of AI for Access to Justice. By Margaret Hagan, originally published on Legal Design & Innovation. If AI is going to advance access to justice rather than deepen the justice gap, the public-interest legal field needs more than speculation and pilots — we need statewide stewardship. Two missions of an AI steward, for a state's legal help service provider community: we need specific people and…
Human-Centered AI R&D at ICAIL’s Access to Justice Workshop

By Margaret Hagan, Executive Director of the Legal Design Lab At this year’s International Conference on Artificial Intelligence and Law (ICAIL 2025) in Chicago, we co-hosted the AI for Access to Justice (AI4A2J) workshop—a full-day gathering of researchers, technologists, legal practitioners, and policy experts, all working to responsibly harness artificial intelligence to improve public access to justice. The workshop was co-organized by an international team: myself (Margaret…
Can LLMs help streamline legal aid intake?

Insights from Quinten Steenhuis at the AI + Access to Justice Research Seminar Recently, the Stanford Legal Design Lab hosted its latest installment of the AI+Access to Justice Research Seminar, featuring a presentation from Quinten Steenhuis. Quinten is a professor and innovator-in-residence at Suffolk Law School’s LIT Lab. He’s also a former housing attorney in Massachusetts who has made a significant impact with projects like Court Forms…
Justice AI Co-Pilots

The Stanford Legal Design Lab is proud to announce a new initiative funded by the Gates Foundation that aims to bring the power of artificial intelligence (AI) into the hands of legal aid professionals. With this new project, we’re building and testing AI systems—what we’re calling “AI co-pilots”—to support legal aid attorneys and staff in two of the most urgent areas of civil justice: eviction defense and…
ICAIL workshop on AI & Access to Justice

The Legal Design Lab is excited to co-organize a new workshop at the International Conference on Artificial Intelligence and Law (ICAIL 2025): AI for Access to Justice (AI4A2J@ICAIL 2025).
📍 Where? Northwestern University, Chicago, Illinois, USA
🗓 When? June 20, 2025 (hybrid: in-person and virtual participation available)
📄 Submission deadline: May 4, 2025
📬 Acceptance notification: May 18, 2025
Submit a paper here: https://easychair.org/cfp/AI4A2JICAIL25
This workshop brings together researchers, technologists,…
How AI is Augmenting Human-Led Legal Advice at Citizens Advice

Caddy Chatbot to Support Supervision of Legal Advisers and Improve Q&A. The Citizens Advice network in England and Wales is a cornerstone of free legal and social support, with a network of 270 local organizations operating across 2,540 locations. In 2024 alone, it provided advice to 2.8 million people via phone, email, and web chat. However, the rising cost-of-living crisis in the UK has increased the demand…
Measuring What Matters: A Quality Rubric for Legal AI Answers

By Margaret Hagan, Executive Director of the Legal Design Lab. As more people turn to AI for legal advice, a pressing issue emerges: How do we know whether AI-generated legal answers are actually helpful? While legal professionals and regulators may have instincts about good and bad answers, there has been no clear, standardized way to evaluate AI's performance…
AI, Machine Translation, and Access to Justice

Lessons from Cristina Llop's Work on Language Access in the Legal System. Artificial intelligence (AI) and machine translation (MT) are often seen as tools with the potential to expand access to justice, especially for non-English speakers in the U.S. legal system. However, while AI-driven translation tools like Google Translate and AutoML offer impressive accuracy in general contexts, their effectiveness in legal settings remains questionable. At the Stanford…
Jurix ’24 AI + A2J Schedule

On December 11, 2024, in Brno, Czechia & online, we held our second annual AI for Access to Justice Workshop at the JURIX Conference. The academic workshop is organized by Quinten Steenhuis, Suffolk University Law School/LIT Lab, Margaret Hagan, Stanford Law School/ Legal Design Lab, and Hannes Westermann, Maastricht University Faculty of Law. In Autumn 2024, there was a very competitive application process, and 22 papers and…
Class Presentations for AI for Legal Help

Last week, the 5 student teams in Autumn Quarter's AI for Legal Help made their final presentations about whether and how generative AI could assist legal aid, court, and bar associations in providing legal help to the public. The teams have been working over the 9-week quarter with partners including the American Bar Association, Legal Aid Society of San Bernardino, Neighborhood Legal Services of…
AI + Access to Justice Summit 2024

On October 17 and 18, 2024, Stanford Legal Design Lab hosted the first-ever AI and Access to Justice Summit. The Summit's primary goal was to build strong relationships and a national, coordinated roadmap for how AI can responsibly be deployed and held accountable to close the justice gap. Who was at the Summit? Two law firm sponsors, K&L Gates…
Roadmap for AI and Access to Justice

Our Lab is continuing to host meetings & participate in others to scope out what kinds of work need to happen to make AI work for access to justice. We will be making a comprehensive roadmap of tasks and goals. Here is our initial draft, which divides the roadmap between Cross-Issue Tasks (that apply across specific legal problem/policy areas) and Issue-Specific Tasks (where we are still…
Share Your AI + Justice Idea

Our team at Legal Design Lab is building a national network of people working on AI projects to close the justice gap, through better legal services & information. We’re looking to find more people working on innovative new ideas & pilots. Please share with us below using the form. The idea could be for: A new AI tool or agent, to help you do a specific legal…
Summit schedule for AI + Access to Justice

This October, Stanford Legal Design Lab hosted the first AI + Access to Justice Summit. This invite-only event focused on building a national ecosystem of innovators, regulators, and supporters to guide AI innovation toward closing the justice gap, while also protecting the public. The Summit’s flow aimed to teach frontline providers, regulators, and philanthropists about current projects, tools, and protocols to develop impactful justice AI. We did…
Housing Law experts wanted for AI evaluation research

We are recruiting Housing Law experts to participate in a study of AI answers to landlord-tenant questions. Please sign up here if you are a housing law practitioner interested in this study. Experts who participate in interviews and AI-ranking sessions will receive Amazon gift cards for their participation.
Design Workbook for Legal Help AI Pilots

For our upcoming AI+Access to Justice Summit and our AI for Legal Help class, our team has made a new design workbook to guide people through scoping a new AI pilot. We encourage others to use and explore this AI Design Workbook to help think through: use cases and workflows; specific legal tasks that AI could do (or should not do); user personas, and how they might…
Jurix ’24 AI for Access to Justice Workshop

Building on last year’s very successful academic workshop on AI & Access to Justice at Jurix ’23 in the Netherlands, this year we are pleased to announce a new workshop at Jurix ’24 in Czechia. Margaret Hagan of the Stanford Legal Design Lab is co-leading an academic workshop at the legal technology conference Jurix, on AI for Access to Justice. Quinten Steenhuis from Suffolk LIT Lab and…

Courses on AI & Access to Justice
Our Lab team is teaching interdisciplinary courses at Stanford Law School and the d.school on how AI can be responsibly built to increase access to justice, and what limits might be put on it to protect people.
Please write to us if you are interested in taking a course, or being a partner on one.

Autumn-Winter 24-25 AI for Legal Help 809E
In Autumn-Winter quarters 2024-25, the Legal Design Lab team offered a new version of its “AI for Legal Help” course, focused on hands-on R&D to advance how legal aid and pro bono counsel can serve more people at scale.
It was a 3-credit course, with course code LAW 809E.
Policy Challenge: Can AI increase access to justice by helping people resolve their legal problems in more accessible, equitable, and effective ways? What risks does AI pose for people seeking legal guidance, and what technical and policy guardrails should mitigate them?
Student Work: In this course, students worked in teams, partnered with frontline legal aid and court groups interested in using AI, to co-design new tech pilots that help people dealing with evictions, criminal justice problems, debt collection, and other legal problems.
Using human-centered design, students helped their partners scope out exactly where AI and other interventions might serve both the providers and the clients, and what quality benchmarks should guide any new intervention. They then worked on demo projects, using AI tools and service design, to pilot and evaluate.
Along with their AI pilot, teams established important guidelines to ensure that new AI projects are centered on the needs of people, and developed with a careful eye towards ethical and legal principles.
Read some of the students’ AI proposals and scoped designs:
- AI for Demand Letters, a project for housing legal teams, in partnership with Legal Aid Society of San Bernardino
- Housing Intake and Screening AI, a redesigned workflow using AI to improve the throughput and triage of a legal aid team like LASSB

Autumn-Winter 23-24 AI for Legal Help 809E
In Autumn-Winter quarters 2023-24, the Legal Design Lab team offered the policy lab class “AI for Legal Help”.
It was a 3-credit course, with course code LAW 809E. We worked with community groups & justice institutions to interview members of the public about if & how they would use AI platforms (like ChatGPT) to deal with legal problems like evictions, debt collection, or domestic violence.
Our class client was the Legal Services Corporation's TIG (Technology Initiative Grant) team.
The goal of the class was to develop a community-centered agenda for how to make these AI platforms more effective at helping people with these problems, while also identifying the key risks they pose to people & the technical/policy strategies to mitigate those risks.
The class was taught with user interviews, testing sessions, and multi-stakeholder workshops at the core, with students synthesizing diverse points of view into an agenda that can make AI tools more equitable, accessible, and responsible in the legal domain.

Class Report for AI for Legal Help, Autumn 2023
Read the Autumn Quarter class’ final report on their interview findings.
The students interviewed adults in the US about their possible use of AI to address legal problems. Their analysis of the interview results covers the findings and themes that emerged, distinct types of users, needs that future solutions and policies should address, and possible directions for improving AI platforms for people's legal problem-solving needs.
After holding an interactive workshop with legal and policy stakeholders in December 2023, the class used these experts’ responses to improve their analysis and strengthen their conclusions.
Read the full report from the students here.

Network Events on AI & Access to Justice
Sign up for our AI+A2J Interest List
Are you a justice professional, academic, funder, technologist, or policymaker interested in the future of AI & the justice system? Please fill in the form embedded below (or at this link) to stay in touch. We’ll notify you about events, publications, opportunities, and more.
Research x Practice seminars
Join us for our online, public seminars every first Friday of the month, where researchers & practitioners can present their work in building and evaluating new AI efforts for access to justice.
AI+A2J Events
The Stanford Legal Design Lab convenes summits, seminars, presentations, and workshops among key stakeholders who can design and develop new AI efforts to help people with legal problems: legal aid lawyers, court staff, judges, computer science researchers, tech developers, and community members.
Our team also presents at leading legal and technology conferences to share out our work and findings with a diverse audience.

AI + Access to Justice Summit
The Legal Design Lab hosted the first national convening of stakeholders working on building, piloting, evaluating, and supporting new AI initiatives to advance access to justice.
The AI+A2J Summit, held October 17-18, 2024 at Stanford Law School, was the first of an annual series of events bringing frontline legal help providers, technologists, strategists, philanthropists, pro bono leaders, and regulators together to build a more coordinated, lively, responsible network of people working on AI for justice.

Jurix 2024: AI for Access to Justice Workshop
Margaret Hagan of the Stanford Legal Design Lab co-led an academic workshop on AI for Access to Justice at the legal technology conference Jurix, together with Quinten Steenhuis from Suffolk LIT Lab and Hannes Westermann of Maastricht University Faculty of Law.
The full-day, hybrid workshop, held in Brno, Czechia on December 11th, gathered legal technologists, researchers, and practitioners around innovations in AI for helping close the access to justice gap: the majority of legal problems around the world that go unsolved because potential litigants lack the time, money, or ability to participate in court processes.
See our workshop homepage here for more details on participation.

Generative AI & Justice workshop at LSC-ITC
Our Lab team collaborated with other AI-justice researchers to run a large workshop on how to use generative AI to increase access to justice at the Legal Services Corporation’s Innovations in Tech Conference.

JURIX ’23 AI & Access to Justice academic workshop
In December 2023, our Lab team co-hosted an academic workshop on AI & Access to Justice at the JURIX Conference on Legal Knowledge and Information Systems.
There was an open call for submissions to the workshop, with papers due by November 12, 2023. We encouraged academics, practitioners, and others interested in the field to submit a paper or attend.

AI + A2J User Research Workshop
In Autumn 2023, our AI for Legal Help class hosted an interactive workshop with representatives from technology companies, bar associations, universities, courts, legal aid groups, and tenants unions. The students presented their preliminary findings of their user research, received feedback from these various stakeholders, and then brainstormed in breakout groups about how to move forward with the user research findings, to have a better future for AI & Access to Justice.

AI & Legal Help Crossover Workshop
In Summer 2023, an interdisciplinary group of researchers at Stanford hosted the “AI and Legal Help Crossover” event, for stakeholders from the civil justice system and computer science to meet, talk, and identify promising next steps to advance the responsible development of AI for improving the justice system.
Stanford-SRLN Spring 2023 brainstorm session
In Spring 2023, the Stanford Legal Design Lab collaborated with the Self Represented Litigation Network to organize a stakeholder session on artificial intelligence (AI) and legal help within the justice system. We conducted a one-hour online session with justice system professionals from various backgrounds, including court staff, legal aid lawyers, civic technologists, government employees, and academics. The purpose of the session was to gather insights into how AI is already being used in the civil justice system, identify opportunities for improvement, and highlight potential risks and harms that need to be addressed. We documented the discussion with a digital whiteboard.
Read more about the session & the brainstorm of opportunities and risks.

Research on AI & Access to Justice
The Stanford Legal Design Lab has been researching what community members want from AI for justice problems, how AI systems perform on justice-related queries, and what opportunities there are to increase the quality of AI in helping people with their justice problems.
Towards Human-Centered Standards for Legal Help AI
Margaret D. Hagan, “Towards Human-Centered Standards for Legal Help AI.” Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences. February 2024. Theme issue: ‘A complexity science approach to law and governance’. Ed. Daniel M. Katz, J. B. Ruhl and Pierpaolo Vivo. Available at https://royalsocietypublishing.org/doi/10.1098/rsta.2023.0157
As more groups consider how AI may be used in the legal sector, this paper envisions how companies and policymakers can prioritize the perspective of community members as they design AI and policies around it. It presents findings of structured interviews and design sessions with community members, in which they were asked about whether, how, and why they would use AI tools powered by large language models to respond to legal problems like receiving an eviction notice. The respondents reviewed options for simple versus complex interfaces for AI tools, and expressed how they would want to engage with an AI tool to resolve a legal problem. These empirical findings provide directions that can counterbalance legal domain experts’ proposals about the public interest around AI, as expressed by attorneys, court officials, advocates, and regulators. By hearing directly from community members about how they want to use AI for civil justice tasks, what risks concern them, and the value they would find in different kinds of AI tools, this research can ensure that people’s points of view are understood and prioritized, rather than only domain experts’ assertions about people’s needs and preferences around legal help AI.

Measuring What Matters
Hagan, Margaret, Measuring What Matters: Developing Human-Centered Legal Q-and-A Quality Standards through Multi-Stakeholder Research (December 01, 2024). Available at SSRN: https://ssrn.com/abstract=5146722
Abstract:
As more people use AI for high-stakes legal tasks like landlord-tenant problems, a large challenge looms: there are no established, clear quality standards for answers to a legal question. How do technologists, regulators, providers, or users know if AI is performing well at this increasingly common, high-stakes task of answering a person’s legal question? This paper offers a multi-stakeholder, mixed-methods approach to involve both users and experts in developing concrete, real-world quality standards for evaluating AI’s performance at the Legal Q-and-A task. It presents the results of a 3-study sequence to identify which specific quality criteria matter most for helping a user seeking legal help, and concludes with a draft quality evaluation protocol. These results can help those working on AI used for consumers’ legal Q-and-A to better improve their systems, and it can assist regulators and researchers interested in evaluating AI’s possible consumer harms.
Good AI Legal Help, Bad AI Legal Help
Margaret D. Hagan. (2023). Good AI Legal Help, Bad AI Legal Help: Establishing quality standards for responses to people’s legal problem stories. In JURIX AI and Access to Justice Workshop. Retrieved from https://drive.google.com/file/d/14CitzBksHiu_2x8W2eT-vBe_JNYOcOhM/view?usp=drive_link
Abstract:
Much has been made of generative AI models’ ability to perform legal tasks or pass legal exams, but a more important question for public policy is whether AI platforms can help the millions of people who are in need of legal help around their housing, family, domestic violence, debt, criminal records, and other important problems. When a person comes to a well-known, general generative AI platform to ask about their legal problem, what is the quality of the platform’s response? Measuring quality is difficult in the legal domain, because there are few standardized sets of rubrics to judge things like the quality of a professional’s response to a person’s request for advice. This study presents a proposed set of 22 specific criteria to evaluate the quality of a system’s answers to a person’s request for legal help for a civil justice problem. It also presents the review of these evaluation criteria by legal domain experts like legal aid lawyers, courthouse self help center staff, and legal help website administrators. The result is a set of standards, context, and proposals that technologists and policymakers can use to evaluate quality of this specific legal help task in future benchmark efforts.
LegalBench: A collaboratively built benchmark
Guha, Neel, Julian Nyarko, Daniel Ho, Christopher Ré, Adam Chilton, Alex Chohlas-Wood, Austin Peters, Margaret Hagan, et al. “Legalbench: A collaboratively built benchmark for measuring legal reasoning in large language models.” Advances in Neural Information Processing Systems 36 (2024). Available at https://arxiv.org/pdf/2308.11462.pdf
The advent of large language models (LLMs) and their adoption by the legal community has given rise to the question: what types of legal reasoning can LLMs perform? To enable greater study of this question, we present LegalBench: a collaboratively constructed legal reasoning benchmark consisting of 162 tasks covering six different types of legal reasoning. LegalBench was built through an interdisciplinary process, in which we collected tasks designed and hand-crafted by legal professionals. Because these subject matter experts took a leading role in construction, tasks either measure legal reasoning capabilities that are practically useful, or measure reasoning skills that lawyers find interesting. To enable cross-disciplinary conversations about LLMs in the law, we additionally show how popular legal frameworks for describing legal reasoning — which distinguish between its many forms — correspond to LegalBench tasks, thus giving lawyers and LLM developers a common vocabulary. This paper describes LegalBench, presents an empirical evaluation of 20 open-source and commercial LLMs, and illustrates the types of research explorations LegalBench enables.
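To illustrate the kind of evaluation a benchmark like this enables, here is a minimal sketch of an exact-match scoring loop over one task's examples, the scoring style that suits classification-type legal reasoning tasks. The `ask_model` placeholder and the example format are assumptions for illustration; see the LegalBench paper and repository for the actual task formats and evaluation code.

```python
# Minimal sketch: score an LLM on one benchmark task by exact match.
# `ask_model` is a placeholder for whatever model is being tested.
def ask_model(prompt: str) -> str:
    raise NotImplementedError("call your LLM of choice here")

def exact_match_accuracy(examples: list) -> float:
    """examples: [{'prompt': ..., 'answer': ...}, ...] for a single task."""
    correct = 0
    for ex in examples:
        prediction = ask_model(ex["prompt"]).strip().lower()
        correct += int(prediction == ex["answer"].strip().lower())
    return correct / len(examples)
```

Running a loop like this per task, across all tasks and models, yields the kind of cross-model comparison the paper reports for 20 open-source and commercial LLMs.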
Opportunities & Risks for AI, Legal Help, and Access to Justice
Margaret D. Hagan (2023, June). Opportunities & Risks for AI, Legal Help, and Access to Justice. Legal Design and Innovation. Retrieved from https://medium.com/legal-design-and-innovation/opportunities-risks-for-ai-legal-help-and-access-to-justice-9c2faf8be393
Intro:
As more lawyers, court staff, and justice system professionals learn about the new wave of generative AI, there’s increasing discussion about how AI models & applications might help close the justice gap for people struggling with legal problems.
Could AI tools like ChatGPT, Bing Chat, and Google Bard help get more people crucial information about their rights & the law?
Could AI tools help people efficiently and affordably defend themselves against eviction or debt collection lawsuits? Could it help them fill in paperwork, create strong pleadings, prepare for court hearings, or negotiate good resolutions?
This report presents the initial proposals of the tasks, scenarios & use cases where AI could be helpful.
It also covers risks, harms, and worries brought up.
Finally, it lays out some key infrastructure proposals.
Evaluating the Quality of AI in the Legal Domain
Bommarito, M. J., & Katz, D. M. (2023). GPT Takes the Bar Exam. SSRN Electronic Journal, 1–7. https://doi.org/10.2139/ssrn.4314839
Choi, J. H., Hickman, K. E., Monahan, A., & Schwarcz, D. B. (2023). ChatGPT Goes to Law School. SSRN Electronic Journal, 1–16. https://doi.org/10.2139/ssrn.4335905
Dahl, M., Magesh, V., Suzgun, M., & Ho, D. E. (2024). Large Legal Fictions: Profiling Legal Hallucinations in Large Language Models. Retrieved from http://arxiv.org/abs/2401.01301
Deroy, A., Ghosh, K., & Ghosh, S. (2023). How Ready are Pre-trained Abstractive Models and LLMs for Legal Case Judgement Summarization? CEUR Workshop Proceedings, 3423, 8–19. Retrieved from https://arxiv.org/abs/2306.01248
Fei, Z., Shen, X., Zhu, D., Zhou, F., Han, Z., Zhang, S., … Ge, J. (2023). LawBench: Benchmarking Legal Knowledge of Large Language Models, 1–38. Retrieved from http://arxiv.org/abs/2309.16289
Guha, N., Ho, D. E., Nyarko, J., & Re, C. (2022). LegalBench: Prototyping a Collaborative Benchmark for Legal Reasoning. Stanford, CA. Retrieved from https://arxiv.org/abs/2209.06120
Guha, N., Nyarko, J., Ho, D. E., Ré, C., Chilton, A., Narayana, A., … Li, Z. (2023). LegalBench: A Collaboratively Built Benchmark for Measuring Legal Reasoning in Large Language Models, 1–143. https://doi.org/10.2139/ssrn.4583531
Harden, S. (2023). The Results: Rating Generative AI Responses to Legal Questions. Retrieved November 6, 2023, from https://samharden.substack.com/p/the-results-rating-generative-ai?r=3c0pj&utm_campaign=post&utm_medium=web
Henderson, P., Krass, M. S., Zheng, L., Manning, N. G. C. D., Jurafsky, D., & Ho, D. E. (2022). Pile of Law: Learning Responsible Data Filtering from the Law and a 256GB Open-Source Legal Dataset. Advances in Neural Information Processing Systems, 35(NeurIPS). https://arxiv.org/abs/2207.00220
Katz, D. M., Bommarito, M. J., Gao, S., & Arredondo, P. (2023). GPT-4 Passes the Bar Exam. SSRN Electronic Journal, 1–35. https://doi.org/10.2139/ssrn.4389233
Savelka, J., Ashley, K. D., Gray, M. A., Westermann, H., & Xu, H. (2023). Explaining Legal Concepts with Augmented Large Language Models (GPT-4). Retrieved from http://arxiv.org/abs/2306.09525
Tan, J., Westermann, H., & Benyekhlef, K. (2023). ChatGPT as an Artificial Lawyer? In CEUR Workshop Proceedings (Vol. 3435). Retrieved from https://ceur-ws.org/Vol-3435/short2.pdf
AI-A2J Opportunities, Concerns, and Behavior
Stanford Policy Lab 809E Autumn Quarter. (2023). The Use and Application of Generative AI for Legal Assistance. Retrieved from https://docs.google.com/document/d/1bx_HXOMrgPjGjVDR21Vfekuk8_4ny0878430JKZ9-Sk/edit?usp=sharing
Hagan, M. D. (2023, June). Opportunities & Risks for AI, Legal Help, and Access to Justice. Legal Design and Innovation. Retrieved from https://medium.com/legal-design-and-innovation/opportunities-risks-for-ai-legal-help-and-access-to-justice-9c2faf8be393
Hagan, M. D. Towards Human-Centered Standards for Legal Help AI. Philosophical Transactions of the Royal Society A. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4582745
American Bar Association Standing Committee on Ethics and Professional Responsibility. (2024). Formal Opinion 512: Generative Artificial Intelligence Tools. Chicago, Illinois. https://www.americanbar.org/content/dam/aba/administrative/professional_responsibility/ethics-opinions/aba-formal-opinion-512.pdf
Rohan Bhambhoria et al., Evaluating AI for Law: Bridging the Gap with Open-Source Solutions (2024), http://arxiv.org/abs/2404.12349
Inyoung Cheong et al., (A)I Am Not a Lawyer, But…: Engaging Legal Experts towards Responsible LLM Policies for Legal Advice (2024), https://arxiv.org/abs/2402.01864v1
Chien, C., Kim, M., Raj, A., & Rathish, R. (2024). How LLMs Can Help Address the Access to Justice Gap through the Courts. Berkeley, CA. Retrieved from https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4683309
Colleen Chien & Miriam Kim, Generative AI and Legal Aid: Results from a Field Study and 100 Use Cases to Bridge the Access to Justice Gap, UC Berkeley Public Law Leg. Theory Ser. 1 (2024), https://doi.org/10.1787/c2c1d276-en
Matthew Dahl et al., Large Legal Fictions: Profiling Legal Hallucinations in Large Language Models (2024), http://arxiv.org/abs/2401.01301
Granat, R. (2023). ChatGPT, Access to Justice, and UPL. Retrieved June 19, 2023, from https://www.lawproductmakers.com/2023/03/chatgtp-access-to-justice-and-upl/
Guzman, H. (2023). AI’s “Hallucinations” Add to Risks of Widespread Adoption. Retrieved June 19, 2023, from https://www.law.com/corpcounsel/2023/03/23/ais-hallucinations-add-to-risks-of-widespread-adoption/?slreturn=20230519164801
Holt, A. T. (2023). Legal AI-d to Your Service: Making Access to Justice a Reality. Vanderbilt Journal of Entertainment and Technology Law. Retrieved from https://www.vanderbilt.edu/jetlaw/2023/02/04/legal-ai-d-to-your-service-making-access-to-justice-a-reality/
Sayash Kapoor, Peter Henderson & Arvind Narayanan, Promises and Pitfalls of Artificial Intelligence for Legal Applications, SSRN Electron. J. 1 (2024), https://www.semanticscholar.org/reader/c9628559d2a7fff72fd1f34b925d7a5864d92aea
Kanu, H. (2023, April). Artificial intelligence poised to hinder, not help, access to justice. Reuters. Retrieved from https://www.reuters.com/legal/transactional/artificial-intelligence-poised-hinder-not-help-access-justice-2023-04-25/
Daniel W. Linna & Wendy Muchman, Ethical Obligations to Protect Client Data when Building Artificial Intelligence Tools: Wigmore Meets AI, The Professional Lawyer, 2020, https://www.americanbar.org/groups/professional_responsibility/publications/professional_lawyer/27/1/ethical-obligations-protect-client-data-when-building-artificial-intelligence-tools-wigmore-meets-ai/
Varun Magesh et al., Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools, 1 (2024), http://arxiv.org/abs/2405.20362
Pacheco, S. (2023, March). DoNotPay Lawsuits: A Setback for Justice Initiatives? Bloomberg Law. Retrieved from https://news.bloomberglaw.com/bloomberg-law-analysis/analysis-donotpay-lawsuits-a-setback-for-justice-initiatives
Perlman, A. (2023). The Implications of ChatGPT for Legal Services and Society. The Practice. Cambridge, MA. Retrieved from https://clp.law.harvard.edu/knowledge-hub/magazine/issues/generative-ai-in-the-legal-profession/the-implications-of-chatgpt-for-legal-services-and-society/
Francine Ryan & Liz Hardie, ChatGPT, I have a Legal Question? The Impact of Generative AI Tools on Law Clinics and Access to Justice, 31 Int. J. Clin. Leg. Educ. 166 (2024), https://www.northumbriajournals.co.uk/index.php/ijcle/article/view/1401/1789
Poppe, E. T. (2019). The Future Is ̶B̶r̶i̶g̶h̶t̶ Complicated: AI, Apps & Access to Justice. Oklahoma Law Review, 72(1). Retrieved from https://digitalcommons.law.ou.edu/olr/vol72/iss1/8
Jaromir Savelka et al., Explaining Legal Concepts with Augmented Large Language Models (GPT-4) (2023), http://arxiv.org/abs/2306.09525
Simshaw, D. (2022). Access to A.I. Justice: Avoiding an Inequitable Two-Tiered System of Legal Services. Yale Journal of Law & Technology, 24, 150–226. https://yjolt.org/access-ai-justice-avoiding-inequitable-two-tiered-system-legal-services
Stepka, M. (2022, February). Law Bots: How AI Is Reshaping the Legal Profession. ABA Business Law Today. Retrieved from https://businesslawtoday.org/2022/02/how-ai-is-reshaping-legal-profession/
Telang, A. (2023). The Promise and Peril of AI Legal Services to Equalize Justice. Harvard Journal of Law & Technology. Retrieved from https://jolt.law.harvard.edu/digest/the-promise-and-peril-of-ai-legal-services-to-equalize-justice
Tripp, A., Chavan, A., & Pyle, J. (2018). Case Studies for Legal Services Community Principles and Guidelines for Due Process and Ethics in the Age of AI. Retrieved from https://docs.google.com/document/d/1rEvg5xuOs_o1njPHHpF9jtuaGi0ren6DYUElBu0Fkfk/edit
Verma, P., & Oremus, W. (2023, November). These lawyers used ChatGPT to save time. They got fired and fined. The Washington Post. Retrieved from https://www.washingtonpost.com/technology/2023/11/16/chatgpt-lawyer-fired-ai/
Westermann, Hannes and Karim Benyekhlef. JusticeBot: A Methodology for Building Augmented Intelligence Tools for Laypeople to Increase Access to Justice. ICAIL 2023. https://arxiv.org/pdf/2308.02032.pdf , https://www.cyberjustice.ca/en/logiciels-cyberjustice/nos-solutions-logicielles/justicebot/
Wilkins, S. (2023, February). DoNotPay’s Downfall Put a Harsh Spotlight on AI and Justice Tech. Now What? Legaltech News. Retrieved from https://www.law.com/legaltechnews/2023/02/10/donotpays-downfall-put-a-harsh-spotlight-on-ai-and-justice-tech-now-what/
Evaluating AI’s Performance & Harms, beyond law
Agrawal, A., Suzgun, M., Mackey, L., & Kalai, A. T. (2023). Do Language Models Know When They’re Hallucinating References?, 1–18. Retrieved from http://arxiv.org/abs/2305.18248
Bender, E. M., Gebru, T., McMillan-Major, A., & Shmitchell, S. (2021). On the dangers of stochastic parrots: Can language models be too big? FAccT 2021 – Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 610–623. https://doi.org/10.1145/3442188.3445922
Bickmore, T. W., Trinh, H., Olafsson, S., O’Leary, T. K., Asadi, R., Rickles, N. M., & Cruz, R. (2018). Patient and consumer safety risks when using conversational assistants for medical information: An observational study of siri, alexa, and google assistant. Journal of Medical Internet Research, 20(9). https://doi.org/10.2196/11510
Bommasani, R., Liang, P., & Lee, T. (2023). Holistic Evaluation of Language Models. Stanford, CA. https://doi.org/10.1111/nyas.15007
Bommasani, R., Hudson, D. A., Adeli, E., Altman, R., Arora, S., von Arx, S., … Liang, P. (2021). On the Opportunities and Risks of Foundation Models, 1–214. Retrieved from http://arxiv.org/abs/2108.07258
Jones, E., & Steinhardt, J. (2022). Capturing Failures of Large Language Models via Human Cognitive Biases. Advances in Neural Information Processing Systems, 35(NeurIPS), 1–22.
Shuster, K., Poff, S., Chen, M., Kiela, D., & Weston, J. (2021). Retrieval Augmentation Reduces Hallucination in Conversation. Findings of the Association for Computational Linguistics: EMNLP 2021, 3784–3803. https://doi.org/10.18653/v1/2021.findings-emnlp.320
Kadavath, S., Conerly, T., Askell, A., Henighan, T., Drain, D., Perez, E., … Kaplan, J. (2022). Language Models (Mostly) Know What They Know. Retrieved from http://arxiv.org/abs/2207.05221
Mündler, N., He, J., Jenko, S., & Vechev, M. (2023). Self-contradictory Hallucinations of Large Language Models: Evaluation, Detection and Mitigation, 1–26. Retrieved from http://arxiv.org/abs/2305.15852
Nakao, Y., Strappelli, L., Stumpf, S., Naseer, A., Regoli, D., & Gamba, G. Del. (2023). Towards Responsible AI: A Design Space Exploration of Human-Centered Artificial Intelligence User Interfaces to Investigate Fairness. International Journal of Human-Computer Interaction, 39(9), 1762–1788. https://doi.org/10.1080/10447318.2022.2067936
Peng, B., Galley, M., He, P., Cheng, H., Xie, Y., Hu, Y., … Gao, J. (2023). Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback. Retrieved from http://arxiv.org/abs/2302.12813
Tian, K., Mitchell, E., Yao, H., Manning, C. D., & Finn, C. (2023). Fine-tuning Language Models for Factuality, 1–16. Retrieved from http://arxiv.org/abs/2311.08401
Weidinger, L., Uesato, J., Rauh, M., Griffin, C., Huang, P. Sen, Mellor, J., … Gabriel, I. (2022). Taxonomy of Risks posed by Language Models. In ACM International Conference Proceeding Series (Vol. 22, pp. 214–229). ACM. https://doi.org/10.1145/3531146.3533088
Strategies to Improve Safe Use of AI
Argo, J. J., & Main, K. J. (2004). Meta-analyses of the effectiveness of warning labels. Journal of Public Policy and Marketing, 23(2), 193–208. https://doi.org/10.1509/jppm.23.2.193.51400
Ayres, I., & Schwartz, A. (2014). The no-reading problem in consumer contract law. Stanford Law Review, 66(3), 545–610. https://www.stanfordlawreview.org/wp-content/uploads/sites/3/2014/03/66_Stan_L_Rev_545_AyresSchwartz.pdf
Ben-Shahar, O., & Chilton, A. (2016). Simplification of privacy disclosures: An experimental test. Journal of Legal Studies, 45(S2), S41–S67. https://doi.org/10.1086/688405
Calo, M. R. (2013). Against Notice Skepticism in Privacy (and Elsewhere). Notre Dame L. Rev, 87(1027). Retrieved from http://scholarship.law.nd.edu/ndlr%5Cnhttp://scholarship.law.nd.edu/ndlr/vol87/iss3/3
Kelley, P. G., Bresee, J., Cranor, L. F., & Reeder, R. W. (2009). A “nutrition label” for privacy. In Proceedings of the 5th Symposium on Usable Privacy and Security – SOUPS ’09 (p. 1). https://doi.org/10.1145/1572532.1572538
Hagan, M. (2016). Designing 21st-Century Disclosures for Financial Decision Making. Stanford, CA. Retrieved from https://law.stanford.edu/publications/designing-21st-century-disclosures-for-financial-decision-making/
Martel, C., & Rand, D. G. (2023). Misinformation warning labels are widely effective: A review of warning effects and their moderating features. Current Opinion in Psychology, 54, 101710. https://doi.org/10.1016/j.copsyc.2023.101710
Robinson, L. A., Viscusi, W. K., & Zeckhauser, R. (2016, November). Consumer Warning Labels Aren’t Working. Harvard Business Review. Retrieved from https://hbr.org/2016/11/consumer-warning-labels-arent-working
Schaub, F., Balebako, R., Durity, A. L., & Cranor, L. F. (2018). A Design Space for Effective Privacy Notices. In The Cambridge Handbook of Consumer Privacy (pp. 365–393). https://doi.org/10.1017/9781316831960.021









