We are happy to announce the launch of our fourth round of the class “AI for Legal Help”. It is cross-listed at Stanford Law School and Design School.
Students will be working with real-world, public interest legal groups to develop AI solutions in a responsible, practical way — that can help scale out high-need legal services.
Here is the class description:
Want to build AI that actually matters? AI for Legal Help is a two-quarter, hands-on course where law, design, computer science, and policy students team up with legal aid organizations and court self-help centers to take on one of the biggest challenges in tech today: using AI to expand access to justice.
You’ll work directly with real-world partners to uncover where AI could make legal services faster, more scalable, and more effective—while ensuring it’s safe, ethical, and grounded in the realities of public service. From mapping workflows to spotting opportunities, from creating benchmarks and datasets to designing AI “co-pilots” or system proposals, you’ll help shape the future of AI in the justice system.
Along the way, you’ll learn how to evaluate whether AI is the right fit for a task, design human–AI teams that work, build privacy-forward and trustworthy systems, and navigate the policy and change-management challenges of introducing AI into high-stakes environments.
By the end, your team will have produced a substantial, real-world deliverable—such as a UX research report, benchmark dataset, evaluation rubric, system design proposal, or prototype concept—giving you practical experience in public interest technology, AI system design, and leadership engagement. This is your chance to create AI that works for people, in practice, where it’s needed most.
What Legal Help Actually Requires: Building a Task Taxonomy for AI, Research, and Access to Justice
In December 2025, I presented a new piece of research at the JURIX Conference in Turin, Italy, as part of the workshop on AI, Dispute Resolution, and Access to Justice. The workshop brought together legal scholars, technologists, and practitioners from around the world to examine how artificial intelligence is already shaping legal systems—and how it should shape them in the future.
My paper focuses on a deceptively simple question: What do legal help teams and consumers actually do when trying to resolve legal problems?
This question sits at the heart of access to justice. Around the world, billions of people face legal problems without sufficient help. Courts, legal aid organizations, and community groups work tirelessly to close this gap—but the work itself is often invisible, fragmented, and poorly documented. At the same time, AI tools are rapidly being developed for legal use, often without a clear understanding of the real tasks they are meant to support.
The work I presented in Turin proposes a way forward: a Legal Help Task Taxonomy—a structured, shared framework that defines the core tasks involved in legal help delivery, across jurisdictions, problem types, and service models. (See a first version here at the JusticeBench site or at our Airtable version.)
This blog post explains why that taxonomy matters, how it was developed, and what we discussed at JURIX about making it usable and impactful—not just theoretically elegant.
Why a Task Taxonomy for Legal Help?
Legal help work is often described in broad strokes: “legal advice,” “representation,” “self-help,” or “court assistance.” But these labels obscure what actually happens on the ground.
In reality, legal help consists of dozens of discrete tasks:
identifying what legal issue is present in a messy life situation,
explaining a confusing notice or summons,
calculating deadlines,
selecting the correct form,
helping someone tell their story clearly,
preparing evidence,
filing documents,
following up to ensure nothing is missed.
Some of these tasks are done by lawyers, others by navigators, librarians, court staff, or volunteers. Many are done partly by consumers themselves. Some are repetitive and high-volume; others are complex and high-risk.
Despite this, there has never been a shared, cross-jurisdictional vocabulary for describing these tasks. This absence makes it harder to:
study what legal help systems actually do,
design technology that fits real workflows,
evaluate AI tools responsibly,
or collaborate across organizations and states.
Without task-level clarity, we end up talking past each other—using the same words to mean very different things.
How the Task Taxonomy Emerged
The Legal Help Task Taxonomy did not start as a top-down academic exercise. It emerged organically over several years of applied work with:
legal aid organizations,
court self-help centers,
statewide legal help websites,
pro bono clinics,
and national access-to-justice networks.
As teams tried to build AI tools, improve workflows, and evaluate outcomes, the same problem kept arising: we couldn’t clearly articulate what task a tool was actually performing.
Was a chatbot answering questions—or triaging users? Was a form tool drafting documents—or just collecting data? Was an AI system explaining a notice—or giving legal advice?
To address this, we began mapping tasks explicitly, using practitioner workshops, brainstorming sessions, and analysis of real workflows. Over time, patterns emerged across jurisdictions and issue areas.
The result is a taxonomy organized into seven categories of tasks, spanning the full justice journey:
Getting Brief Help (e.g., legal Q&A, document explanation, issue-spotting)
Providing Brief Help (e.g., guide writing, content review, translation)
Service Onboarding (e.g., intake, eligibility verification, conflicts checks)
Work Product (e.g., form filling, narrative drafting, evidence preparation)
Case Management (e.g., scheduling, reminders, filing screening)
Administration & Strategy (e.g., data extraction, grant reporting)
Tech Tooling (e.g., form creation, interview design, user testing)
Each task is defined in plain language, with clear boundaries. The taxonomy is intentionally general—not tied to one legal issue or country—so that teams can collaborate on shared solutions.
What We Discussed at JURIX
Presenting this work at JURIX was particularly meaningful because the audience sits at the intersection of law, AI, and knowledge representation. The discussions went beyond whether a taxonomy is useful (there was broad agreement that it is) and focused instead on how to make it actionable.
Three themes stood out.
1. Tasks as the Right Unit for AI Evaluation
One of the most productive conversations was about evaluation. Rather than asking whether an AI system is “good at legal help,” the taxonomy allows us to ask more precise questions:
Can this system accurately explain documents?
Can it safely calculate deadlines?
Can it help draft narratives without hallucinating facts?
This task-based framing makes it possible to benchmark AI systems honestly—recognizing that some tasks (like rewriting text) may be feasible with general-purpose models, while others (like eligibility determination or deadline calculation) require grounded, jurisdiction-specific data.
2. Usability Matters More Than Completeness
Another theme was usability. A taxonomy that is theoretically comprehensive but practically overwhelming will not be adopted.
At the workshop, we discussed:
staging tasks for review in manageable sections,
writing definitions in practitioner language,
allowing feedback and iteration,
and supporting partial adoption (teams don’t need to use every task at once).
The goal is not to impose a rigid structure, but to create a living, testable framework that practitioners recognize as reflecting their real work.
3. Interoperability and Shared Infrastructure
Finally, we discussed how a task taxonomy can serve as connective tissue between other standards—such as legal issue taxonomies, document schemas, and service directories.
By aligning tasks with standards like LIST, Akoma Ntoso, and SALI, the taxonomy can support interoperability across tools and datasets. This is especially important for AI development: shared task definitions make it easier to reuse data, compare results, and avoid duplicating effort.
What Comes Next
The taxonomy presented at JURIX is not the final word. It is a proposal—one that is now moving toward publication and broader validation.
Next steps include:
structured review by legal help professionals,
refinement based on feedback,
use in AI evaluation benchmarks,
and integration into JusticeBench as a shared research resource.
Ultimately, the aim is simple but ambitious: to make legal help work visible, describable, and improvable.
If we want AI to genuinely advance access to justice—rather than add confusion or risk—we need to start by naming the work it is meant to support. This task taxonomy is one step toward that clarity.
The Stanford Legal Design Lab hosted its second annual AI & Access to Justice Summit as a gathering for leaders from legal aid organizations, technology companies, academia, philanthropists, and private practice. This diverse assembly of professionals gathered to discuss the potential of generative AI, and — most crucially at this moment of Autumn 2025 — to strategize about how to make AI work at scale to address the justice gap.
The summit’s mission was clear: to move beyond the hype cycle and forge a concrete path forward for a sustainable AI & A2J ecosystem across the US and beyond. The central question posed was how the legal community could work as an ecosystem to harness this technology, setting an agenda for 2, 5, and 10-year horizons to create applications, infrastructure, and new service/business models that can get more people access to justice.
The Arc of the Summit
The summit was structured over 2 days to help the diverse participants learn about AI tools, pilots, case studies, and lessons learned for legal teams — and then giving the participants the opportunity to design new interventions and strategies for a stronger AI R&D ecosystem.
Day 1 was dedicated to learning and inspiration, featuring a comprehensive slate of speakers who presented hands-on demonstrations of cutting-edge AI tools, shared detailed case studies of successful pilots, and offered insights from the front lines of legal tech innovation.
Day 1 -> Day 2’s mission
Day 2 was designed to shift the focus from listening to doing, challenging attendees to synthesize the previous day’s knowledge into strategic designs, collaborative agendas, and new partnerships. This structure was designed to build a shared foundation of knowledge before embarking on the collaborative work of building the future.
The Summit began by equipping attendees with a new arsenal of technological capabilities, showcasing the tools that serve as the building blocks for this new era in justice.
Our Key AI + A2J Ecosystem Moment
The key theme of this year’s AI+A2J Summit is building a strong, coordinated R&D ecosystem. This is because our community of legal help providers, researchers, public interest tech-builders, and strategists are at a key moment.
It’s been over 3 years now since the launch of ChatGPT. Where are we going with AI in access to justice?
We are several years into the LLM era now — past the first wave of surprise, demos, and hype — and into the phase where real institutions are deciding what to do with these tools. People are already using AI to solve problems in their everyday lives, including legal problems, whether courts and legal aid organizations are ready or not. That means the “AI moment” is no longer hypothetical: it’s shaping expectations, workflows, and trust right now. But still many justice leaders are confused, overwhelmed, or unsure about how to get to positive impact in this new AI era.
Leaders are not sure how to make progress.
This is exactly why an AI+A2J Summit like this matters. We’re at a pivot point where the field can either coordinate and build durable public-interest infrastructure — or fragment into disconnected experiments that don’t translate into meaningful service capacity. A2J leaders are balancing urgency with caution, and the choices made in the next year or two will set patterns that could last a decade: what gets adopted, what gets regulated, what gets trusted, and what gets abandoned.
What will 2030 look like for A2J?
We have possible rosy futures and we have more devastating ones.
Which of these possible near futures will we have in 2030 for access to justice?
A robust, accessible marketplace of services — where everyone having a problem with their landlord, debt collector, spouse, employer, neighbor, or government can easily get the help they need in the form they want?
Or will we have a hugely underserved public, that’s frustrated and angry, facing an ever-growing asymmetry of robo-filed lawsuits and relying on low-quality AI help?
What is stopping great innovation imapact?
Some of the key things that could stop our community from delivering great outcomes in the next five years include a few big trends:
too much chilling regulation,
under-performing and -safety tested solutions that lead to bad harms and headlines,
not enough money flowing to get to solutions, everyone reinventing the wheel on their own and deliverting fragile and costly local solutions, and
a lack of a building substantive, meaninful solutions — instead focusing on small, peripheral tasks.
The primary barriers are not just technical — they’re operational, institutional, and human. Legal organizations need tools that are reliable enough to use with real people, real deadlines, and real consequences. But today, many pilots struggle with consistency, integration into daily workflows, and the basic “plumbing” that makes technology usable at scale: identity management, knowledge management, access controls, and clear accountability when something goes wrong.
Trust is also fragile in high-stakes settings, and the cost of a failure is unusually high. A single under-tested tool can create public harm, undermine confidence internally, and trigger an overcorrection that chills innovation. In parallel, many organizations are already stretched thin and running on complex legacy systems. Without shared standards, shared evaluation, and shared implementation support, the burden of “doing AI responsibly” becomes too heavy for individual teams to carry alone.
At the Summit, we worked on 3 different strategy levels to try to prevent these blocks from pushing us to low impact or a continued status quo.
3 Levels of Strategic Work to Set us towards a Good Ecosystem
The goal of the Summit was to get leaders from across the A2J world to clearly define 3 levels of strategy. That means going beyond the usual strategic track — which is just defining the policies and tech agenda for their internal organization.
This meant focusing on both project mode (what are cool ideas and use cases) and also strategy mode — so we can shape where this goes, rather than react to whatever the market and technology delivers. We’re convening people who are already experimenting with AI in courts, legal aid, libraries, and community justice organizations, and we’re asking them to step back and make intentional choices about what they will build, buy, govern, and measure over the next 12–24 months. The point is to move from isolated pilots to durable capacity: tools that can be trusted, maintained, and integrated into real workflows, with clear guardrails for privacy, security, and quality.
To do that, the Summit is designed to push work at three linked levels of strategy.
The 3 levels of straegy
Strategy Level 1: Internal Org Strategy around AI
First is internal, organizational strategy: what each institution needs to do internally — data governance, procurement standards, evaluation protocols, staff training, change management, and the operational “plumbing” that makes AI usable and safe.
Strategy 2: Ecosystem Strategy
Second is ecosystem strategy, that covers how different A2J organizations can collaborate to increase capacity and impact.
Thinking through an Ecosystem approach to share capacity and improve outcomes
This can scope out what we should build together — shared playbooks, common evaluation and certification approaches, interoperable data and knowledge standards, and shared infrastructure that prevents every jurisdiction from reinventing fragile, costly solutions.
Strategy 3: Towards Big Tech & A2J
Third is strategy vis-à-vis big tech: how the justice community can engage major AI platform providers with clear expectations and leverage — so the next wave of product decisions, safety defaults, partnerships, and pricing structures actually support access to justice rather than widen gaps.
As more people and providers go to Big Tech for their answers and development work, how do we get to better A2J impact and outcomes?
The Summit is ultimately about making a coordinated, public-interest plan now — so that by 2030 we have a legal help ecosystem that is more trustworthy, more usable, more interoperable, and able to serve far more people with far less friction.
The Modern A2J Toolbox: A Growing set of AI-Powered Solutions
Equipping justice professionals with the right technology is a cornerstone of modernizing access to justice. The Summit provided a tour of AI tools available to the community, ranging from comprehensive legal platforms designed for large-scale litigation to custom-built solutions tailored for specific legal aid workflows. This tour of the growing AI toolbox revealed an expanding arsenal of capabilities designed to augment legal work, streamline processes, and extend the reach of legal services.
Research & Case Management Assistants
Teams from many different AI and legal tech teams presented their solutions and explained how they can be used to expand access to justice.
Notebook LM: The Notebook LM tool from Google empowers users to create intelligent digital notebooks from their case files and documents. Its capabilities have been significantly enhanced, featuring an expanded context window of up to 1 million tokens, allowing it to digest and analyze vast amounts of information. The platform is fully multilingual, supporting over 100 languages for both queries and content generation. This enables it to generate a wide range of work products, from infographics and slide decks to narrated video overviews, making it a versatile tool for both internal analysis and client communication.
Harvey:Harvey is an AI platform built specifically for legal professionals, structured around three core components. The Assistant functions as a conversational interface for asking complex legal questions based on uploaded files and integrated research sources like LexisNexis. The Vault serves as a secure repository for case documents, enabling deep analysis across up to 10,000 different documents at once. Finally, Workflows provide one-click solutions for common, repeatable tasks like building case timelines or translating documents, with the ability for organizations to create and embed their own custom playbooks.
Thomson Reuters’ CoCounsel: CoCounsel is designed to leverage an organization’s complete universe of information — from its own internal data and knowledge management systems to the primary law available through Westlaw. This comprehensive integration allows it to automate and assist with tasks across the entire client representation lifecycle, from initial intake and case assessment to legal research and discovery preparation. The platform is built to function like a human colleague, capable of pulling together disparate information sources to efficiently construct the building blocks of legal practice. TR also has an AI for Justice program that leverages CoCounsel and its team to help legal aid organizations.
VLex’s Vincent AI:Vincent AI adopts a workflow-based approach to legal tasks, offering dedicated modules for legal research, contract analysis, complaint review, and large-scale document review. Its design is particularly user-friendly for those with “prompting anxiety,” as it can automatically analyze an uploaded document (such as a lease or complaint) and suggest relevant next steps and analyses. A key feature is its ability to process not just text but also audio and video content, opening up powerful applications for tasks like analyzing client intake calls or video interviews to rapidly identify key issues.
AI on Case Management & E-Discovery Platforms
Legal Server: As a long-standing case management system, Legal Server has introduced an AI assistant named “Ellis.” The platform’s core approach to AI is rooted in data privacy and relevance. Rather than drawing on the open internet, Ellis is trained exclusively on an individual client organization’s own isolated data repository, including its help documentation, case notes, and internal documents. This ensures that answers are grounded in the organization’s specific context and expertise while maintaining strict client confidentiality.
Relativity:Relativity’s e-discovery platform is made available to justice-focused organizations through its “Justice for Change” program. The platform includes powerful generative AI features like AIR for Review, which can analyze hundreds of thousands of documents to identify key people, terms, and events in an investigation. It also features integrated translation tools that support over 100 languages, including right-to-left languages like Hebrew, allowing legal teams to seamlessly work with multilingual case documents within a single, secure environment.
These tools represent a leap in technological capability. They all show the growing ability for AI to help legal teams synthesize info, work with documents, conduct research, produce key work product, and automate workflows. But how do we go from tech tools to real-world impact, solutions that are deployed at scale and get to high performance numbers? The Summit moved from tech demos to case studies to get to accounts of how to get to value and impact.
From Pilots to Impact: AI in Action Across the Justice Sector
In the second half of Day 1, the Summit moved beyond product demonstrations to showcase a series of compelling case studies from across the justice sector. These presentations offered proof points of how organizations are already leveraging AI to serve more people, improve service quality, and create new efficiencies, delivering concrete value to their clients and communities today.
Legal Aid Society of Middle Tennessee & The Cumberlands — Automating Expungement Petitions:The “ExpungeMate” project was created to tackle the manual, time-consuming process of reviewing criminal records and preparing expungement petitions. By building a custom GPT to analyze records and an automated workflow to generate the necessary legal forms, the organization dramatically transformed its expungement clinics. At a single event, their output surged from 70 expungements to 751. This newfound efficiency freed up attorneys to provide holistic advice and enabled a more comprehensive service model that brought judges, district attorneys, and clerks on-site to reinstate driver’s licenses and waive court debt in real-time.
Citizens Advice (UK) — Empowering Advisors with Caddy: Citizens Advice developed Caddy (Citizens Advice Digital Assistant), an internal chatbot designed to support its network of advisors, particularly new trainees. Caddy uses a Retrieval-Augmented Generation (RAG), a method that grounds the AI’s answers in a private, trusted knowledge base to ensure accuracy and prevent hallucination. A key feature is its “human-in-the-loop” workflow, where supervisors can quickly validate answers before they are given to clients. A six-week trial demonstrated significant impact, with the evaluation found that Caddy halved the response time for advisors seeking supervisory support, unlocking capacity to help thousands more people.
Frontline Justice — Supercharging Community Justice Workers To support its network of non-lawyer “justice workers” in Alaska, Frontline Justice deployed an AI tool designed not just as a Q&A bot, but as a peer-to-peer knowledge hub. While the AI provides initial, reliable answers to legal questions, the system empowers senior justice workers to review, edit, and enrich these answers with practical, on-the-ground knowledge like local phone numbers or helpful infographics. This creates a dynamic, collaborative knowledge base where the expertise of one experienced worker in a remote village can be instantly shared with over 200 volunteers across the state.
Lone Star Legal Aid — Building a Secure Chatbot Ecosystem Lone Star Legal Aid embarked on an ambitious in-house project to build three distinct chatbots on a secure RAG architecture to serve different user groups. One internal bot, LSLAsks, is for administrative information in their legal aid group. Their internal bot for legal staff, Juris, was designed to centralize legal knowledge and defeat the administrative burden of research. A core part of their strategy involved rigorous A/B testing of four different search models (cleverly named after the Ninja Turtles) to meticulously measure accuracy, relevancy, and speed, with the ultimate goal of eliminating hallucinations and building user trust in the system.
People’s Law School (British Columbia) — Ensuring Quality in Public-Facing AI The team behind the public-facing Beagle+ chatbot shared their journey of ensuring high-quality, reliable answers for the public. Their development process involved intensive pre- and post-launch evaluation. Before launch, they used a 42-question dataset of real-world legal questions to test different models and prompts until they achieved 99% accuracy. After launch, a team of lawyers reviewed every single one of the first 5,400 conversations to score them for safety and value, using the findings to continuously refine the system and maintain its high standard of quality.
These successful implementations offered more than just inspiration; they surfaced a series of critical strategic debates that the entire access to justice community must now navigate.
Lessons Learned and Practical Strategies from the First Generation of AI+A2J Work
A consistent “lesson learned” from Day 1 was that legal aid AI only works when it’s treated as mission infrastructure, not as a cool add-on. Leaders emphasized values as practical guardrails: put people first (staff + clients), keep the main thing the main thing (serving clients), and plan for the long term — especially because large legal aid organizations are “big ships” that can’t pivot overnight.
Smart choice of projects: In practice, that means choosing projects that reduce friction in frontline work, don’t distract from service delivery, and can be sustained after the initial burst of experimentation.
An ecosystem of specific solutions: On the build side, teams stressed scoping and architecture choices that intentionally reduce risk. One practical pattern was a “one tool = one problem” approach, with different bots for different users and workflows (internal legal research, internal admin FAQs, and client-facing triage) rather than trying to make a single chatbot do everything.
Building for Security & Privacy forward solutions: Security and privacy were treated as design requirements, not compliance afterthoughts — e.g., selecting an enterprise cloud environment already inside the organization’s security perimeter and choosing retrieval-augmented generation (RAG) to keep answers grounded in verified sources.
Keeping Knowledge Fresh: Teams also described curating the knowledge base (black-letter law + SME guidance) and setting a maintenance cadence so the sources stay trustworthy over time.
Figure out What You’re Measuring & How: On evaluation, Day 1 emphasized that “accuracy” isn’t a vibe — you have to measure it, iterate, and keep monitoring after launch. Practical approaches included: (1) building a small but meaningful test set from real questions, (2) defining what an “ideal answer” must include, and (3) scoring outputs on safety and value across model/prompt/RAG variations.
Teams also used internal testing with non-developer legal staff to ask real workflow questions, paired with lightweight feedback mechanisms (thumbs up/down + reason codes) and operational metrics like citations used, speed, and cost per question. A key implementation insight was that some “AI errors” are actually content errors — post-launch quality improved by fixing source content (even single missing words) and tightening prompts, supported by ongoing monitoring.
Be Ready with Policies & Governance: On deployment governance, teams highlighted a bias toward containment, transparency, and safe failure modes. One practical RAG pattern: show citations down to the page/section, display the excerpt used, and if the system can’t answer from the verified corpus, it should say so — explicitly.
There was also a clear warning about emerging security risks (especially prompt injection and attack surfaces when tools start browsing or pulling from the open internet) and the need to think about cybersecurity as capability scales from pilots to broader use. Teams described practical access controls (like 2FA) and “shareable internal agents” as ways to grow adoption without losing governance.
Be Ready for Data Access Blocks: Several Day 1 discussions surfaced the external blockers that legal aid teams can’t solve alone — especially data access and interoperability with courts and other systems.
Even when internal workflows are ready, teams run into constraints like restrictions on scraping or fragmented, jurisdiction-specific data practices, which makes replication harder and increases costs for every new deployment. That’s one reason the “lessons learned” kept circling back to shared infrastructure: common patterns for grounded knowledge, testing protocols, security hardening, and the data pathways needed to make these tools reliable in day-to-day legal work.
Strategic Crossroads: Key Debates Shaping the Future of the AI+A2J Ecosystem
The proliferation of AI has brought the access to justice community to a strategic crossroads. The Summit revealed that organizations are grappling with fundamental decisions about how to acquire, build, and deploy this technology. The choices made in the coming years will define the technological landscape of the sector, determining the cost, accessibility, and control that legal aid organizations have over their digital futures.
The Build vs. Buy Dilemma
A central tension emerged between building custom solutions and purchasing sophisticated off-the-shelf platforms. We might end up with a ‘yes and’ approach, that involves both.
The Case for Building:
Organizations like Maryland Legal Aid and Lone Star Legal Aid are pursuing an in-house development path. This is not just a cost-and-security decision but a strategic choice about building organizational capacity.
The primary drivers are significantly lower long-term costs — Maryland Legal Aid reported running their custom platform for their entire staff for less than $100 per month — and enhanced data security and privacy, achieved through direct control over the tech stack and zero-data-retention agreements with API providers.
Building allows for the precise tailoring of tools to unique organizational workflows and empowers staff to become creators.
The Case for Buying:
Conversely, presentations from Relativity, Harvey, Thomson Reuters, vLex/Clio, and others showcased the immense power of professionally developed, pre-built platforms. The argument for buying centers on leveraging cutting-edge technology and complex features without the significant upfront investment in hiring and maintaining an in-house development team.
This path offers immediate access to powerful tools for organizations that lack the capacity or desire to become software developers themselves.
Centralized Expertise vs. Empowered End-Users
A parallel debate surfaced around who should be building AI applications. The traditional model, exemplified by Lone Star Legal Aid, involves a specialized technical team that designs and develops tools for the rest of the organization.
In contrast, Maryland Legal Aid presented a more democratized vision, empowering tech-curious attorneys and paralegals to engage in “vibe coding.”
This approach envisions non-technical staff becoming software creators themselves, using new, user-friendly AI development tools to rapidly build and deploy solutions. It transforms end-users into innovators, allowing legal aid organizations to “start solving their own problems” fast, cheaply, and in-house.
Navigating the Role of Big Tech in Justice Services
The summit highlighted the inescapable and growing role of major technology companies in the justice space. The debate here centers on the nature of the engagement.
One path involves close collaboration, such as licensing tools like Notebook LM from Google or leveraging APIs from OpenAI to power custom applications.
The alternative is a more cautious approach that prioritizes advocacy for regulation, taxation and licensing legal orgs’ knowledge and tools, and the implementation of robust public interest protections to ensure that the deployment of large-scale AI serves, rather than harms, the public good.
These strategic debates are shaping the immediate future of legal technology, but the summit also issued a more profound challenge: to use this moment not just to optimize existing processes, but to reimagine the very foundations of justice itself.
AI Beyond Automation: Reimagining the Fundamentals of the Justice System
The conversation at the summit elevated from simply making the existing justice system more efficient to fundamentally transforming it for a new era.
In a thought-provoking remote address, Professor Richard Susskind challenged attendees to look beyond the immediate applications of AI and consider how it could reshape the core principles of dispute resolution and legal help. This forward-looking perspective urged the community to avoid merely automating the past and instead use technology to design a more accessible, preventative, and outcome-focused system of justice.
The Automation Fallacy
Susskind warned against what he termed “technological myopia” — the tendency to view new technology only through the lens of automating existing tasks. He argued that simply replacing human lawyers with AI to perform the same work is an uninspired goal. Using a powerful analogy, he urged the legal community to avoid focusing on the equivalent of “robotic surgery” (perfecting an old process) and instead seek out the legal equivalents of “non-invasive therapy” and “preventative medicine” — entirely new, more effective ways to achieve just outcomes.
Focusing Upstream
This call to action was echoed in a broader directive to shift focus from downstream dispute resolution to upstream interventions. The goal is to leverage technology and data not just to manage conflicts once they arise, but to prevent them from escalating in the first place. This concept was vividly captured by Susskind’s metaphor of a society that is better served by “putting a fence at the top of the cliff rather than an ambulance at the bottom.”
The Future of Dispute Resolution
Susskind posed the provocative question, “Can AI replace judges?” but quickly reframed it to be more productive. Instead of asking if a machine can replicate a human judge, he argued the focus should be on outcomes: can AI systems generate reliable legal determinations with reasons?
He envisioned a future, perhaps by 2030, where citizens might prefer state-supported, AI-underpinned dispute services over traditional courts. In this vision, parties could submit their evidence and arguments to a “comfortingly branded” AI system that could cheaply, cheerfully, and immediately deliver a conclusion, transforming the speed and accessibility of justice.
Achieving such ambitious, long-term visions requires more than just technological breakthroughs; it demands the creation of a practical, collaborative infrastructure to build and sustain this new future.
Building Funding and Capacity for this Work
On the panel about building a National AI + A2J ecosystem, panelists discussed how to increase capacity and impact in this space.
The Need to Make this Space Legible as a Market
The panel framed the “economics” conversation as a market-making challenge: if we want new tech to actually scale in access to justice, we have to make the space legible — not just inspiring. There could be a clearer market for navigation tech in low-income “fork-in-the-road” moments. The panel highlighted that the nascent ecosystem needs three things to become investable and durable:
clearly defined problems,
shared infrastructure that makes building and scaling easier, and
business models that sustain products over time.
A key through-line in the panel’s commentary was: we can’t pretend grant funding alone will carry the next decade of AI+A2J delivery. Panelists suggested we need experimentation to find new payers — for example, employer-funded benefits and EAP dollars, or insurer/health-adjacent funding tied to social determinants of health — paired with stronger evidence that tools improve outcomes. This is connected to the need for shared benchmarks and evaluation methods that can influence how developers build and how funders (and institutions) decide what to back.
A Warning Not to Build New Tech on Bad Processes
The panel also brought a grounding reality check: even the best tech will underperform — or do harm — if it’s layered onto broken processes. Tech projects where tech sat on top of high-default systems contributed to worse outcomes.
The economic implication was clear: funders and institutions should pay for process repair and procedural barrier removal as seriously as they pay for new tools, because the ROI of AI depends on the underlying system actually functioning.
The Role of Impact Investing as a new source of capital
Building this ecosystem requires a new approach to funding. Kate Fazio framed the justice gap as a fundamental “market failure” in the realm of “people law” — the everyday legal problems faced by individuals. She argued that the two traditional sources of capital are insufficient to solve this failure: traditional venture capital is misaligned, seeking massive returns that “people law” cannot generate, while philanthropy is vital but chronically resource-constrained.
The missing piece, Fazio argued, isimpact investing: a form of patient, flexible capital that seeks to generate both a measurable social impact and a financial return. This provides a crucial middle ground for funding sustainable, scalable models that may not offer explosive growth but can create enormous social value. But she highlighted a stark reality: of the 17 UN Sustainable Development Goals, Goal 16 (Peace, Justice, and Strong Institutions) currently receives almost no impact investment capital. This presents both a monumental challenge and a massive opportunity for the A2J community to articulate its value and attract a new, powerful source of funding to build the future of justice.
This talk of new capital, market-making, and funding strategies started to point the group to a clear strategic imperative. To overcome the risk of fragmented pilots and siloed innovation, the A2J community must start coalescing into a coherent ecosystem. This means embracing collaborative infrastructure, which can be hand-in-hand with attracting new forms of capital.
By reframing the “market failure” in people law as a generational opportunity for impact investing, the sector can secure the sustainable funding needed to scale the transformative, preventative, and outcome-focused systems of justice envisioned throughout the summit.
Forging an AI+A2J Ecosystem: The Path to Sustainable Scale and Impact
On Day 2, we challenged groups to envision how to build a strong AI and A2J development, evaluation, and market ecosystem. They came up with so many ideas, and we try to capture them below. Much of it is about having common infrastructure, shared capacity, and better ways to strengthen and share organic DIY AI tools.
A significant risk facing the A2J community is fragmentation, a scenario where “a thousand pilots bloom” but ultimately fail to create lasting, widespread change because efforts are siloed and unsustainable. The summit issued a clear call to counter this risk by adopting a collaborative ecosystem approach.
The working groups on Day 2 highlighted some of the key things that our community can work on, to build a stronger and more successful A2J provider ecosystem. This infrastructure-centered strategy emphasizes sharing knowledge, resources, and infrastructure to ensure that innovations are not only successful in isolation but can be sustained, scaled, and adapted across the entire sector.
Throughout the summit, presenters and participants highlighted the essential capacities and infrastructure that individual organizations must develop to succeed with AI. Building these capabilities in every single organization is inefficient and unrealistic. An ecosystem approach recognizes the need for shared infrastructure, including the playbooks, knowledge/data standards, privacy and security tooling, evaluation and certification, and more.
Replicable Playbooks to Prevent Parallel Duplication
Many groups in the Summit called for replicable solutions playbooks, that go beyond sharing repositories on Github and making conference presentations, and getting to the teams and resources that can help more legal teams replicate successful AI solutions and localize them to their jurisdiction and organization.
A2J organizations don’t just need inspiration — they need proven patterns they can adopt with confidence. Replicable “how-tos” turn isolated success stories into field-level capability: how to scope a use case, how to choose a model approach, how to design a safe workflow, how to test and monitor performance, and how to roll out tools to staff without creating chaos. These playbooks reduce the cost of learning, lower risk, and help organizations move from pilots to sustained operations.
Replicable guidance also helps prevent duplication. Right now, too many teams are solving the same early-stage problems in parallel: procurement questions, privacy questions, evaluation questions, prompt and retrieval design, and governance questions. If the field can agree on shared building blocks and publish them in usable formats, innovation becomes cumulative — each new project building on the last instead of starting over.
A Common Agenda of What Tasks-Issues to Build Solutions for
Without a shared agenda, the field risks drifting into fragmentation: dozens of pilots, dozens of platforms, and no cumulative progress. A common agenda does not mean one centralized solution — it means alignment on what must be built together, what must be measured, and what must be stewarded over time. It creates shared language, shared priorities, and shared accountability across courts, legal aid, community organizations, researchers, funders, and vendors.
This is the core reason the Legal Design Lab held the Summit: to convene the people who can shape that shared agenda and to produce a practical roadmap that others can adopt. The goal is to protect this moment from predictable failure modes — over-chill, backlash, duplication, and under-maintained tools — and instead create an ecosystem where responsible innovation compounds, trust grows, and more people get real legal help when they need it.
Evaluation Protocols and Certifications
Groups also called for more, easier evaluation and certification. They want high-quality, standardized methods for evaluation, testing, and long-term maintenance.
In high-stakes legal settings, “seems good” is not good enough. The field needs clear definitions of quality and safety, and credible evaluation protocols that different organizations can use consistently. This doesn’t mean one rigid standard for every tool — but it does mean shared expectations: what must be tested, what must be logged, what harms must be monitored, and what “good enough” looks like for different risk levels.
Certification — or at least standard conformance levels — can also shift the market. If courts and legal aid groups can point to transparent evaluation and safety practices, then vendors and internal builders alike have a clear target. That reduces fear-driven overreaction and replaces it with evidence-driven decision-making. Over time, it supports responsible procurement, encourages better products, and protects the public by making safety and accountability visible.
In addition, creating legal benchmarksfor the most common & significant legal tasks can push LLM developers to improve their foundational models for justice use cases
Practical, Clear Privacy Protections
A block for many of the possible solutions is the safe use of AI with highly confidential, risky data. Privacy is not a footnote in A2J — it is the precondition for using AI with real people. Many of the highest-value workflows involve sensitive information: housing instability, family safety, immigration status, disability, finances, or criminal history. If legal teams cannot confidently protect client data, they will either avoid the tools entirely or use them in risky ways that expose clients and organizations to harm.
What is needed is privacy-by-design infrastructure: clear rules for data handling, retention, and access; secure deployment patterns; strong vendor contract terms; and practical training for staff about what can and cannot be used in which tools. The Summit is a place to align on what “acceptable privacy posture” should look like across the ecosystem — so privacy does not become an innovation-killer, and innovation does not become a privacy risk.
More cybersecurity, testing, reliability engineering, and ongoing monitoring
Along with privacy risks, participants noted that many of the organic, DIY solutions are not prepared for cybersecurity risks. As AI tools become embedded in legal workflows, they become targets — both for accidental failures and deliberate attacks. Prompt injection, data leakage, insecure integrations, and overbroad permissions can turn a helpful tool into a security incident. And reliability matters just as much as brilliance: a tool that works 80% of the time may still be unusable in high-stakes practice if the failures are unpredictable.
The field needs a stronger norm of “safety engineering”: threat modeling, red-teaming, testing protocols, incident response plans, and ongoing monitoring after deployment. This is also where shared infrastructure helps most. Individual organizations should not each have to invent cybersecurity practices for AI from scratch. A common set of testing and security baselines would let innovators move faster while reducing systemic risk.
Inter-Agency/Court Data Connections
Many groups need to call up and work with data from other agencies — like court docket files and records, other legal aid groups’ data, and more — in order to get highly effective, AI-powered workflows
Participants called for more standards and data contracts that can facilitate systematic data access, collection, and preparation. Many of the biggest A2J bottlenecks are not about “knowing the law” — they’re about navigating fragmented systems. People have to repeat their story across multiple offices, programs, and portals. Providers can’t see what happened earlier in the journey. Courts don’t receive information in consistent, structured ways. The result is duplication, delay, and drop-off — exactly where AI could help, but only if the data ecosystem supports it.
Many of the biggest A2J bottlenecks are not about “knowing the law” — they’re about navigating fragmented systems. People have to repeat their story across multiple offices, programs, and portals. Providers can’t see what happened earlier in the journey. Courts don’t receive information in consistent, structured ways. The result is duplication, delay, and drop-off — exactly where AI could help, but only if the data ecosystem supports it.
Data Contracts for Interoperable Knowledge Bases
Many local innovators are starting to build out structured, authoritative knowledge on court procedure, forms and documents, strategies, legal authorities, service directories, and more. This knowledge data is built to power their local legal AI solutions, but right now it is stored and saved in unique local ways.
This investment in local authoritative legal knowledge bases makes sense. LLMs are powerful, but they are not a substitute for authoritative, maintainable legal knowledge. The most dependable AI systems in legal help will be grounded in structured knowledge: jurisdiction-specific procedures, deadlines, forms, filing rules, court locations, service directories, eligibility rules, and “what happens next” pathways.
But the worry among participants is that all of these highly localized knowledge bases will be one-off for a specific org or solution. Ideally, when teams are investing in building these local knowledge bases, it can follow some key standard rules so it can perform well and it can be updated, audited, and reused across tools and regions.
This is why knowledge bases and data exchanges are central to the ecosystem approach. Instead of each organization maintaining its own isolated universe of content, we can build shared registries and common schemas that allow local control while enabling cross-jurisdiction learning and reuse. The aim is not uniformity for its own sake — it’s reliability, maintainability, and the ability to scale help without scaling confusion.
More training and change management so legal teams are ready
Even the best tools fail if people don’t adopt them in real workflows. Legal organizations are human systems with deeply embedded habits, risk cultures, and informal processes. Training and change management are not “nice to have” — they determine whether AI becomes a daily capability or a novelty used by a handful of early adopters.
What’s needed is practical, role-based readiness support: training for leadership on governance and procurement, training for frontline staff on safe use and workflow integration, and support for managers who must redesign processes and measure outcomes. The Summit is a step toward building a shared approach to readiness — so the field can absorb change without burnout, fragmentation, or loss of trust.
Building Capability & Lowering Costs of Development/Use
One of the biggest barriers to AI-A2J impact is that the “real” version of these tools — secure deployments, quality evaluation, integration into existing systems, and sustained maintenance — can be unaffordable when each court or legal aid organization tries to do it alone. The result is a familiar pattern: a few well-resourced organizations build impressive pilots, while most teams remain stuck with limited access, short-term experiments, or tools that can’t safely touch real client data.
Coordination is the way out of this trap. When the field aligns on shared priorities and shared building blocks, we reduce duplication and shift spending away from reinventing the same foundational components toward improving what actually matters for service delivery.
Through coordination, the ecosystem can also change the economics of AI itself. Shared evaluation protocols, reference architectures, and standard data contracts mean vendors and platform providers can build once and serve many — lowering per-organization cost and making procurement less risky. Collective demand can also create better terms: pooled negotiation for pricing, clearer requirements for privacy/security, and shared expectations about model behavior and transparency.
Just as importantly, coordinated open infrastructure — structured knowledge bases, service directories, and interoperable intake/referral data — reduces reliance on expensive bespoke systems by making high-value components reusable across jurisdictions.
The goal is not uniformity, but a commons: a set of shared standards and assets that makes safe, high-quality AI deployment feasible for the median organization, not just the best-funded one.
Conclusion
The AI + Access to Justice Summit is designed as a yearly convening point — because this work can’t be finished in a single event. Each year, we’ll take stock of what’s changing in the technology, what’s working on the ground in courts and legal aid, and where the biggest gaps remain. More importantly, we’ll use the Summit to move from discussion to shared commitments: clearer priorities, stronger relationships across the ecosystem, and concrete next steps that participants can carry back into their organizations and collaborations.
We are also building the Summit as a launchpad for follow-through. In the months after convening, we will work with participants to continue progress on common infrastructure: evaluation and safety protocols, privacy and security patterns, interoperable knowledge and data standards, and practical implementation playbooks that make adoption feasible across diverse jurisdictions. The aim is to make innovation cumulative — so promising work does not remain isolated in a single pilot site, but becomes reusable and improvable across the field.
We are deeply grateful to the sponsors who made this convening possible, and to the speakers who shared lessons, hard-won insights, and real examples from the frontlines.
Most of all, thank you to the participants — justice leaders, technologists, researchers, funders, and community partners — who showed up ready to collaborate, challenge assumptions, and build something larger than any single organization can create alone. Your energy and seriousness are exactly what this moment demands, and we’re excited to keep working together toward a better 2030.
The Stanford Legal Design Lab is so happy to be a sponsoring co-host of the third consecutive AI and Access to Justice workshop at the JURIX conference. This round, the conference is at Turin, Italy in December 2025. The theme is AI, Dispute Resolution, and Access to Justice.
Data issues related to access to justice (building reusable, sharable datasets for research)
AI for access to justicegenerally
AI for dispute resolution
The provisional schedule is as follows:
Session 1 – Interfaces and Knowledge Tools
LegalWebAgent: Empowering Access to Justice via LLM-Based Web Agents CourtPressGER: A German Court Decision to Press Release Summarization Dataset A Voice-First AI Service for People-Centred Justice in Niger Designing Clarity with AI: Improving the Usability of Case Law Databases of International Courts CaseConnect: Cross-Lingual Legal Case Retrieval with Semantic Embeddings and Structure-Aware Segmentation Glitter: Visualizing Lexical Surprisal for Readability in Administrative Texts
Session 2 – Global AI for Legal Help, Prediction and Dispute Resolution
Understanding Rights Through AI: The Role of Legal Chatbots in Access to Justice The Private Family Forecast: A Predictive Method for an Effective and Informed Access to Justice Artificial Intelligence Enabled Justice Tools for Refugees in Tanzania Artificial Intelligence and Access to Justice in Chile AI and Judicial Transformation: Comparative Analysis of Predictive Tools in EU Labour Law From Textual Simplification to Epistemic Justice: Rethinking Digital Dispute Resolution Through AI
Session 3 – Workflows, Frameworks and Governance of Legal AI PLDF – A Private Legal Declarative Document Generation Framework How the ECtHR Frames Artificial Intelligence: A Distant Reading Analysis What Legal Help Teams and Consumers Actually Do: A Legal Help Task Taxonomy Packaging Thematic Analysis as an AI Workflow for Legal Research Gender Bias in LLMs: Preliminary Evidence from Shared Parenting Scenario in Czech Family Law AI-Powered Resentencing: Bridging California’s Second-Chance Gap AI Assistance for Court Review of Default Judgments
Interactive Workshop – Global Legal Data Availability
The Stanford Legal Design Lab hosted the second annual AI and Access to Justice Summit on November 20-21, 2025. Over 150 legal professionals, technologists, regulators, strategists, and funders came together to tackle one big question: how can we build a strong, sustainable national/international AI and Access to Justice Ecosystem?
We will be synthesizing all of the presentations, feedback, proposals and discussions into a report that lays out:
The current toolbox that legal help teams and users can be employing to accomplish key legal tasks like Q&A, triage and referrals, conducting intake interviews, drafting documents, doing legal research, reviewing draft documents, and more.
The strategies, practical steps, and methods with which to design, develop, evaluate, and maintain AI so that it is valuable, safe, and affordable.
Exemplary case studies of what AI solutions are being built, how they are being implemented in new service and business models, and how they might be scaled or replicated.
An agenda of how to encourage more coordination of AI technology, evaluation, and capability-building, so that successful solutions can be available to as many legal teams and users as possible — and have the largest positive impact on people’s housing, financial, family, and general stability.
Thank you to all of our speakers, participants, and sponsors!
If AI is going to advance access to justice rather than deepen the justice gap, the public-interest legal field needs more than speculation and pilots — we need statewide stewardship.
2 missions of an AI steward, for a state’s legal help service provider community
We need specific people and institutions in every state who wake up each morning responsible for two things:
AI readiness and vision for the legal services ecosystem: getting organizations knowledgeable, specific, and proactive about where AI can responsibly improve outcomes for people with legal problems — and improve the performance of services. This can ensure the intelligent and impactful adoption of AI solutions as they are developed.
AI R&D encouragement and alignment: getting vendors, builders, researchers, and benchmark makers on the same page about concrete needs; matchmaking them with real service teams; guiding, funding, evaluating, and communicating so the right tools get built and adopted.
Ideally, these local state stewards will be talking with each other regularly. In this way, there can be federated research & development of AI solutions for legal service providers and the public struggling with legal problems.
This essay outlines what AI + Access to Justice stewardship could look like in practice — who can play the role, how it works alongside court help centers and legal aid, and the concrete, near-term actions a steward can take to make AI useful, safe, and truly public-interest.
State stewards can help local legal providers — legal aid groups, court help centers, pro bono networks, and community justice workers — to set a clear vision for AI futures & help execute it.
Why stewardship — why now?
Every week, new tools promise to draft, translate, summarize, triage, and file. Meanwhile, most legal aid organizations and court help centers are still asking foundational questions: What’s safe? What’s high-value? What’s feasible with our staff and privacy rules? How do we avoid vendor lock-in? How do we keep equity and client dignity at the center?
Without stewardship, AI adoption will be fragmented, extractive, and inequitable. With stewardship, states can:
Focus AI where it demonstrably helps clients and staff. Prioritize tech based on community and provider stakeholders’ needs and preferences — not just what is being sold by vendors.
Prepare data and knowledge so tools work in the local contexts. Also, that they can be trained safely & benchmarked responsibly with relevant data that is masked and safe.
Align funders, vendors, and researchers around real service needs. So that all of these stakeholder groups, with their capacity to support, build, and evaluate emerging technology, direct this capacity at opportunities that are meaningful.
Develop shared evaluation and governance so we build trust, not backlash.
Who can play the Statewide AI Steward role?
“Steward” is a role, not a single job title. Different kinds of groups can carry it, depending on how your state is organized:
Access to Justice Commissions / Bar associations / Bar foundations that convene stakeholders, fund statewide initiatives, and set standards.
Legal Aid Executive Directors (or cross-org consortia) with authority to coordinate practice areas and operations.
Court innovation offices / judicial councils that lead technology, self-help, and rules-of-court implementations.
University labs / legal tech nonprofits that have capacity for research, evaluation, data stewardship, and product prototyping.
Regional collaboratives with a track record of shared infrastructure and implementation.
Any of these can steward. The common denominator: local trusted relationships, coordination power, and delivery focus.The steward must be able to convene local stakeholders, communicate with them, work with them on shared training and data efforts, and move from talk to action.
The steward’s two main missions
Mission 1: AI readiness + vision (inside the legal ecosystem)
The steward gets legal organizations — executive directors, supervising/managing attorneys, practice leads, intake supervisors, operations staff — knowledgeable and specific about where AI can responsibly improve outcomes. This means:
Translating AI into service-level opportunities (not vague “innovation”).
Running short, targeted training sessions for leaders and teams.
Co-designing workflow pilots with clear review and safety protocols.
Building a roadmap: which portfolios, which tools, what sequence, what KPIs.
Clarify ethical, privacy, and consumer/client safety priorities and strategies, to talk about risks and worries in specific, technically-informed ways that provide sufficient protection to users and orgs — and don’t fall into inaction because of ill-defined concern about risk.
The result: organizations are in charge of the change rather than passive recipients of vendor pitches or media narratives.
2) AI tech encouragement + alignment (across the supply side)
The steward gets the groups who specialize in building and evaluating technology — vendors, tech groups, university researchers, benchmarkers— pointed at the right problems with the right real-world partnerships:
Publishing needs briefs by portfolio (housing, reentry, debt, family, etc).
Matchmaking teams and vendors; structuring pilots with data, milestones, evaluation, and governance. Helping organizations choose a best-in-class vendor and then also manage this relationship with regular evaluation.
Contributing to benchmarks, datasets, and red-teaming so the field learns together. Build the infrastructure that can lead to effective, ongoing evaluation of how AI systems are performing.
Helping fund and scale what works; communicating results frankly. Ensuring that prototypes and pilots’ outcomes are shared to inform others of what they might adopt, or what changes must happen to the AI solutions for them to be adopted or scaled.
The result: useful and robust AI solutions built with frontline reality, evaluated transparently, and ready to adopt responsibly.
What Stewards Could Do Month-to-Month
I have been brainstorming specific actions that a statewide steward could do. Many of these actions could also be done in concert with a federated network of stewards.
Some of the things a state steward could do to advance responsible, impactful AI for Access to Justice in their region.
Map the State’s Ecosystem of Legal Help
Too often, we think in terms of organizations — “X Legal Aid,” “Y Court Help Center” — instead of understanding who’s doing the actual legal work.
Each state needs to start by identifying the legal teams operating within its borders.
Who is doing eviction defense?
Who helps people with no-fault divorce filings?
Who handles reasonable accommodation letters for tenants?
Who runs the reentry clinic or expungement help line?
Who offers debt relief letter assistance?
Who does restraining order help?
This means mapping not just legal help orgs, but service portfolios and delivery models. What are teams doing? What are they not doing? And what are the unmet legal needs that clients consistently face?
This is a service-level analysis — an inventory of the “market” of help provided and the legal needs not yet met.
AI Training for Leaders + Broader Legal Organizations
Most legal aid and court help staff are understandably cautious about AI. Many don’t feel in control of the changes coming — they feel like they’re watching the train leave the station without them.
The steward’s job is to change that.
Demystify AI: Explain what these systems are and how they can support (or undermine) legal work.
Coach teams: Help practice leads and service teams see which parts of their work are ripe for AI support.
Invite ownership: Position AI not as a threat, but as a design space — a place where legal experts get to define how tools should work, and where lawyers and staff retain the power to review and direct.
To do this, stewards can run short briefings for EDs, intake leads, and practice heads on LLM basics, use cases, risks, UPL and confidentiality, and adoption playbooks. Training aims to get them conversant in the basics of the technology and help them envision where responsible opportunities might be. Let them see real-world examples of how other legal help providers are using AI behind the scenes or directly to the public.
Brainstorm + Opportunity Mapping Workshops with Legal Teams
Bring housing teams, family law facilitator teams, reentry teams, or other specific legal teams together. Have them map out their workflows and choose which of their day-to-day tasks is AI-opportune. Which of the tasks are routine, templated, and burdensome?
As stewards run these workshops, they can be on the lookout for where legal teams in their state can build, buy, or adopt an AI solution in 3 areas.
When running AI opportunity brainstorm, it’s worth considering these 3 zones: where can we add to existing legal full-representation servivces, where can we add to brief or pro bono services, and where can we add services that legal teams don’t currently offer?
Brainstorm 1: AI Copilots for Services Legal Teams Already Offer
This is the lowest-risk, highest-benefit space. Legal teams are already helping with eviction defense, demand letters, restraining orders, criminal record clearing, etc.
Here, AI can act as a copilot for the expert — a tool that does things that the expert lawyer, paralegal, or legal secretary is already doing in a rote way:
Auto-generates first drafts based on intake data
Summarizes client histories
Auto-fills court forms
Suggests next actions or deadlines
Creates checklists, declarations, or case timelines
These copilots don’t replace lawyers. They reduce drudge work, improve quality, and make staff more effective.
Brainstorm 2: AI Copilots for Services That Could Be Done by Pro Bono or Volunteers
Many legal aid organizations know where they could use more help: limited-scope letters, form reviews, answering FAQs, or helping users navigate next steps.
AI can play a key role in unlocking pro bono, brief advice, and volunteer capacity:
Automating burdensome tasks like collecting or review database records,
Helping them write high-quality letters or motions
Pre-filling petitions and forms with data that has been gathered
Providing them with step-by-step guidance
Flagging errors, inconsistencies, or risks in drafts
Offering language suggestions or plain-language explanations
Think of this as AI-powered “training wheels” that help volunteers help more people, with less handholding from staff.
Brainstorm 3: AI Tools for Services That Aren’t Currently Offered — But Should Be
There are many legal problems where there is high demand, but legal help orgs don’t currently offer help because of capacity limits.
Common examples of these under-served areas include:
Security deposit refund letters
Creating demand letters
Filing objections to default judgments
Answering brief questions
In these cases, AI systems — carefully designed, tested, and overseen — can offer direct-to-consumer services that supplement the safety net:
Structured interviews that guide users through legal options
AI-generated letters/forms with oversight built in
Clear red flags for when human review is needed
This is the frontier: responsibly extending the reach of legal help to people who currently get none. The brainstorm might also include reviewing existing direct-to-consumer AI tools from other legal orgs, and deciding which they might want to host or link to from their website.
The steward can hold these brainstorming and prioritization sessions to help legal teams find these legal team co-pilots, pro bono tools, and new service offerings in their issue area. The stewards and legal teams can move the AI vision forward & prepare for a clear scope for what AI should be built.
Data Readiness + Knowledge Base Building
Work with legal and court teams to inventory what data they have that could be used to train or evaluate some of the legal AI use cases they have envisioned. Support them with tools & protocols by which to mask PII in this document and make it safe to use in AI R&D.
This could mean getting anonymized completed forms, documents, intake notes, legal answers, data reports, or other legal workflow items. Likely, much of this data will have to be labeled, scored, and marked up so that it’s useful in training and evaluation.
The steward can help the groups that hold this data to understand what data they hold, how to prepare it and share it, and how to mark it up with helpful labels.
Part of this is also to build a Local Legal Help Knowledge Base — not just about the laws and statutes on the books, but about the practical, procedural, and service knowledge that people need when trying to deal with a legal problem.
Much of this knowledge is in legal aid lawyers’ and court staff’s heads, or training decks and events, or internal knowledge management systems and memos.
Stewards can help these local organizations contribute this knowledge about local legal rules, procedures, timelines, forms, services, and step-by-step guides into a statewide knowledge base. This knowledge base can then be used by the local providers. It will be a key piece of infrastructure on which new AI tools and services can be built.
Adoption Logistics
As local AI development visions come together, the steward can lead on adoption logistics.
The steward can make sure that the local orgs don’t reinvent what might already exist, or spend money in a wasteful way.
They can do tool evaluations to see which LLMs and specific AI solutions perform best on the scoped tasks. They can identify researchers and evaluators to help with this. They can also help organizations procure these tools or even create a pool of multiple organizations with similar needs for a shared procurement process.
They might also negotiate beneficial, affordable licenses or access to AI tools that can help with the desired functions. They can also ensure that case management and document management systems are responsive to the AI R&D needs, so that the legacy technology systems will integrate well with the new tools.
Ideally, the steward will help the statewide group and the local orgs make smart investments in the tech they might need to buy or build — and can help clear the way when hurdles emerge.
Bigger-Picture Steward Strategies
In addition to these possible actions, statewide stewards can also follow a few broader strategies to get a healthy AI R&D ecosystem in their state and beyond.
Be specific to legal teams
As I’ve already mentioned throughout this essay, stewards should be focused on the ‘team’ level, rather than the ‘organization’ one. It’s important that they develop relationships and run activities with teams that are in charge of specific workflows — and that means the specific kind of legal problem they help with.
Stewardship should be organizing its statewide network of named teams and named services, for example,
Housing law teams & their workflows: hotline consults, eviction defense prep, answers, motions to set aside, trial prep, RA letters for habitability issues, security-deposit demand letters.
Reentry teams & their workflows: record clearance screening, fines & fees relief, petitions, supporting declarations, RAP sheet interpretation, collateral consequences counseling.
Debt/consumer teams & their workflows: answer filing, settlement letters, debt verification, exemptions, repair counseling, FDCPA dispute letters.
Family law teams & their workflows: form prep (custody, DV orders), parenting plans, mediation prep, service and filing instructions, deadline tracking.
The steward can make progress on its 2 main goals — AI readiness and R&D encouragement — if it can build a strong local network among the teams that work on similar workflows, with similar data and documents, with similar audiences.
Put ethics, privacy, and operational safeguards at the center
Stewardship builds trust by making ethics operational rather than an afterthought. This all happens when AI conversations are grounded, informed, and specific among legal teams and communities. It also happens when they work with trained evaluators, who know how to evaluate the performance of AI rigorously, not based on anecdotes and speculation.
The steward network can help by planning out and vetting common, proven strategies to ensure quality & consumer protection are designed into the AI systems. They could work on:
Competence & supervision protocols: helping legal teams plan for the future of expert review of AI systems, clarifying “eyes-on” review models with staff trainings and tools. Stewards can also help them plan for escalation paths, when human reviewers find problems with the AI’s performance. Stewards might also work on standard warnings, verification prompts, and other key designs to ensure that reviewers are effectively watching AI’s performance.
Professional ethics rules clarity: help the teams design internal policies that ensure they’re in compliance with all ethical rules and responsibilities. Stewards can also help them plan out effective disclosures and consent protocols, so consumers know what is happening and have transparency.
Confidentiality & privacy: This can happen at the federated/ national level. Stewards can set rules for data flows, retention, de-identification/masking — which otherwise can be overwhelming for specific orgs. Stewards can also vet vendors for security and subprocessing.
Accountability & Improvements: Stewards can help organizations and vendors plan for good data-gathering & feedback cycles about AI’s performance. This can include guidance on document versioning, audit logs, failure reports, and user feedback loops.
Stewards can help bake safeguards into workflows and procurement, so that there are ethics and privacy by design in the technical systems that are being piloted.
Networking stewards into a federated ecosystem
For statewide stewardship to matter beyond isolated pilots, stewards need to network into a federated ecosystem — a light but disciplined network that preserves local autonomy while aligning on shared methods, shared infrastructure, and shared learning.
The value of federation is compounding: each state adapts tools to local law and practice, contributes back what it learns, and benefits from the advances of others. Also, many of the tasks of a steward — educating about AI, building ethics and safeguards, measuring AI, setting up good procurement — will be quite similar state-to-state. Stewards can share resources and materials to implement locally.
What follows reframes “membership requirements” as the operating norms of that ecosystem and explains how they translate into concrete habits, artifacts, and results.
Quarterly check-ins become the engine of national learning. Stewards participate in a regular virtual cohort, not as a status ritual but as an R&D loop. Each session surfaces what was tried, what worked, and what failed — brief demos, before/after metrics, and annotated playbooks.
Stewards use these meetings to co-develop materials, evaluation rubrics, funding strategies, disclosure patterns, and policy stances, and to retire practices that didn’t pan out. Over time, this cadence produces a living canon of benchmarks and templates that any newcomer steward can adopt on day one.
Each year, the steward could champion at least one pilot or evaluation (for example, reasonable-accommodation letters in housing or security-deposit demand letters in consumer law), making sure it has clear success criteria, review protocols, and an exit ramp if risks outweigh benefits. This can help the pilots spread to other jurisdictions more effectively.
Shared infrastructure is how federation stays interoperable. Rather than inventing new frameworks in every state, stewards lean on common platforms for evaluation, datasets, and reusable workflows. Practically, that means contributing test cases and localized content, adopting shared rubrics and disclosure patterns, and publishing results in a comparable format.
It also means using common identifiers and metadata conventions so that guides, form logic, and service directories can be exchanged or merged without bespoke cleanup. When a state localizes a workflow or improves a safety check, it pushes the enhancement upstream, so other states can pull it down and adapt with minimal effort.
Annual reporting turns stories into evidence and standards. Each steward could publish a concise yearly report that covers: progress made, obstacles encountered, datasets contributed (and their licensing status), tools piloted or adopted (and those intentionally rejected), equity and safety findings, and priorities for the coming year.
Because these reports follow a common outline, they are comparable across states and can be aggregated nationally to show impact, surface risks, and redirect effort. They also serve as onboarding guides for new teams: “Here’s what to try first, here’s what to avoid, here’s who to call.”
Success in 12–18 months looks concrete and repeatable. In a healthy federation, we could point to a public, living directory of AI-powered teams and services by portfolio, with visible gaps prioritized for action.
We could have several legal team copilots embedded in high-volume workflows — say, demand letters, security-deposit letters, or DV packet preparation — with documented time savings, quality gains, and staff acceptance.
We could have volunteer unlocks, where a clinic or pro bono program helps two to three times more people in brief-service matters because a copilot provides structure, drafting support, and review checkers.
We could have at least one direct-to-public workflow launched in a high-demand, manageable-risk area, with clear disclosures, escalation rules, and usage metrics.
We would see more contributions to data-driven evaluation practices and R&D protocols. This could be localized guides, triage logic, form metadata, anonymized samples, and evaluation results. Or it could be an ethics and safety playbook that is not just written but operationalized in training, procurement, and audits.
A federation of stewards doesn’t need heavy bureaucracy. It could be a set of light, disciplined habits that make local work easier and national progress faster. Quarterly cohort exchanges prevent wheel-reinventing. Local duties anchor AI in real services. Shared infrastructure keeps efforts compatible. Governance protects the public-interest character of the work. Annual reports convert experience into standards.
Put together, these practices allow stewards to move quickly and responsibly — delivering tangible improvements for clients and staff while building a body of knowledge the entire field can trust and reuse.
Stewardship as the current missing piece
Our team at Stanford Legal Design Lab is aiming for an impactful, ethical, robust ecosystem of AI in legal services. We are building the platform JusticeBench to be a home base for those working on AI R&D for access to justice. We are also building justice co-pilots directly with several legal aid groups.
But to build this robust ecosystem, we need local stewards for state jurisdictions across the country — who can take on key leadership roles and decisions — and make sure that there can be A2J AI that responds to local needs but benefits from national resources. Stewards can also help activate local legal teams, so that they are directing the development of AI solutions rather than reacting to others’ AI visions.
We can build legal help AI state by state, team by team, workflow by workflow. But we need stewards who keep clients, communities, and frontline staff at the center, while moving their state forward.
That’s how AI becomes a force for justice — because we designed it that way.
The Legal Design Lab is excited to co-organize a new workshop at the International Conference on Artificial Intelligence and Law (ICAIL 2025):
AI for Access to Justice (AI4A2J@ICAIL 2025) 📍 Where? Northwestern University, Chicago, Illinois, USA 🗓 When? June 20, 2025 (Hybrid – in-person and virtual participation available) 📄 Submission Deadline: May 4, 2025 📬 Acceptance Notification: May 18, 2025
This workshop brings together researchers, technologists, legal aid practitioners, court leaders, policymakers, and interdisciplinary collaborators to explore the potential and pitfalls of using artificial intelligence (AI) to expand access to justice (A2J). It is part of the larger ICAIL 2025 conference, the leading international forum for AI and law research, hosted this year at Northwestern University in Chicago.
Why this workshop?
Legal systems around the world are struggling to meet people’s needs—especially in housing, immigration, debt, and family law. AI tools are increasingly being tested and deployed to address these gaps: from chatbots and form fillers to triage systems and legal document classifiers. Yet these innovations also raise serious questions around risk, bias, transparency, equity, and governance.
This workshop will serve as a venue to:
Share and critically assess emerging work on AI-powered legal tools
Discuss design, deployment, and evaluation of AI systems in real-world legal contexts
Learn from cross-disciplinary perspectives to better guide responsible innovation in justice systems
What are we looking for?
We welcome submissions from a wide range of contributors—academic researchers, practitioners, students, community technologists, court innovators, and more.
We’re seeking:
Research papers on AI and A2J
Case studies of AI tools used in courts, legal aid, or nonprofit contexts
Design proposals or system demos
Critical perspectives on the ethics, policy, and governance of AI for justice
Evaluation frameworks for AI used in legal services
Collaborative, interdisciplinary, or community-centered work
Topics might include (but are not limited to):
Legal intake and triage using large language models (LLMs)
AI-guided form completion and document assembly
Language access and plain language tools powered by AI
Risk scoring and case prioritization
Participatory design and co-creation with affected communities
Bias detection and mitigation in legal AI systems
Evaluation methods for LLMs in legal services
Open-source or public-interest AI tools
We welcome both completed projects and works-in-progress. Our goal is to foster a diverse conversation that supports learning, experimentation, and critical thinking across the access to justice ecosystem.
Workshop Format
The workshop will be held on June 20, 2025 in hybrid format—with both in-person sessions in Chicago, Illinois and the option for virtual participation. Presenters and attendees are welcome to join from anywhere.
Workshop Committee
Hannes Westermann, Maastricht University Faculty of Law
Jaromír Savelka, Carnegie Mellon University
Marc Lauritsen, Capstone Practice Systems
Margaret Hagan, Stanford Law School, Legal Design Lab
Submissions are due by May 4, 2025. Notifications of acceptance will be sent by May 18, 2025.
We’re thrilled to help convene this conversation on the future of AI and justice—and we hope to see your ideas included. Please spread the word to others in your network who are building, researching, or questioning the role of AI in the justice system.
Lessons from Cristina Llop’s Work on Language Access in the Legal System
Artificial intelligence (AI) and machine translation (MT) are often seen as tools with the potential to expand access to justice, especially for non-English speakers in the U.S. legal system. However, while AI-driven translation tools like Google Translate and AutoML offer impressive accuracy in general contexts, their effectiveness in legal settings remains questionable.
At the Stanford Legal Design Lab’s AI and Access to Justice research webinar on February 7, 2025, legal expert Cristina Llop shared her observations from reviewing live translations between legal providers’ staff and users. Her findings highlight both the potential and pitfalls of using AI for language access in legal settings. This article explores how AI performs in practice, where it can be useful, and why human oversight, national standards, and improved training datasets are critical.
How Machine Translation Performs in Legal Contexts
Many courts and legal service providers have turned to AI-powered Neural Machine Translation (NMT) models like Google Translate to help bridge language barriers. While AI is improving, Llop’s research suggests that accuracy in general language translation does not necessarily translate to legal language accuracy.
1. The Good: AI Can Be Useful in Certain Scenarios
Machine translation tools can provide immediate, cost-effective assistance in specific legal language tasks, such as:
Translating declarations and witness statements
Converting court forms and pleadings into different languages
Making legal guides and court websites more accessible
Supporting real-time interpretation in court help centers and clerk offices
This can be especially valuable in resource-strapped courts and legal aid groups that lack human interpreters for every case. However, Llop cautions that even when AI-generated translations sound fluent, they may not be legally precise or safe to rely on.
AI doesn’t pick up on legal context and mis-translates key information about trials, filing, court, and options.
2. The Bad: Accuracy Breaks Down in Legal Contexts
Llop identified systematic mistranslations that could have serious consequences:
Common legal terms are mistranslated due to a lack of specialized training data. For example, “warrant” is often translated as “court order,” which downplays the severity of a legal document.
Contextual misunderstandings lead to serious errors:
“Due date” was mistranslated as “date to give birth.”
“Trial” was often translated as “test.”
“Charged with a battery a case” turned into “loaded with a case of batteries.”
Pronoun confusion creates ambiguity:
Spanish’s use of “su” (your/his/her/their) is often mistranslated in legal documents, leading to uncertainty about property ownership, responsibility, or court filings.
In restraining order cases, it was unclear who was accusing whom, which could put victims at risk.
AI can introduce gender biases:
Words with no inherent gender (e.g., “politician”) are often translated as male.
The Spanish “Me Maltrata”, which could be translated either as She mistreats me or He mistreats me — without the gender being specified. The machine would default “me maltrata” as “He mistreats me,” potentially distorting evidence in domestic violence cases.
Without human review, these AI-driven errors can go unnoticed, leading to severe legal consequences.
The Dangers of Mistranslation in Legal Interactions
One of the most troubling findings from Llop’s work was the invisible breakdowns in communication between legal providers and non-English speakers.
1. Parallel Conversations Instead of Communication
In many cases, both parties believed they were exchanging information, but in reality:
Legal providers were missing key facts from litigants.
Users did not realize that their information was misunderstood or misrepresented.
Critical details — such as the nature of an abuse claim or financial disclosures — were being lost.
This failure to communicate accurately could result in:
People choosing the wrong legal recourse and misunderstanding what options are available to them.
Legal provider staff making decisions based on incomplete or distorted information, providing services and option menus based on misunderstandings about the person’s scenario or preferences.
Access to justice being compromised for vulnerable litigants.
2. Why a Glossary Isn’t Enough
Some legal institutions have tried to mitigate errors by adding legal glossaries to machine translation tools. However, Llop’s research found that glossary-based corrections do not always solve the problem:
Example 1: The word “address” was provided to the AI to ensure translation to “mailing address” (instead of “home address”) in one context — but then mistakenly applied when a clerk asked, “What issue do you want to address?”
Example 2: “Will” (as in a legal document) was mistranslated when applied to the auxiliary verb “will” in regular interactions (“I will send you this form”).
Example 3: A glossary fix for “due date” worked .
“Example 4: A glossary fix for “pleading” worked but failed to adjust grammatical structure or pronoun usage.”
These patchwork fixes are not enough. More comprehensive training, oversight, and quality control are needed.
Advancing Legal Language AI: AutoML and Human Review
One promising improvement is AutoML, which allows legal organizations to train machine translation models with their own specialized legal data.
AutoML: A Step Forward, But Still Flawed
Llop’s team worked on an AutoML project by:
Collecting 8,000+ legal translation pairs from official legal sources that had been translated by experts.
Correcting AI-generated translations manually.
Feeding improved translations back into the model.
Iterating until translations were more accurate.
Results showed that AutoML improved translation quality, but major issues remained:
AI struggled with conversational context. If a prior sentence referenced “my wife,” but the next message about the wife didn’t specify gender, AI might mistakenly switch the pronoun to “he”.
AI overfit to common legal phrases, inserting “petition” even when the correct translation should have been “form.”
These challenges highlight why human review is essential.
Real-Time Machine Translation
While text-based AI translation can be refined over time, real-time translation — such as voice-to-text systems in legal offices — presents even greater challenges.
Voice-to-Text Lacks Punctuation Awareness
People do not dictate punctuation, but pauses and commas can change legal meaning. For example:
“I’m guilty” vs. “I’m not guilty” (missing comma error).
Minor misspellings or poor grammar can dramatically change a translation.
AI Struggles with Speech Patterns
Legal system users come from diverse linguistic backgrounds, making real-time translation even more difficult. AI performs poorly when users:
Speak quickly or use filler words (“um,” “huh,” “oh”).
Have soft speech or heavy accents.
Use sentence structures influenced by indigenous or regional dialects.
These challenges make it clear that AI faces major challenges in performing accurately in high-stakes legal interactions.
The Need for National Standards and Training Datasets
Llop’s research underscores a critical gap: there are no national standards, training datasets, or quality benchmark datasets for legal translation AI.
A National Legal Translation Project
Llop saw an opportunity for improvement if there were to be:
A centralized effort to collect high-quality legal translation pairs.
State-specific localization of legal terms.
Guidelines for AI usage in courts, legal aid orgs, and other institutions.
Such a standardized dataset could train AI more effectively while ensuring legal accuracy.
Training for English-Only Speakers
English-speaking legal provider staff need training on how to structure their speech for better AI translation:
Using plain language and short sentences.
Avoiding vague pronouns (“his, her, their”).
Confirming meaning before finalizing translations.
AI, Human Oversight, and National Infrastructure in Legal Translation
Machine translation and AI can be useful, but they are far from perfect. Without human review, legal expertise, and national standards, AI-generated translations could compromise access to justice.
Llop’s work highlights the urgent need for:
Human-in-the-loop AI translation.
Better training data tailored for legal contexts.
National standards for AI language access.
As AI continues to evolve, it must be designed with legal precision and human oversight — because in law, a mistranslation can change lives.
This month, our team commenced interviews with landlord-tenant subject matter experts, including court help staff, legal aid attorneys, and hotline operators. These experts are comparing and rating various AI responses to commonly asked landlord-tenant questions that individuals may get when they go online to find help.
Our team has developed a new ‘Battle Mode’ of our rating/classification platform Learned Hands. In a Battle Mode game on Learned Hands, experts compare two distinct AI answers to the same user’s query and determine which one is superior. Additionally, we have the experts speak aloud as they are playing, asking that they articulate their reasoning. This allows us to gain insights into why a particular response is deemed good or bad, helpful or harmful.
Our group will be publishing a report that evaluates the performance of various AI models in answering everyday landlord-tenant questions. Our goal is to establish a standardized approach for auditing and benchmarking AI’s evolving ability to address people’s legal inquiries. This standardized approach will be applicable to major AI platforms, as well as local chatbots and tools developed by individual groups and startups. By doing so, we hope to refine our methods for conducting audits and benchmarks, ensuring that we can accurately assess AI’s capabilities in answering people’s legal questions.
Instead of speculating about potential pitfalls, we aim to hear directly from on-the-ground experts about how these AI answers might help or harm a tenant who has gone onto the Internet to problem-solve. This means regular, qualitative sessions with housing attorneys and service providers, to have them closely review what AI is telling people when asked for information on a landlord-tenant problem. These experts have real-world experience in how people use (or don’t) the information they get online, from friends, or from other experts — and how it plays out for their benefit or their detriment.
We also believe that regular review by experts can help us spot concerning trends as early as possible. AI answers might change in the coming months & years. We want to keep an eye on the evolving trends in how large tech companies’ AI platforms respond to people’s legal help problem queries, and have front-line experts flag where there might be a big harm or benefit that has policy consequences.
Stay tuned for the results of our expert-led rating games and feedback sessions!
If you are a legal expert in landlord-tenant law, please sign up to be one of our expert interviewees below.
The Legal Design Lab is proud to announce a new monthly online, public seminar on AI & Access to Justice: Research x Practice.
At this seminar, we’ll be bringing together leading academic researchers with practitioners and policymakers, who are all working on how to make the justice system more people-centered, innovative, and accessible through AI. Each seminar will feature a presentation from either an academic or practitioner who is working in this area & has been gathering data on what they’re learning. The presentations could be academic studies about user needs or the performance of technology, or less academic program evaluations or case studies from the field.
We look forward to building a community where researchers and practitioners in the justice space can make connections, build new collaborations, and advance the field of access to justice.
Sign up for the AI&A2J Research x Practice seminar, every first Friday of the month on Zoom.