Lessons from Cristina Llop’s Work on Language Access in the Legal System
Artificial intelligence (AI) and machine translation (MT) are often seen as tools with the potential to expand access to justice, especially for non-English speakers in the U.S. legal system. However, while AI-driven translation tools such as Google Translate, and custom models built with services like AutoML, can be impressively accurate in everyday contexts, their effectiveness in legal settings remains questionable.
At the Stanford Legal Design Lab’s AI and Access to Justice research webinar on February 7, 2025, legal expert Cristina Llop shared her observations from reviewing live translations between legal providers’ staff and users. Her findings highlight both the potential and pitfalls of using AI for language access in legal settings. This article explores how AI performs in practice, where it can be useful, and why human oversight, national standards, and improved training datasets are critical.
How Machine Translation Performs in Legal Contexts
Many courts and legal service providers have turned to AI-powered Neural Machine Translation (NMT) models like Google Translate to help bridge language barriers. While AI is improving, Llop’s research suggests that accuracy on general language does not necessarily carry over to legal language.
1. The Good: AI Can Be Useful in Certain Scenarios
Machine translation tools can provide immediate, cost-effective assistance in specific legal language tasks, such as:
- Translating declarations and witness statements
- Converting court forms and pleadings into different languages
- Making legal guides and court websites more accessible
- Supporting real-time interpretation in court help centers and clerk offices
This can be especially valuable in resource-strapped courts and legal aid groups that lack human interpreters for every case. However, Llop cautions that even when AI-generated translations sound fluent, they may not be legally precise or safe to rely on.
![](https://i0.wp.com/justiceinnovation.law.stanford.edu/wp-content/uploads/2025/02/Ai-and-access-to-justice-language-access-visual.png?resize=580%2C257&ssl=1)
2. The Bad: Accuracy Breaks Down in Legal Contexts
Llop identified systematic mistranslations that could have serious consequences:
Common legal terms are mistranslated due to a lack of specialized training data. For example, “warrant” is often translated as “court order,” which understates how serious the document is.
Contextual misunderstandings lead to serious errors:
- “Due date” was mistranslated as “date to give birth.”
- “Trial” was often translated as “test.”
- “Charged with a battery case” turned into “loaded with a case of batteries.”
Pronoun confusion creates ambiguity:
- Spanish’s use of “su” (your/his/her/their) is often mistranslated in legal documents, leading to uncertainty about property ownership, responsibility, or court filings.
- In restraining order cases, it was unclear who was accusing whom, which could put victims at risk.
AI can introduce gender biases:
- Words with no inherent gender (e.g., “politician”) are often translated as male.
- The Spanish “me maltrata” does not specify gender: it can mean either “she mistreats me” or “he mistreats me.” The machine would default to “he mistreats me,” potentially distorting evidence in domestic violence cases.
Without human review, these AI-driven errors can go unnoticed, leading to severe legal consequences.
The Dangers of Mistranslation in Legal Interactions
One of the most troubling findings from Llop’s work was the invisible breakdowns in communication between legal providers and non-English speakers.
1. Parallel Conversations Instead of Communication
In many cases, both parties believed they were exchanging information, but in reality:
- Legal providers were missing key facts from litigants.
- Users did not realize that their information was misunderstood or misrepresented.
- Critical details — such as the nature of an abuse claim or financial disclosures — were being lost.
This failure to communicate accurately could result in:
- People choosing the wrong legal recourse and misunderstanding what options are available to them.
- Legal provider staff making decisions and offering services and option menus based on incomplete or distorted information about the person’s situation or preferences.
- Access to justice being compromised for vulnerable litigants.
2. Why a Glossary Isn’t Enough
Some legal institutions have tried to mitigate errors by adding legal glossaries to machine translation tools. However, Llop’s research found that glossary-based corrections do not always solve the problem:
- Example 1: The word “address” was provided to the AI to ensure translation to “mailing address” (instead of “home address”) in one context — but then mistakenly applied when a clerk asked, “What issue do you want to address?”
- Example 2: “Will” (as in a legal document) was mistranslated when applied to the auxiliary verb “will” in regular interactions (“I will send you this form”).
- Example 3: A glossary fix for “due date” worked.
- Example 4: A glossary fix for “pleading” worked but failed to adjust grammatical structure or pronoun usage.
These patchwork fixes are not enough. More comprehensive training, oversight, and quality control are needed.
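To see why term-level glossaries break down, here is a minimal sketch of naive glossary substitution (hypothetical code for illustration only, not the tooling Llop reviewed; the Spanish target is illustrative). A fixed entry that pins the noun “address” to a mailing-address translation fires even when “address” appears as a verb, reproducing the clerk example above.

```python
# Minimal, hypothetical sketch of naive term-level glossary substitution.
# This is NOT the tooling Llop reviewed; the Spanish target is illustrative.

GLOSSARY = {
    "address": "dirección postal",  # pinned for the noun "mailing address"
}

def apply_glossary(text: str, glossary: dict[str, str]) -> str:
    """Replace every glossary term verbatim, with no awareness of
    part of speech or surrounding context."""
    for term, pinned in glossary.items():
        text = text.replace(term, f"[{pinned}]")
    return text

# Works as intended when "address" really is the noun:
print(apply_glossary("Please confirm your mailing address.", GLOSSARY))
# -> Please confirm your mailing [dirección postal].

# Fails when "address" is a verb, as in the clerk's question:
print(apply_glossary("What issue do you want to address?", GLOSSARY))
# -> What issue do you want to [dirección postal]?
```

Commercial glossary features are more sophisticated than a string replacement, but the limitation Llop describes is the same: a term-to-term mapping has no sense of part of speech, grammatical agreement, or which meaning of a word the speaker intends.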
Advancing Legal Language AI: AutoML and Human Review
One promising improvement is AutoML, which allows legal organizations to train machine translation models with their own specialized legal data.
AutoML: A Step Forward, But Still Flawed
Llop’s team worked on an AutoML project by:
- Collecting 8,000+ legal translation pairs from official legal sources that had been translated by experts.
- Correcting AI-generated translations manually.
- Feeding improved translations back into the model.
- Iterating until translations were more accurate.
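That workflow is essentially a human-in-the-loop data pipeline. The sketch below is a simplified illustration (the function names and file format are assumptions, not code from Llop’s project): each iteration translates a batch with the current model, has a bilingual legal reviewer correct the drafts, and appends the corrected source/target pairs to a tab-separated file of the kind custom translation services such as AutoML Translation typically ingest for retraining.

```python
import csv
from pathlib import Path

# Hypothetical sketch of one human-in-the-loop iteration; the helper
# functions are placeholders, not a real MT API.

def machine_translate(sentence: str) -> str:
    """Stand-in for the current custom model's draft translation."""
    return f"<draft translation of: {sentence}>"

def get_reviewer_correction(source: str, draft: str) -> str:
    """Stand-in for a bilingual legal expert editing the draft."""
    return draft  # in practice, the reviewer returns a corrected version

def run_iteration(source_sentences: list[str], out_path: Path) -> None:
    """Translate a batch, collect expert corrections, and append the
    corrected pairs to a tab-separated source/target training file."""
    with out_path.open("a", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t")
        for source in source_sentences:
            draft = machine_translate(source)
            corrected = get_reviewer_correction(source, draft)
            writer.writerow([source, corrected])  # source<TAB>target

if __name__ == "__main__":
    run_iteration(
        ["The respondent must appear at the hearing on the date listed."],
        Path("legal_translation_pairs.tsv"),
    )
```

Each retraining round grows the pool of expert-reviewed pairs that the next version of the model learns from.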
Results showed that AutoML improved translation quality, but major issues remained:
- AI struggled with conversational context. If a prior sentence referenced “my wife,” but the next message about the wife didn’t specify gender, AI might mistakenly switch the pronoun to “he”.
- AI overfit to common legal phrases, inserting “petition” even when the correct translation should have been “form.”
These challenges highlight why human review is essential.
Real-Time Machine Translation
While text-based AI translation can be refined over time, real-time translation — such as voice-to-text systems in legal offices — presents even greater challenges.
Voice-to-Text Lacks Punctuation Awareness
People do not dictate punctuation, but pauses and commas can change legal meaning. For example:
- “I’m guilty” vs. “I’m not guilty”: a missing comma can flip a statement of guilt (for instance, Spanish “No, soy culpable,” “No, I am guilty,” versus “No soy culpable,” “I am not guilty”).
- Minor misspellings or poor grammar can dramatically change a translation.
AI Struggles with Speech Patterns
Legal system users come from diverse linguistic backgrounds, making real-time translation even more difficult. AI performs poorly when users:
- Speak quickly or use filler words (“um,” “huh,” “oh”).
- Have soft speech or heavy accents.
- Use sentence structures influenced by indigenous or regional dialects.
These issues make it clear that AI still struggles to perform accurately in high-stakes, real-time legal interactions.
The Need for National Standards and Training Datasets
Llop’s research underscores a critical gap: there are no national standards, shared training datasets, or quality benchmarks for legal translation AI.
A National Legal Translation Project
Llop saw an opportunity for improvement if there were to be:
- A centralized effort to collect high-quality legal translation pairs.
- State-specific localization of legal terms.
- Guidelines for AI usage in courts, legal aid orgs, and other institutions.
Such a standardized dataset could train AI more effectively while ensuring legal accuracy.
Training for English-Only Speakers
English-speaking legal provider staff need training on how to structure their speech for better AI translation:
- Using plain language and short sentences.
- Avoiding vague pronouns (“his, her, their”).
- Confirming meaning before finalizing translations.
AI, Human Oversight, and National Infrastructure in Legal Translation
Machine translation and AI can be useful, but they are far from perfect. Without human review, legal expertise, and national standards, AI-generated translations could compromise access to justice.
Llop’s work highlights the urgent need for:
- Human-in-the-loop AI translation.
- Better training data tailored for legal contexts.
- National standards for AI language access.
As AI continues to evolve, it must be designed with legal precision and human oversight — because in law, a mistranslation can change lives.
Get in touch with Cristina Llop to learn more about her work & vision for better language access: https://www.linkedin.com/in/cristina-llop-75749915/
Thanks to her for a terrific, detailed presentation at the AI+A2J Research series. Sign up to attend future Zoom webinars in our series. Find out more about the Stanford Legal Design Lab’s work on AI & Access to Justice here.