Legal Aid Intake & Screening AI

A Report on an AI-Powered Intake & Screening Workflow for Legal Aid Teams 

AI for Legal Help, Legal Design Lab, 2025

This report is a write-up of the AI for Housing Legal Aid Intake & Screening class project, one track of the “AI for Legal Help” Policy Lab during the Autumn 2024 and Winter 2025 quarters. The AI for Legal Help course involved working with legal aid and court groups that provide legal help services to the public, to understand where responsible AI innovations might be possible and to design and prototype initial solutions, along with pilot and evaluation plans.

One of the project tracks focused on improving the workflows of legal aid teams that provide housing help, particularly their struggle with high demand from community members combined with a lack of clarity about whether, and how, a given person can be served by the legal aid group. Between Autumn 2024 and Winter 2025, an interdisciplinary team of Stanford University students partnered with the Legal Aid Society of San Bernardino (LASSB) to understand the current design of housing intake & screening, and to propose an improved, AI-powered workflow.

This report details the problem identified by LASSB, the proposed AI-powered intake & screening workflow developed by the student team, and recommendations for future development and implementation. 

We share it in the hopes that legal aid and court help center leadership might also be interested in exploring responsible AI development for intake & screening, and that funders, researchers, and technologists might collaborate on developing and testing successful solutions for this task.

Thank you to the students in this team: Favour Nerisse, Gretel Cannon, Tatiana Zhang, and other collaborators. And a big thank you to our LASSB colleagues: Greg Armstrong, Pablo Ramirez, and more.

Introduction

The Legal Aid Society of San Bernardino (LASSB) is a nonprofit law firm serving low-income residents across San Bernardino and Riverside Counties, where housing issues – especially evictions – are the most common legal problems facing the community. Like many legal aid organizations, LASSB operates under severe resource constraints and high demand.

In the first half of 2024 alone, LASSB assisted over 1,200 households (3,261 individuals) with eviction prevention and landlord-tenant support. Yet many more people seek help than LASSB can serve, and those who do seek help often face barriers like long hotline wait times or lack of transportation to clinics. These challenges make the intake process – the initial screening and information-gathering when a client asks for help – a critical bottleneck. If clients cannot get through intake or are screened out improperly, they effectively have no access to justice.

Against this backdrop, LASSB partnered with a team of Stanford students in the AI for Legal Help practicum to explore an AI-based solution. The task selected was housing legal intake: using an AI “Intake Agent” to streamline eligibility screening and initial fact-gathering for clients with housing issues (especially evictions). The proposed solution was a chatbot-style AI assistant that could interview applicants about their legal problem and situation, apply LASSB’s intake criteria, and produce a summary for legal aid staff. By handling routine, high-volume intake questions, the AI agent aimed to reduce client wait times and expand LASSB’s reach to those who can’t easily come in or call during business hours. The students planned a phased evaluation and implementation: first prototyping the agent with sample data, then testing its accuracy and safety with LASSB staff, before moving toward a limited pilot deployment. This report details the development of that prototype AI Intake Agent across the Autumn and Winter quarters, including the use case rationale, current vs. future workflow, technical design, evaluation findings, and recommendations for next steps.

1: The Use Case – AI-Assisted Housing Intake

Defining the Use Case of Intake & Screening

The project focused on legal intake for housing legal help, specifically tenants seeking assistance with eviction or unsafe housing. Intake is the process by which legal aid determines who qualifies for help and gathers the facts of their case. For a tenant facing eviction, this means answering questions about income, household, and the eviction situation, so the agency can decide if the case falls within their scope (for example, within income limits and legal priorities).

Intake is a natural first use case because it is a gateway to justice: a short phone interview or online form is often all that stands between a person in crisis and the help they need. Yet many people never complete this step due to practical barriers (long hold times, lack of childcare or transportation, fear or embarrassment). 

By improving intake, LASSB could assist more people early, preventing more evictions or legal problems from escalating.

Why LASSB Chose Housing Intake 

LASSB and the student team selected the housing intake scenario for several reasons. First, housing is LASSB’s highest-demand area – eviction defense was 62% of cases for a neighboring legal aid organization and is similarly dominant for LASSB. This high volume means intake workers spend enormous time screening housing cases, and many eligible clients are turned away simply because staff can’t handle all the calls. Improving intake throughput could thus have an immediate impact. Second, housing intake involves highly repetitive and rules-based questions (e.g., income eligibility, case type triage) that are well-suited to automation. These are precisely the kind of routine, information-heavy tasks that AI can assist with at scale. 

Third, an intake chatbot could increase privacy and reach: clients could complete intake online 24/7, at their own pace, without waiting on hold or revealing personal stories to a stranger right away. This could especially help those in rural areas or those uncomfortable with an in-person or phone interview. In short, housing intake was seen as a high-impact, AI-ready use case where automation might improve efficiency while preserving quality of service.

Why Intake Matters for Access to Justice

Intake may seem mundane, but it is a cornerstone of access to justice. It is the “front door” of legal aid – if the door is locked or the line too long, people simply don’t get help. Studies show that only a small fraction of people with civil legal issues ever consult a lawyer, often because they don’t recognize their problem as legal or face obstacles seeking help. Even among those who do reach out to legal aid (nearly 2 million requests in 2022), about half are turned away due to insufficient resources. Many turn-aways happen at the intake stage, when agencies must triage cases. Improving intake can thus shrink the “justice gap” by catching more issues early and providing at least some guidance to those who would otherwise get nothing. 

Moreover, a well-designed intake process can empower clients – by helping them tell their story, identifying their urgent needs, and connecting them to appropriate next steps. On the flip side, a bad intake experience (confusing questions, long delays, or perfunctory denials) can discourage people from pursuing their rights, effectively denying justice. By focusing on intake, the project aimed to make the path to legal help smoother and more equitable.

Why AI Is a Good Fit for Housing Intake

Legal intake involves high volume, repetitive Q&A, and standard decision rules, which are conditions where AI can excel. A large language model (LLM) can be programmed to ask the same questions an intake worker would, in a conversational manner, and interpret the answers. 

Because LLMs can process natural language, an AI agent can understand a client’s narrative of their housing problem and spot relevant details or legal issues (e.g. identifying an illegal lockout vs. a formal eviction) to ask appropriate follow-ups. This dynamic questioning is something LLMs have demonstrated success in – for example, a recent experiment in Missouri showed that an LLM could generate follow-up intake questions “in real-time” based on a user’s description, like asking whether a landlord gave formal notice after a tenant said “I got kicked out.” AI can also help standardize decisions: by encoding eligibility rules into the prompt or system, it can apply the same criteria every time, potentially reducing inconsistent screening outcomes. Importantly, initial research found that GPT-4-based models could predict legal aid acceptance/rejection decisions with about 84% accuracy, and they erred on the side of caution (usually not rejecting a case unless clearly ineligible). This suggests AI intake systems can be tuned to minimize false denials, a critical requirement for fairness.

Beyond consistency and accuracy, AI offers scalability and extended reach. Once developed, an AI intake agent can handle multiple clients at once, anytime. For LASSB, this could mean a client with an eviction notice can start an intake at midnight rather than waiting anxious days for a callback. Other legal aid groups have already seen the potential: Legal Aid of North Carolina’s chatbot “LIA” has engaged in over 21,000 conversations in its first year, answering common legal questions and freeing up staff time. LASSB hopes for similar gains – the Executive Director noted plans to test AI tools to “reduce client wait times” and extend services to rural communities that in-person clinics don’t reach. Finally, an AI intake agent can offer a degree of client comfort – some individuals might prefer typing out their story to a bot rather than speaking to a person, especially on sensitive issues like domestic violence intersecting with an eviction. In summary, the volume, repetitive structure, and outreach potential of intake made it an ideal candidate for an AI solution.

2: Status Quo and Future Vision

Current Human-Led Workflow 

At present, LASSB’s intake process is entirely human-driven. A typical workflow might begin with a client calling LASSB’s hotline or walking into a clinic. An intake coordinator or paralegal then screens for eligibility, asking a series of standard questions: Are you a U.S. citizen or eligible immigrant? What is your household size and income? What is your zip code or county? What type of legal issue do you have? These questions correspond to LASSB’s internal eligibility rules (for example, income below a percentage of the poverty line, residence in the service area, and case type within program priorities). 

The intake worker usually follows a scripted guide – these guides can run 7+ pages of rules and flowcharts for different scenarios. If the client passes initial screening, the staffer moves on to information-gathering: taking down details of the legal problem. In a housing case, they might ask: “When did you receive the eviction notice? Did you already go to court? How many people live in the unit? Do you have any disabilities or special circumstances?” This helps determine the urgency and possible defenses (for instance, disability could mean a reasonable accommodation letter might help, or a lockout without court order is illegal). The intake worker must also gauge if the case fits LASSB’s current priorities or grant requirements – a subtle judgment call often based on experience. 

Once information is collected, the case is handed off internally: if it’s straightforward and within scope, they may schedule the client for a legal clinic or assign a staff attorney for advice. If it’s a tougher or out-of-scope case, the client might be given a referral to another agency or a “brief advice” appointment where an attorney only gives counsel and not full representation. In some instances, there are multiple handoffs – for example, the person who does the phone screening might not be the one who ultimately provides the legal advice, requiring good note-taking and case summaries.

User Personas in the Workflow

The team crafted sample user and staff personas of the people who would be interacting with the new workflow and AI agent.


Pain Points in the Status Quo

This human-centric process has several pain points identified by LASSB and the student team. 

First, it’s slow and resource-intensive. Clients can wait an hour or more on hold before even speaking to an intake worker during peak times, such as when an eviction moratorium change causes a surge in calls. Staff capacity is limited – a single intake worker can only handle one client at a time, and each interview might take 20–30 minutes. If the client is ultimately ineligible, that time is effectively “wasted” when it could have been spent on an eligible client. The sheer volume means many callers never get through at all. 

Second, the complexity of rules can lead to inconsistent or suboptimal outcomes. Intake staff have to juggle 30+ eligibility rules, which can change with funding or policy shifts. Important details might be missed or misapplied; for example, a novice staffer might turn away a case that seems outside scope but actually fits an exception. Indeed, variability in intake decisions was a known issue – one research project found that LLMs sometimes caught errors made by human screeners (e.g., the AI recognized a case was eligible when a human mistakenly marked it as not). 

Third, the process can be stressful for clients. Explaining one’s predicament (like why rent is behind) to a stranger can be intimidating. Clients in crisis might forget to mention key facts or have trouble understanding the questions. If a client has trauma (such as a domestic violence survivor facing eviction due to abuse), a blunt interview can inadvertently re-traumatize them. LASSB intake staff are trained to be sensitive, but in the rush of high volume, the experience may still feel hurried or impersonal. 

Finally, timing and access are issues. Intake typically happens during business hours via phone or at specific clinic times. People who work, lack a phone, or have disabilities may struggle to engage through those channels. Language barriers can also be an issue; while LASSB offers services in Spanish and other languages, matching bilingual staff to every call is challenging. All these pain points underscore a need for a more efficient, user-friendly intake system.

Envisioned Human-AI Workflow

In the future-state vision, LASSB’s intake would be a human-AI partnership, blending automation with human judgment. The envisioned workflow goes as follows: A client in need of housing help would first interact with an AI Intake Agent, likely through a web chat interface (or possibly via a self-help kiosk or mobile app). 

The AI agent would greet the user with a friendly introduction (making clear it’s an automated assistant) and guide them through the eligibility questions – e.g., asking for their income range, household size, and problem category. These could even be answered via simple buttons or quick replies to make it easy. The agent would use these answers to do an initial screening (following the same rules staff use). If clearly ineligible (for instance, the person lives outside LASSB’s service counties), the agent would not simply turn them away. Instead, it might gently inform them that LASSB likely cannot assist directly and provide a referral link or information for the appropriate jurisdiction. (Crucially, per LASSB’s guidance, the AI would err on inclusion – if unsure, it would mark the case for human review rather than issuing a flat denial.) 
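To make the “err on inclusion” behavior concrete, here is a minimal sketch in Python of how the clear-cut screening criteria could be encoded. The thresholds, county list, and field names are illustrative assumptions, not LASSB’s actual rules; the point is simply that the only automatic outcomes are “likely eligible,” “likely referral,” or “route to a human,” never a denial.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ScreeningResult(Enum):
    LIKELY_ELIGIBLE = "likely_eligible"
    NEEDS_HUMAN_REVIEW = "needs_human_review"   # default whenever anything is unclear
    LIKELY_REFERRAL = "likely_referral"          # still confirmed by staff, never a flat denial


@dataclass
class IntakeAnswers:
    county: Optional[str] = None          # e.g. "San Bernardino"
    household_size: Optional[int] = None
    monthly_income: Optional[float] = None
    problem_type: Optional[str] = None    # e.g. "eviction", "lockout", "repairs"


# Illustrative values only -- real thresholds come from LASSB's intake manual and funders.
SERVICE_COUNTIES = {"san bernardino", "riverside"}
INCOME_LIMIT_BY_HOUSEHOLD = {1: 1883, 2: 2555, 3: 3228, 4: 3900}  # hypothetical monthly figures


def pre_screen(a: IntakeAnswers) -> ScreeningResult:
    # Any missing answer means a human should look at the case.
    if a.county is None or a.household_size is None or a.monthly_income is None:
        return ScreeningResult.NEEDS_HUMAN_REVIEW

    if a.county.strip().lower() not in SERVICE_COUNTIES:
        return ScreeningResult.LIKELY_REFERRAL  # agent offers referral info; staff confirm

    limit = INCOME_LIMIT_BY_HOUSEHOLD.get(a.household_size)
    if limit is None or a.monthly_income > limit:
        # Over-income cases may still qualify under exceptions, so never auto-reject.
        return ScreeningResult.NEEDS_HUMAN_REVIEW

    return ScreeningResult.LIKELY_ELIGIBLE
```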

For those who pass the basic criteria, the AI would proceed to collect case facts: “Please describe what’s happening with your housing situation.” As the user writes or speaks (in a typed chat or possibly voice in the future), the AI will parse the narrative and ask smart follow-ups. For example, if the client says “I’m being evicted for not paying rent,” the AI might follow up: “Have you received court papers (an unlawful detainer lawsuit) from your landlord, or just a pay-or-quit notice?” – aiming to distinguish a looming eviction from an active court case. This dynamic Q&A continues until the AI has enough detail to fill out an intake template (or until it senses diminishing returns from more questions). The conversation is designed to feel like a natural interview with empathy and clarity.

After gathering info, the handoff to humans occurs. The AI will compile a summary of the intake: key facts like names, important dates (e.g., eviction hearing date if any), and the client’s stated goals or concerns. It may also tentatively flag certain legal issues or urgency indicators – for instance, “Client might qualify for a disability accommodation defense” or “Lockout situation – urgent” – based on what it learned. This summary and the raw Q&A transcript are then forwarded to LASSB’s intake staff or attorneys. A human will review the package, double-check eligibility (the AI’s work is a recommendation, not final), and then follow up with the client. In some cases, the AI might be able to immediately route the client: for example, scheduling them for the next eviction clinic or providing a link to self-help resources while they wait.
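As a rough sketch of what that handoff package could look like, the summary might be a small structured record that staff skim alongside the full transcript. The field names below are assumptions chosen to mirror the report sections described above, not the prototype’s actual schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IntakeSummary:
    client_name: str
    contact_info: str
    county: str
    problem_type: str                  # e.g. "nonpayment eviction", "illegal lockout"
    key_dates: List[str] = field(default_factory=list)      # e.g. ["3-day notice: Jan 1", "hearing: Apr 1"]
    client_goals: str = ""
    urgency_flags: List[str] = field(default_factory=list)  # e.g. ["Lockout situation - urgent"]
    possible_issues: List[str] = field(default_factory=list)  # tentative, for staff to confirm
    transcript: str = ""               # full Q&A, kept for accountability and review


def to_staff_report(s: IntakeSummary) -> str:
    """Render the summary as the short text report staff review before calling the client back."""
    lines = [
        f"Client: {s.client_name} ({s.contact_info}, {s.county} County)",
        f"Issue: {s.problem_type}",
        "Key dates: " + ("; ".join(s.key_dates) or "none reported"),
        "Urgency flags: " + ("; ".join(s.urgency_flags) or "none"),
        "Possible legal issues (AI-suggested, unverified): " + ("; ".join(s.possible_issues) or "none"),
        f"Client goals: {s.client_goals}",
    ]
    return "\n".join(lines)
```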

But major decisions, like accepting the case for full representation or giving legal advice, remain with human professionals. The human staff thus step in at the “decision” stage with a lot of the grunt work already done. They can spend their time verifying critical details and providing counsel, rather than laboriously collecting background info. This hybrid workflow means clients get faster initial engagement (potentially instantaneous via AI, instead of waiting days for a call) and staff time is used more efficiently where their expertise is truly needed.

Feedback-Shaped Vision

The envisioned workflow was refined through feedback from LASSB stakeholders and experts during the project. Early on, LASSB’s attorneys emphasized that high-stakes decisions must remain human – for instance, deciding someone is ineligible or giving them legal advice about what to do would require a person. This feedback led the team to build guardrails so the AI does not give definitive legal conclusions or turn anyone away without human oversight. Another piece of feedback was about tone and trauma-informed practice. LASSB staff noted that many clients are distressed; a cold or robotic interview could alienate them. In response, the team made the AI’s language extra supportive and user-friendly, adding polite affirmations (“Thank you for sharing that information”) and apologies (“I’m sorry you’re dealing with this”) where appropriate. 

They also ensured the AI would ask for sensitive details in a careful way and only if necessary. For example, rather than immediately asking “How much is your income?” which might feel intrusive, the AI might first explain “We ask income because we have to confirm eligibility – roughly what is your monthly income?” to give context. The team also got input on workflow integration – intake staff wanted the AI system to feed into their existing case management software (LegalServer) so that there’s no duplication of data entry. This shaped the plan for implementation (i.e., designing the output in a format that can be easily transferred). Finally, feedback from technologists and the class instructors encouraged the use of a combined approach (rules + AI). This meant not relying on the AI alone to figure out eligibility from scratch, but to use simple rule-based checks for clear-cut criteria (citizenship, income threshold) and let the AI focus on understanding the narrative and generating follow-up questions. 

This hybrid approach was validated by outside research as well. All of these inputs helped refine the future workflow into one that is practical, safe, and aligned with LASSB’s needs: AI handles the heavy lifting of asking and recording, while humans handle the nuanced judgment calls and personal touch.


3: Prototyping and Technical Work

Initial Concepts from Autumn Quarter 

During the Autumn 2024 quarter, the student team explored the problem space and brainstormed possible AI interventions for LASSB. The partner had come with a range of ideas, including using AI to assist with emergency eviction filings. One early concept was an AI tool to help tenants draft a “motion to set aside” a default eviction judgment – essentially, a last-minute court filing to stop a lockout. This is a high-impact task (it can literally keep someone housed), but also high-risk and time-sensitive. Through discussions with LASSB, the team realized that automating such a critical legal document might be too ambitious as a first step – errors or bad advice in that context could have severe consequences. 

Moreover, to draft a motion, the AI would still need a solid intake of facts to base it on. This insight refocused the team on the intake stage as the foundation. Another concept floated was an AI that could analyze a tenant’s story to spot legal defenses (for example, identifying if the landlord failed to make repairs as a defense to nonpayment). While appealing, this again raised the concern of false negatives (what if the AI missed a valid defense?) and overlapped with legal advice. Feedback from course mentors and LASSB steered the team toward a more contained use case: improving the intake interview itself.

By the end of Autumn quarter, the students presented a concept for an AI intake chatbot that would ask clients the right questions and produce an intake summary for staff. The concept kept human review in the loop, aligning with the consensus that AI should support, not replace, the expert judgment of LASSB’s legal team.

Revised Scope in Winter 

Going into Winter quarter, the project’s scope was refined and solidified. The team committed to a limited use case – the AI would handle initial intake for housing matters only, and it would not make any final eligibility determinations or provide legal advice. All high-stakes decisions were deferred to staff. For example, rather than programming the AI to tell a client “You are over income, we cannot help,” the AI would instead flag the issue for a human to confirm and follow up with a personalized referral if needed. Likewise, the AI would not tell a client “You have a great defense, here’s what to do” – instead, it might say, “Thank you, someone from our office will review this information and discuss next steps with you.” By narrowing the scope to fact-gathering and preliminary triage, the team could focus on making the AI excellent at those tasks, while minimizing ethical risks. They also limited the domain to housing (evictions, landlord/tenant issues) rather than trying to cover every legal issue LASSB handles. This allowed the prototype to be more finely tuned with housing-specific terminology and questions. The Winter quarter also shifted toward implementation details – deciding on the tech stack and data inputs – now that the “what” was determined. The result was a clear mandate: build a prototype AI intake agent for housing that asks the right questions, captures the necessary data, and hands off to humans appropriately.

Prototype Development Details 

The team developed the prototype using a combination of Google’s Vertex AI platform and custom scripting. Vertex AI was chosen in part for its enterprise-grade security (important for client data) and its support for large language model deployment. Using Vertex AI’s generative AI tools, the students configured a chatbot with a predefined prompt that established the AI’s role and instructions. For example, the system prompt instructed: “You are an intake assistant for a legal aid organization. Your job is to collect information from the client about their housing issue, while being polite, patient, and thorough. You do not give legal advice or make final decisions. If the user asks for advice or a decision, you should defer and explain a human will help with that.” This kind of prompt served as a guardrail for the AI’s behavior.
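A minimal sketch of that setup, assuming the current vertexai Python SDK and a Gemini-family model (the class prototype was configured through Vertex AI’s generative tooling, so the exact model name and calls may have differed):

```python
import vertexai
from vertexai.generative_models import GenerativeModel

SYSTEM_PROMPT = """You are an intake assistant for a legal aid organization.
Your job is to collect information from the client about their housing issue,
while being polite, patient, and thorough. You do not give legal advice or make
final decisions. If the user asks for advice or a decision, defer and explain
that a human will help with that."""

vertexai.init(project="your-gcp-project", location="us-central1")  # placeholders

model = GenerativeModel("gemini-1.5-pro", system_instruction=SYSTEM_PROMPT)
chat = model.start_chat()

# Each user message goes through the same guarded persona.
reply = chat.send_message("Hi, I just got a 3-day notice from my landlord.")
print(reply.text)
```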

They also input a structured intake script derived from LASSB’s actual intake checklist. This script included key questions (citizenship, income, etc.) and conditional logic – for instance, if the client indicated a domestic violence issue tied to housing, the AI should ask a few DV-related questions (given LASSB has special protocols for DV survivors). Some of this logic was handled by embedding cues in the prompt like: “If the client mentions domestic violence, express empathy and ensure they are safe, then ask if they have a restraining order or need emergency assistance.” The team had to balance not making the AI too rigidly scripted (losing the flexibility of natural conversation) with not leaving it totally open-ended (which could lead to random or irrelevant questions). They achieved this by a hybrid approach: a few initial questions were fixed and rule-based (using Vertex AI’s dialogue flow control), then the narrative part used the LLM’s generative ability to ask appropriate follow-ups. 
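The hybrid structure can be sketched roughly as follows: a short, fixed list of rule-based questions runs first, and only the open-ended narrative portion is handed to the LLM. The question wording and the ask_user / ask_llm_follow_ups helpers are illustrative assumptions, not the prototype’s actual dialogue flow.

```python
FIXED_QUESTIONS = [
    ("county", "What county do you live in?"),
    ("household_size", "How many people live in your household?"),
    ("monthly_income", "Roughly what is your monthly household income? "
                       "(We ask because we have income limits to determine eligibility.)"),
    ("problem_type", "Is your issue an eviction, a lockout, repairs, or something else?"),
]


def run_intake(ask_user, ask_llm_follow_ups):
    """ask_user(text) -> str prompts the client; ask_llm_follow_ups(answers, narrative) -> dict
    lets the LLM ask follow-up questions about the narrative and returns what it learned."""
    answers = {}

    # Part 1: scripted, rule-based questions (deterministic, same order every time).
    for key, question in FIXED_QUESTIONS:
        answers[key] = ask_user(question)

    # Part 2: open-ended narrative, where the LLM's generative follow-up questioning takes over.
    narrative = ask_user("Please describe what's happening with your housing situation.")
    answers["narrative_details"] = ask_llm_follow_ups(answers, narrative)
    return answers
```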

The sample data used to develop and test the bot included a set of hypothetical client scenarios. The students wrote out example intakes (based on real patterns LASSB described) – e.g., “Client is a single mother behind 2 months rent after losing job; received 3-day notice; has an eviction hearing in 2 weeks; also mentions apartment has mold”. They fed these scenarios to the chatbot during development to see how it responded. This helped them identify gaps – for example, early versions of the bot forgot to ask whether the client had received court papers, and sometimes it didn’t ask about deadlines like a hearing date. Each iteration, they refined the prompt or added guidance until the bot consistently covered those crucial points.
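One way to automate that kind of gap-checking is a small coverage test that replays a scenario through the bot and verifies the transcript touched the crucial topics; the topic keywords and the run_scenario helper below are assumptions for illustration.

```python
# Topics every housing intake conversation should cover, with keywords that
# indicate the bot asked about them (illustrative, not exhaustive).
REQUIRED_TOPICS = {
    "court_papers": ["court papers", "unlawful detainer", "lawsuit"],
    "deadlines": ["hearing", "court date", "deadline"],
    "notice": ["notice", "pay-or-quit"],
    "income": ["income"],
}


def missing_topics(transcript: str) -> list[str]:
    """Return the required topics the bot never asked about in this transcript."""
    text = transcript.lower()
    return [
        topic for topic, keywords in REQUIRED_TOPICS.items()
        if not any(k in text for k in keywords)
    ]


# Example usage, assuming run_scenario(scenario_text) drives the chatbot with a
# scripted "client" and returns the full chat transcript:
# for scenario in scenarios:
#     gaps = missing_topics(run_scenario(scenario))
#     if gaps:
#         print(f"Scenario {scenario[:40]!r}... missed: {gaps}")
```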

Key Design Decisions

A number of design decisions were made to ensure the AI agent was effective and aligned with LASSB’s values.

Trauma-Informed Questioning 

The bot’s dialogue was crafted to be empathetic and empowering. Instead of bluntly asking “Why didn’t you pay your rent?,” it would use a non-judgmental tone: “Can you share a bit about why you fell behind on rent? (For example, loss of income, unexpected expenses, etc.) This helps us understand your situation.” 

The AI was also set to avoid repetitive pressing on distressing details. If a client had already said plenty about a conflict with their landlord, the AI would acknowledge that (“Thank you, I understand that must be very stressful”) and not re-ask the same thing just to fill a form. These choices were informed by trauma-informed lawyering principles LASSB adheres to, aiming to make clients feel heard and not blamed.

Tone and Language 

The AI speaks in plain, layperson’s language, not legalese. Internal rules like “FPI at 125% for XYZ funding” were translated into simple terms or hidden from the user. For instance, instead of asking “Is your income under 125% of the federal poverty guidelines?” the bot asks “Do you mind sharing your monthly income (approximately)? We have income limits to determine eligibility.” It also explains why it’s asking things, to build trust. The tone is conversational but professional – akin to a friendly paralegal. 

The team included some small talk elements at the start (“I’m here to help you with your housing issue. I will ask some questions to understand your situation.”) to put users at ease. Importantly, the bot never pretends to be a lawyer or a human; it was transparent that it’s a virtual assistant helping gather info for the legal aid.

Guardrails

Several guardrails were programmed to keep the AI on track. A major one was a do-not-do list in the prompt: do not provide legal advice, do not make guarantees, do not deviate into unrelated topics even if the user goes off-track. If the user asked a legal question (“What should I do about X?”), the bot was instructed to reply with something like: “I’m not able to give legal advice, but I will record your question for our attorneys. Let’s focus on getting the details of your situation, and our team will advise you soon.” 

Another guardrail was content moderation – e.g., if a user described intentions of self-harm or violence, the bot would give a compassionate response and alert a human immediately. Vertex AI’s content filter was leveraged to catch extreme situations. Additionally, the bot was prevented from asking for information that LASSB staff said they never need at intake (to avoid over-intrusive behavior). For example, it wouldn’t ask for a Social Security number, passwords, or the like, which also helps with security.
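A simple sketch of how such an escalation guardrail could work, assuming a keyword-based first pass layered on top of the platform’s built-in content filter; the phrase list and the notify_staff hook are illustrative assumptions.

```python
from typing import Optional

# Phrases that should immediately pull a human into the conversation (illustrative only).
RED_FLAG_PHRASES = [
    "hurt myself", "kill myself", "end it all", "nowhere to go",
    "hurt them", "thinking of doing something drastic",
]

SAFETY_REPLY = (
    "I'm really sorry you're going through this. Your safety matters: "
    "I'm alerting our team right now so a person can follow up with you as soon as possible. "
    "If you are in immediate danger, please call 911, or call or text 988 for crisis support."
)


def check_red_flags(user_message: str, notify_staff) -> Optional[str]:
    """Return a compassionate safety reply (and alert staff) if a red flag appears, else None."""
    text = user_message.lower()
    if any(phrase in text for phrase in RED_FLAG_PHRASES):
        notify_staff(user_message)  # e.g., page the on-call intake supervisor
        return SAFETY_REPLY
    return None
```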

User Flow and Interface

The user flow was deliberately kept simple. The prototype interface (tested in a web browser) would show one question at a time, and allow the user to either type a response or select from suggested options when applicable. The design avoids giant text boxes that might overwhelm users; instead, it breaks the interview into bite-sized exchanges (a principle from online form usability). 

After the last question, the bot would explicitly ask “Is there anything else you want us to know?” giving the user a chance to add details in their own words. Then the bot would confirm it has what it needs and explain the next steps: e.g., “Thank you for all this information. Our legal team will review it immediately. You should receive a call or email from us within 1 business day. If you have an urgent court date, you can also call our hotline at …” This closure message was included to ensure the user isn’t left wondering what happens next, a common complaint with some automated systems.

Risk Mitigation

The team did a review of what could go wrong: what risks of harm come with an intake agent? They then brainstormed what design, tech, and policy decisions could mitigate each of those risks.

Screening Agent risks and mitigations:

Risk: The client is monolingual, does not understand the AI’s questions, and does not provide sufficient or correct information to the Agent.
Mitigation: We are working towards the Screening Agent having multilingual capabilities, particularly Spanish-language skills.

Risk: The client is vision or hearing impaired and the Screening Agent does not understand the client.
Mitigation: The Screening Agent has voice-to-text for vision impaired clients and text-based options for hearing impaired clients. We can also train the Screening Agent to produce a list of questions it did not get answers to and route those to the Paralegal to ask.

Risk: The Screening Agent does not understand the client properly and generates incorrect information.
Mitigation: The Screening Agent will confirm and spell back important identifying information, such as names and addresses. It will be programmed to route back to an intake worker or Paralegal if the AI cannot understand the client. A LASSB attorney will review and confirm any final product with the client.

Risk: The client is insulted or in some other way offended by the Screening Agent.
Mitigation: The Screening Agent’s scope is limited to the Screening Questions, and it will also be trained on trauma-informed care. LASSB should also obtain the client’s consent before referring them to the Screening Agent.

Training and Iteration

Notably, the team did not train a new machine learning model from scratch; instead they used a pre-existing LLM (from Vertex, analogous to GPT-4 or PaLM2) and focused on prompt engineering and few-shot examples to refine its performance. They created a few example dialogues as part of the prompt to show the AI what a good intake looks like. For instance, an example Q&A in the prompt might demonstrate the AI asking clarifying questions and the user responding, so the model could mimic that style. 

The prototype’s development was highly iterative: the students would run simulated chats (playing the user role themselves or with peers) and analyze the output. When the AI did something undesirable – like asking a redundant question or missing a key fact – they would adjust the instructions or add a conditional rule. They also experimented with model parameters like temperature (choosing a relatively low temperature for more predictable, consistent questioning rather than creative, off-the-cuff responses). Over the Winter quarter, dozens of test conversations were conducted. 
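For example, a short example dialogue can be appended to the system instruction and the sampling temperature kept low so the questioning stays predictable. This is a sketch using the vertexai SDK, with an invented example exchange and SYSTEM_PROMPT carried over from the earlier sketch.

```python
from vertexai.generative_models import GenerativeModel, GenerationConfig

FEW_SHOT_EXAMPLE = """
Example of a good intake exchange:
Client: I got kicked out of my apartment.
Assistant: I'm sorry that happened. Did your landlord go to court and get an order,
or did they change the locks or remove your things without court papers?
Client: They just changed the locks while I was at work.
Assistant: Thank you for explaining. Do you remember what date the locks were changed?
"""

model = GenerativeModel(
    "gemini-1.5-pro",
    system_instruction=SYSTEM_PROMPT + FEW_SHOT_EXAMPLE,             # SYSTEM_PROMPT from the sketch above
    generation_config=GenerationConfig(temperature=0.2, top_p=0.8),  # low temperature = consistent questioning
)
```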

Midway through, they also invited LASSB staff to test the bot with sample scenarios. An intake supervisor typed in a scenario of a tenant family being evicted after one member lost a job, and based on the supervisor’s feedback, the team tweaked the bot to be more sensitive when asking about income (the supervisor felt the bot should explicitly mention that services are free and confidential, to reassure clients as they disclose personal information). The final prototype by March 2025 was able to handle a realistic intake conversation end-to-end: from greeting to summary output. 

The output was formatted as a structured text report (with sections for client info, issue summary, and any urgent flags) that a human could quickly read. The technical work thus culminated in a working demo of the AI intake agent ready for evaluation.

4: Evaluation and Lessons Learned

Evaluating Quality and Usefulness

The team approached evaluation on multiple dimensions – accuracy of the intake, usefulness to staff, user experience, and safety. 

First, the team created a quality rubric describing what ‘good’ or ‘bad’ performance would look like.

Good-Bad Rubric on Screening Performance

A successful agent will be able to obtain answers from the client for all relevant Screening questions in the format best suited to the client (i.e., verbally or in writing, and in English or Spanish). A successful agent will also be able to ask some open-ended questions about the client’s legal problem, saving the Housing Attorney and Clinic Attorney time spent discussing the client’s legal problem. Ultimately, a successful AI Screening agent will be able to perform pre-screening and Screening for clients.

✅A good Screening agent will be able to accurately record all of the client’s information and ensure that there are no mistakes in spelling or other details. 

❌A bad Screening agent would produce incorrect information and misunderstand the clients.  A bad solution would require the LASSB users to cross-check and amend lots of the information with the client.

✅A good Screening agent will be user-friendly for clients, in a format already familiar to the client, such as text or a phone call.

❌ A bad Screening agent would require clients, many of whom may be unsophisticated, to use systems they are not familiar with and would be difficult to use.

✅A good Screening agent would be multilingual.

❌ A bad Screening agent would only understand clients who spoke very clearly and in a particular format.

✅ A good Screening agent would be accessible for clients with disabilities, including vision or audio impaired clients.  

❌A bad Screening agent would not be accessible to clients with disabilities. A bad solution would not be accessible on a client’s phone.

✅A good Screening agent will respond to clients in a trauma-informed manner. A good AI Screening agent will appear kind and make clients feel comfortable.

❌A bad Screening agent would offend the clients and make the clients reluctant to answer the questions.

✅A good Screening agent will produce a transcript of the interview that enables the LASSB attorneys and paralegals to understand the client’s situation efficiently. To do this, the agent could produce a summary of the key points from the Screening questions. It is also important that the transcript is searchable and easy to navigate so that the LASSB attorneys can easily locate information.

❌A bad Screening agent would produce a transcript that is difficult to navigate, making it hard to identify key information. For example, it may produce a large PDF that is not searchable and provides no easy way to find the responses to the questions. 

✅A good Screening agent need not get through the questions as quickly as possible, but it must be able to redirect the client back to the questions to ensure that the client answers all the necessary ones.

❌A bad Screening agent would get distracted from the clients’ responses and not obtain answers to all the questions.

In summary, the main metrics against which the Screening Agent should be measured include:

  1. Accuracy: whether the agent matches human performance or produces errors in fewer cases;
  2. User satisfaction: how satisfied the client and LASSB personnel using the agent are; and
  3. Efficiency: how much time the agent takes to obtain answers to all 114 pre-screening and Screening questions.

Testing the Prototype

To test accuracy, they compared the AI’s screening and issue-spotting to that of human experts. They prepared 16 sample intake scenarios (inspired by real cases, similar to what other researchers have done) and for each scenario they had a law student or attorney determine the expected “intake outcome” (e.g., eligible vs. not eligible, and key issues identified). Then they ran each scenario through the AI chatbot and examined the results. The encouraging finding was that the AI correctly identified eligibility in the vast majority of cases, and when uncertain, it appropriately refrained from a definitive judgment – often saying a human would review. For example, in a scenario where the client’s income was slightly above the normal cutoff but they had a disability (which could qualify them under an exception), the AI noted the income issue but did not reject the case; it tagged it for staff review. This behavior aligned with the design goal of avoiding false negatives. 
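A minimal sketch of that comparison loop is below; the scenario data, expert labels, and the run_intake_bot helper are placeholders, since the actual evaluation was done by hand across the 16 scenarios.

```python
from collections import Counter

# Each test case pairs a scenario with the expert's expected outcome.
# Outcomes: "eligible", "ineligible", or "review" (send to a human).
test_cases = [
    {"scenario": "Single mother, 2 months behind on rent, 3-day notice, hearing in 2 weeks, mold in unit",
     "expert": "eligible"},
    {"scenario": "Tenant slightly over income limit but has a disability", "expert": "review"},
    # ... remaining scenarios ...
]


def evaluate(run_intake_bot):
    """run_intake_bot(scenario) -> predicted outcome string (assumed helper driving the chatbot)."""
    tallies = Counter()
    for case in test_cases:
        predicted = run_intake_bot(case["scenario"])
        if predicted == case["expert"]:
            tallies["match"] += 1
        elif case["expert"] == "eligible" and predicted == "ineligible":
            tallies["false_denial"] += 1   # the critical error class: must stay at zero
        else:
            tallies["other_mismatch"] += 1
    total = len(test_cases)
    print(f"Alignment with experts: {tallies['match']}/{total}")
    print(f"False denials: {tallies['false_denial']}")
    return tallies
```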

In fact, across the test scenarios, the AI never outright “turned away” an eligible client. At worst, it sometimes told an ineligible client that it “might not” qualify and a human would confirm – a conservative approach that errs on inclusion. In terms of issue-spotting, the AI’s performance was good but not flawless. It correctly zeroed in on the main legal issue (e.g., nonpayment eviction, illegal lockout, landlord harassment) in nearly all cases. In a few complex scenarios, it missed secondary issues – for instance, a scenario involved both eviction and a housing code violation (mold), and the AI summary focused on the eviction but didn’t highlight the possible habitability claim. When attorneys reviewed this, they noted a human intake worker likely would have flagged the mold issue for potential affirmative claims. This indicated a learning: the AI might need further training or prompts to capture all legal issues, not just the primary one.

To gauge usefulness and usability, the team turned to qualitative feedback. They had LASSB intake staff and a couple of volunteer testers act as users in mock intake interviews with the AI. Afterward, they surveyed them on the experience. The intake staff’s perspective was crucial: they reviewed the AI-generated summaries alongside what typical human intake notes would look like. The staff generally found the AI summaries usable and in many cases more structured than human notes. The AI provided a coherent narrative of the problem and neatly listed relevant facts (dates, amounts, etc.), which some staff said could save them a few minutes per case in writing up memos. One intake coordinator commented that the AI “asked all the questions I would have asked” in a standard tenancy termination case – a positive sign of completeness. 

On the client side, volunteer testers noted that the AI was understandable and polite, though a few thought it was a bit “formal” in phrasing. This might reflect the fine line between professional and conversational tone – a point for possible adjustment. Importantly, testers reported that they “would be comfortable using this tool” and would trust that their information gets to a real lawyer. The presence of clear next-step messaging (that staff would follow up) seemed to reassure users that they weren’t just shouting into a void. The team also looked at efficiency metrics: In simulation, the AI interview took about 5–10 minutes of user time on average, compared to ~15 minutes for a typical phone intake. Of course, these were simulated users; real clients might take longer to type or might need more clarification. But it suggested the AI could potentially cut intake time by around 30-50% for straightforward cases, a significant efficiency gain.

Benchmarks for AI Performance

In designing evaluation, the team drew on emerging benchmarks in the AI & justice field. They set some target benchmarks such as: 

  • Zero critical errors (no client who should be helped is mistakenly rejected by the AI, and no obviously wrong information given), 
  • at least 80% alignment with human experts on identifying case eligibility (they achieved ~90% in testing), and 
  • high user satisfaction (measured informally via feedback forms). 

For safety, a benchmark was that the AI should trigger human intervention in 100% of cases where certain red flags appear (like mention of self-harm or urgent safety concerns). In test runs, there was one scenario where a client said something like “I have nowhere to go, I’m so desperate I’m thinking of doing something drastic.” 

The AI appropriately responded with empathy and indicated that it would notify the team for immediate assistance – meeting the safety benchmark. Another benchmark was privacy and confidentiality – the team checked that the AI was not inadvertently storing data outside approved channels. All test data was kept in a sandbox environment and they planned that any actual deployment would comply with confidentiality policies (e.g., not retaining chat transcripts longer than needed and storing them in LASSB’s secure system).

Feedback from Attorneys and Technologists

The prototype was demonstrated to a group of LASSB attorneys, intake staff, and a few technology advisors in late Winter quarter. The attorneys provided candid feedback. One housing lawyer was initially skeptical – concerned an AI might miss the human nuance – but after seeing the demo, they remarked that “the output is like what I’d expect from a well-trained intern or paralegal.” They appreciated that the AI didn’t attempt to solve the case but simply gathered information systematically. Another attorney asked about bias – whether the AI might treat clients differently based on how they talk (for instance, if a client is less articulate, would the AI misunderstand?). 

In response, the team showed how the AI asks gentle clarifying questions if it’s unsure, and they discussed plans for continuous monitoring to catch any biased outcomes. The intake staff reiterated that the tool could be very helpful as an initial filter, especially during surges. They did voice a concern: “How do we ensure the client’s story is accurately understood?” This led to a suggestion that in the pilot phase, staff double-check key facts with the client (“The bot noted you got a 3-day notice on Jan 1, is that correct?”) to verify nothing was lost in translation. 

Technologists (including advisors from the Stanford Legal Design Lab) gave feedback on the technical approach. They supported the use of rule-based gating combined with LLM follow-ups, noting that other projects (like the Missouri intake experiment) have found success with that hybrid model. They also advised to keep the model updated with policy changes – e.g., if income thresholds or laws change, those need to be reflected in the AI’s knowledge promptly, which is more of an operational challenge than a technical one. Overall, the feedback from all sides was that the prototype showed real promise, provided it’s implemented carefully. Stakeholders were excited that it could improve capacity, but they stressed that proper oversight and iterative improvement would be key before using it live with vulnerable clients.

What Worked Well in Testing

Several aspects of the project went well. First, the AI agent effectively mirrored the standard intake procedure, indicating that the effort to encode LASSB’s intake script was successful. It consistently asked the fundamental eligibility questions and gathered core facts without needing human prompting. This shows that a well-structured prompt and logic can guide an LLM to perform a complex multi-step task reliably. 

Second, the LLM’s natural language understanding proved advantageous. It could handle varied user inputs – whether someone wrote a long story all at once or gave terse answers, the AI adapted. In one test, a user rambled about their landlord “kicking them out for no reason, changed locks, etc.” and the AI parsed that as an illegal lockout scenario and asked the right follow-up about court involvement. The ability to parse messy, real-life narratives and extract legal-relevant details is where AI shined compared to rigid forms. 

Third, the tone and empathy embedded in the AI’s design appeared to resonate. Test users noted that the bot was “surprisingly caring”. This was a victory for the team’s design emphasis on trauma-informed language – it validated that an AI can be programmed to respond in a way that feels supportive (at least to some users). 

Fourth, the AI’s cautious approach to eligibility (not auto-rejecting) worked as intended. In testing, whenever a scenario was borderline, the AI prompted for human review rather than making a call. This matches the desired ethical stance: no one gets thrown out by a machine’s decision alone. Finally, the process of developing the prototype fostered a lot of knowledge transfer and reflection. LASSB staff mentioned that just mapping out their intake logic for the AI helped them identify a few inefficiencies in their current process (like questions that might not be needed). So the project had a side benefit of process improvement insight for the human system too.

What Failed or Fell Short in Testing

Despite the many positives, there were also failures and limitations encountered. One issue was over-questioning. The AI sometimes asked one or two questions too many, which could test a user’s patience. For example, in a scenario where the client clearly stated “I have an eviction hearing on April 1,” an earlier version of the bot still asked “Do you know if there’s a court date set?” which was redundant. This kind of repetition, while minor, could annoy a real user. It stemmed from the AI not having a perfect memory of prior answers unless carefully constrained – a known quirk of LLMs. The team addressed some instances by refining prompts, but it’s something to watch in deployment. 

Another shortcoming was handling of multi-issue situations. If a client brought up multiple problems (say eviction plus a related family law issue), the AI got somewhat confused about scope. In one test, a user mentioned being evicted and also having a dispute with a roommate who is a partner – mixing housing and personal relationship issues. The AI tried to be helpful by asking about both, but that made the interview unfocused. This highlights that AI may struggle with scope management – knowing what not to delve into. A design decision for the future might be to explicitly tell the AI to stick to housing and ignore other legal problems (while perhaps flagging them for later). 

Additionally, there were challenges with the AI’s legal knowledge limits. The prototype did not integrate an external legal knowledge base; it relied on the LLM’s trained knowledge (up to its cutoff date). While it generally knew common eviction terms, it might not know the latest California-specific procedural rules. For instance, if a user asked, “What is an Unlawful Detainer?” the AI provided a decent generic answer in testing, but we hadn’t formally allowed it to give legal definitions (since that edges into advice). If not carefully constrained, it might give incorrect or jurisdictionally wrong info. This is a risk the team noted: for production, one might integrate a vetted FAQ or knowledge retrieval component to ensure any legal info given is accurate and up-to-date.

We also learned that the AI could face moderation or refusal issues for certain sensitive content. As seen in other research, certain models have content filters that might refuse queries about violence or illegal activity. In our tests, when a scenario involved domestic violence, the AI handled it appropriately (did not refuse; it responded with concern and continued). But we were aware that some LLMs might balk or produce sanitized answers if a user’s description includes abuse details or strong language. Ensuring the AI remains able to discuss these issues (in a helpful way) is an ongoing concern – we might need to adjust settings or choose models that allow these conversations with proper context. 

Lastly, the team encountered the mundane but important challenge of integrating with existing systems. The prototype worked in a standalone environment, but LASSB’s real intake involves LegalServer and other databases. We didn’t fully solve how to plug the AI into those systems in real-time. This is less a failure of the AI per se and more a next-step technical hurdle, but it’s worth noting: a tool is only useful if it fits into the workflow. We attempted a small integration by outputting the summary in a format similar to a LegalServer intake form, but a true integration would require more IT development.

Why These Issues Arose

Many of the shortcomings trace back to the inherent limitations of current LLM technology and the complexity of legal practice. The redundant questions happened because the AI doesn’t truly understand context like a human, it only predicts likely sequences. If not explicitly instructed, it might err on asking again to be safe. Our prompt engineering reduced but didn’t eliminate this; it’s a reminder that LLMs need carefully bounded instructions. The scope creep with multiple issues is a byproduct of the AI trying to be helpful – it sees mention of another problem and, without human judgment about relevance, it goes after it. This is where human intake workers naturally filter and focus, something an AI will do only as well as it’s told to. 

Legal knowledge gaps are expected because an LLM is not a legal expert and can’t be updated like a database without re-training. We mitigated risk by not relying on it to give legal answers, but any subtle knowledge it applied (like understanding eviction procedure) comes from its general training, which might not capture local nuances. The team recognized that a retrieval-augmented approach (providing the AI with reference text like LASSB’s manual or housing law snippets) could improve factual accuracy, but that was beyond the initial prototype’s scope. 

Content moderation issues arise from the AI provider’s safety guardrails – these are important to have (to avoid harmful outputs), but they can be a blunt instrument. Fine-tuning them for a legal aid context (where discussions of violence or self-harm are sometimes necessary) is tricky and likely requires collaboration with the provider or switching to a model where we have more control. The integration challenge simply comes from the fact that legal aid tech stacks were not designed with AI in mind. Systems like LegalServer are improving their API offerings, but knitting together a custom AI with legacy systems is non-trivial. This is a broader lesson: often the tech is ahead of the implementation environment in nonprofits.

Lessons on Human-AI Teaming and Client Protection 

Developing this prototype yielded valuable lessons about how AI and humans can best collaborate in legal services. One clear lesson is that AI works best as a junior partner, not a solo actor. Our intake agent performed well when its role was bounded to assisting – gathering info, suggesting next steps – under human supervision. The moment we imagined expanding its role (like it drafting a motion or advising a client), the complexity and risk jumped exponentially. So, the takeaway for human-AI teaming is to start with discrete tasks that augment human work. The humans remain the decision-makers and safety net, which not only protects clients but also builds trust among staff. Initially, some LASSB staff were worried the AI might replace them or make decisions they disagreed with. By designing the system to clearly feed into the human process (rather than bypass it), we gained staff buy-in. They began to see the AI as a tool – like an efficient paralegal – rather than a threat. This cultural acceptance is crucial for any such project to succeed.

We also learned about the importance of transparency and accountability in the AI’s operation. For human team members to rely on the AI, they need to know what it asked and what the client answered. Black-box summaries aren’t enough. That’s why we ensured the full Q&A transcript is available to the staff reviewing the case. This way, if something looks off in the summary, the human can check exactly what was said. It’s a form of accountability for the AI. In fact, one attorney noted this could be an advantage: “Sometimes I wish I had a recording or transcript of the intake call to double-check details – this gives me that.” However, this raises a client protection consideration: since the AI interactions are recorded text, safeguarding that data is paramount (whereas a phone call’s content might not be recorded at all). We have to treat those chat logs as confidential client communications. This means robust data security and policies on who can access them.

From the client’s perspective, a lesson is that AI can empower clients if used correctly. Some testers said they felt more in control typing out their story versus speaking on the phone, because they could see what they wrote and edit their thoughts. The AI also never expresses shock or judgment, which some clients might prefer. However, others might find it impersonal or might struggle if they aren’t literate or tech-comfortable. So a takeaway is that AI intake should be offered as an option, not the only path. Clients should be able to choose a human interaction if they want. That choice protects client autonomy and ensures we don’t inadvertently exclude those who can’t or won’t use the technology (due to disability, language, etc.).

Finally, the project underscored that guarding against harm requires constant vigilance. We designed many protections into the system, but we know that only through real-world use will new issues emerge. One must plan to continuously monitor the AI’s outputs for any signs of bias, error, or unintended effects on clients. For example, if clients start treating the AI’s words as gospel (even though we tell them a human will follow up), we might need to reinforce disclaimers or adjust messaging. Human-AI teaming in legal aid is thus not a set-and-forget deployment; it’s an ongoing partnership where the technology must be supervised and updated by the humans running it. As one of the law students quipped, “It’s like having a really smart but somewhat unpredictable intern – you’ve got to keep an eye on them.” This captures well the role of AI: helpful, yes, but still requiring human oversight to truly protect and serve the client’s interests.

5: Recommendations and Next Steps

Immediate Next Steps for LASSB

With the prototype built and initial evaluations positive, LASSB is poised to take the next steps toward a pilot. In the near term, a key step is securing approval and support from LASSB leadership and stakeholders. This includes briefing the executive team and possibly the board about the prototype’s capabilities and limitations, to get buy-in for moving forward. (Notably, LASSB’s executive director is already enthusiastic about using AI to streamline services.) 

Concurrently, LASSB should engage with its IT staff or consultants to plan integration of the AI agent with their systems. This means figuring out how the AI will receive user inquiries (e.g., via the LASSB website or a dedicated phone text line) and how the data will flow into their case management. 

A concrete next step is a small-scale pilot deployment of the AI intake agent in a controlled setting. One suggestion is to start with after-hours or overflow calls: for example, when the hotline is closed, direct callers to an online chat with the AI agent as an initial intake, with clear messaging that someone will follow up next day. This would allow testing the AI with real users in a relatively low-risk context (since those clients would likely otherwise just leave a voicemail or not connect at all). Another approach is to use the AI internally first – e.g., have intake staff use the AI in parallel with their own interviewing (almost like a decision support tool) to see if it captures the same info.

LASSB should also pursue any necessary training or policy updates. Staff will need to be trained on how to review AI-collected information and coached not to trust it blindly but to verify critical details. Policies may also need updating to address AI usage – for instance, adding procedures for AI-assisted cases to the intake protocol manual. 

Additionally, client consent and awareness must be addressed. A near-term task is drafting a short consent notice for clients using the AI (e.g., “You are interacting with LASSB’s virtual assistant. It will collect information that will be kept confidential and reviewed by our legal team. This assistant is not a lawyer and cannot give legal advice. By continuing you consent to this process.”). This ensures ethical transparency and could be implemented easily at the start of the chat. In summary, the immediate next steps revolve around setting up a pilot environment: getting green lights, making technical arrangements, and preparing staff and clients for the introduction of the AI intake agent.

Toward Pilot and Deployment

To move from prototype to a live pilot, a few things are needed. 

Resource investment is one – while the prototype was built by students, sustaining and improving it will require dedicated resources. LASSB may need to seek a grant or allocate budget for an “AI Intake Pilot” project. This could fund a part-time developer or an AI service subscription (Vertex AI or another platform) and compensate staff time spent on oversight. Given the interest in legal tech innovation, LASSB might explore funding from sources like LSC’s Technology Initiative Grants or private foundations interested in access to justice tech. 

Another requirement is to select the right technology stack for production. The prototype used Vertex AI; LASSB will need to decide whether to continue with that platform (ensuring it meets confidentiality requirements) or shift to a different solution. Some legal aid organizations are exploring open-source models or on-premises solutions for greater control. The trade-offs (development effort vs. control) should be weighed. It might be simplest initially to use a managed service like Vertex or OpenAI’s API with a strict data use agreement (OpenAI, for example, allows opting out of data retention). 

On the integration front, LASSB should coordinate with its case management vendor (LegalServer) to integrate the intake outputs. LegalServer has an API and web intake forms; possibly the AI can populate a hidden web form with the collected data or attach a summary to the client’s record. Close collaboration with the vendor could streamline this – maybe an opportunity for the vendor to pilot integration as well, since many legal aids might want this functionality.
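
The exact hand-off will depend on LegalServer’s API and LASSB’s configuration, but the basic idea is to post the AI’s structured output to the case management system once the conversation ends. The sketch below is a hypothetical illustration only – the URL, field names, and authentication are placeholders, not LegalServer’s actual interface.

```python
import requests  # third-party HTTP client, used here for brevity

# NOTE: the URL, field names, and token below are hypothetical placeholders.
# A real integration would follow LegalServer's documented intake API
# (or populate one of its web intake forms) per LASSB's configuration.
LEGALSERVER_INTAKE_URL = "https://example-lassb.legalserver.example/api/v1/intakes"
API_TOKEN = "REPLACE_WITH_SECRET_FROM_VAULT"

def push_intake_summary(summary: dict, transcript: str) -> str:
    """Send the AI-collected intake summary and the full Q&A transcript
    to the case management system; return the new record's ID."""
    payload = {
        "first_name": summary["first_name"],
        "last_name": summary["last_name"],
        "legal_problem_code": summary["problem_code"],  # e.g., eviction
        "intake_notes": summary["narrative"],
        "ai_transcript": transcript,                    # kept for staff review
        "source": "ai_intake_pilot",
    }
    resp = requests.post(
        LEGALSERVER_INTAKE_URL,
        json=payload,
        headers={"Authorization": f"Bearer {API_TOKEN}"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["matter_id"]
```

Sending the full transcript alongside the structured fields keeps the complete record available to the reviewing staff member, consistent with the transparency practice described earlier in this report.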

As deployment nears, testing and monitoring protocols must be in place. For the pilot, LASSB should define how it will measure success: e.g., reduction in wait times, number of intakes successfully processed by AI, client satisfaction surveys, etc. They should schedule regular check-ins (say weekly) during the pilot to review transcripts and outcomes. Any errors or missteps the AI makes in practice should be logged and analyzed to refine the system (prompt tweaks or additional training examples). It’s also wise to have a clear fallback plan: if the AI system malfunctions or a user is unhappy with it, there must be an easy way to route them to a human immediately. For instance, a button that says “I’d like to talk to a person now” should always be available. From a policy standpoint, LASSB might also want to loop in the California State Bar or ethics bodies just to inform them of the project and ensure there are no unforeseen compliance issues. While the AI is just facilitating intake (not giving legal advice independently), being transparent with regulators can build trust and preempt concerns.

Broader Lessons for Replication 

The journey of building the AI Intake Agent for LASSB offers several lessons for other legal aid organizations considering similar tools:

Start Small and Specific

One lesson is to narrow the use case initially. Rather than trying to build a do-it-all legal chatbot, focus on a specific bottleneck. For us it was housing intake; for another org it might be triaging a particular clinic or automating a frequently used legal form. A well-defined scope makes the project manageable and the results measurable. It also limits the risk surface. The success of both Missouri’s project and ours came from targeting a concrete task (intake triage) rather than the whole legal counseling process.

Human-Centered Design is Key

Another lesson is the importance of deep collaboration with the end-users (both clients and staff). The LASSB team’s input on question phrasing, workflow, and what not to automate was invaluable. Legal aid groups should involve their intake workers, paralegals, and even clients (if possible via user testing) from day one. This ensures the AI solution actually fits into real-world practice and addresses real pain points. It’s tempting to build tech in a vacuum, but as we saw, something as nuanced as tone (“Are we sounding too formal?”) only gets addressed through human feedback. For the broader community, sharing design workbooks or guides can help – in fact, the Stanford team developed an AI pilot design workbook to aid others in scoping use cases and thinking through user personas.

Combine Rules and AI for Reliability

A clear takeaway from both our project and others in the field is that a hybrid approach yields the best results. Pure end-to-end AI (just throwing an LLM at the problem) might work 80% of the time, but the 20% it fails could be dangerous. By combining rule-based logic (for hard eligibility cutoffs or mandatory questions) with the flexible reasoning of LLMs, we got a system that was both consistent and adaptable. Legal aid orgs should consider leveraging their existing expertise (their intake manuals, decision trees) in tandem with AI, rather than assuming the AI will infer all the rules itself. This also makes the system more transparent – the rules part can be documented and audited easily.
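
To make the hybrid pattern concrete, here is a minimal sketch in Python of how hard rules and an LLM can divide the work. The income figures and the ask_llm helper are illustrative placeholders; a real system would load the organization’s current eligibility cutoffs and call whatever model service it has vetted.

```python
from dataclasses import dataclass

# Illustrative cutoffs only - a real deployment would load the organization's
# current income guidelines rather than hard-coding them here.
MONTHLY_INCOME_LIMIT = {1: 1900, 2: 2550, 3: 3230, 4: 3900}

@dataclass
class Applicant:
    household_size: int
    monthly_income: float
    county: str
    narrative: str  # the applicant's own description of the problem

def rule_based_screen(a: Applicant) -> str:
    """Hard cutoffs live in auditable code, not in the model."""
    if a.county not in {"San Bernardino", "Riverside"}:
        return "refer_out"           # outside the service area
    limit = MONTHLY_INCOME_LIMIT.get(a.household_size, 3900)
    if a.monthly_income > limit:
        return "needs_human_review"  # over-income: escalate, never auto-deny
    return "financially_eligible"

def ask_llm(prompt: str) -> str:
    """Placeholder for a call to whatever LLM service the org has vetted."""
    return "[model-generated summary would appear here]"

def screen_and_summarize(a: Applicant) -> dict:
    """Rules decide eligibility; the LLM only summarizes the narrative."""
    return {
        "eligibility": rule_based_screen(a),
        "summary": ask_llm(
            "Summarize this housing problem in three sentences for an intake "
            f"paralegal. Do not add facts. Narrative: {a.narrative}"
        ),
    }

applicant = Applicant(2, 1800.0, "San Bernardino",
                      "My landlord served a 3-day notice after I lost my job.")
print(screen_and_summarize(applicant))
```

Because the rules are plain code, they can be reviewed and audited the same way the organization’s intake manual is today.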

Don’t Neglect Data Privacy and Ethics

Any org replicating this should prioritize confidentiality and client consent. Our approach was to treat AI intake data with the same confidentiality as any intake conversation. Others should do the same and ensure their AI vendors comply. This might mean negotiating a special contract or using on-prem solutions for sensitive data. Ethically, always disclose to users that they’re interacting with AI. We found users didn’t mind as long as they knew a human would be involved downstream. But failing to disclose could undermine trust severely if discovered. Additionally, groups should be wary of algorithmic bias.

Test your AI with diverse personas – different languages, education levels, etc. – to see if it performs equally well. If your client population includes non-English speakers, make multi-language support a requirement from the start (some LLMs handle multilingual intake, or you might integrate translation services).

Benchmark and Share Outcomes

We recommend that legal aid tech pilots establish clear benchmark metrics (like we did for accuracy and false negatives) and openly share their results. This helps the whole community learn what is acceptable performance and where the bar needs to be. As AI in legal aid is still new, a shared evidence base is forming. For example, our finding of ~90% agreement with human intake decisions and 0 false denials in testing is encouraging, but we need more data from other contexts to validate that standard. JusticeBench (or similar networks) could maintain a repository of such pilot results and even anonymized transcripts to facilitate learning. The Medium article “A Pathway to Justice: AI and the Legal Aid Intake Problem” highlights some early adopters like LANC and CARPLS, and calls for exactly this kind of knowledge sharing and collaboration. Legal aid orgs should tap into these networks – there’s an LSC-funded AI working group inviting organizations to share their experiences and tools. Replication will be faster and safer if we learn from each other.
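
For teams setting up similar benchmarks, the two headline numbers – agreement with human decisions and the number of false denials – can be computed directly from paired decisions on a shared test set. A small sketch, assuming each test case has been labeled by both a human reviewer and the AI:

```python
def benchmark(pairs: list[tuple[str, str]]) -> dict:
    """pairs: (human_decision, ai_decision) for the same intake,
    where each decision is either 'accept' or 'decline'."""
    agree = sum(1 for human, ai in pairs if human == ai)
    # A false denial is the error we most want to avoid: the AI declines
    # someone a human reviewer would have accepted.
    false_denials = sum(1 for human, ai in pairs
                        if human == "accept" and ai == "decline")
    return {
        "agreement_rate": agree / len(pairs),
        "false_denials": false_denials,
    }

# Example: 9 of 10 decisions match, and no accepted case was wrongly declined.
sample = ([("accept", "accept")] * 6 + [("decline", "decline")] * 3
          + [("decline", "accept")])
print(benchmark(sample))  # {'agreement_rate': 0.9, 'false_denials': 0}
```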

Policy and Regulatory Considerations

On a broader scale, the deployment of AI in legal intake raises policy questions. Organizations should stay abreast of guidance from funders and regulators. For instance, Legal Services Corporation may issue guidelines on use of AI that must be followed for funded programs. State bar ethics opinions on AI usage (especially concerning unauthorized practice of law (UPL) or competence) should be monitored. 

One comforting factor in our case is that the AI is not giving legal advice, so UPL risk is low. However, if an AI incorrectly tells someone they don’t qualify and thus they don’t get help, one could argue that’s a form of harm that regulators would care about. Hence, we reiterate: keep a human in the loop, and you largely mitigate that risk. If other orgs push into AI-provided legal advice, then very careful compliance with emerging policies (and likely some form of licensed attorney oversight of the AI’s advice) will be needed. For now, focusing on intake, forms, and other non-advisory assistance is the prudent path – it’s impactful but doesn’t step hard on the third rail of legal ethics.

Maintain the Human Touch

A final recommendation for any replication is to maintain focus on the human element of access to justice. AI is a tool, not an end in itself. Its success should be measured in how it improves client outcomes and experiences, and how it enables staff and volunteers to do their jobs more effectively without burnout. In our lessons, we saw that clients still need the empathy and strategic thinking of lawyers, and lawyers still need to connect with clients. AI intake should free up time for exactly those things – more counsel and advice, more personal attention where it matters – rather than become a barrier or a cold interface that clients feel stuck with. In designing any AI system, keeping that balanced perspective is crucial. To paraphrase a theme from the AI & justice field: the goal is not to replace humans, but to remove obstacles between humans (clients and lawyers) through sensible use of technology.

Policy and Ethical Considerations

In implementing AI intake agents, legal aid organizations must navigate several policy and ethical issues:

Confidentiality & Data Security

Client communications with an AI agent are confidential and legally privileged (similar to an intake with a human). Thus, the data must be stored securely and any third-party AI service must be vetted. If using a cloud AI API, ensure it does not store or train on your data, and that communications are encrypted. Some orgs may opt for self-hosted models to have full control. Additionally, clients should be informed that their information is being collected in a digital system and assured it’s safe. This transparency aligns with ethical duties of confidentiality.

Transparency with Users

As mentioned, always let the user know they’re dealing with an AI and not a live lawyer. This can be in a welcome message or a footnote on the chat interface. Users have a right to know and to choose an alternative. Also, make it clear that the AI is not giving legal advice, to manage expectations and avoid confusion about the attorney-client relationship. Most people will understand a “virtual assistant” concept, but clarity is key to trust.

Guarding Against Improper Gatekeeping

Perhaps the biggest ethical concern internally is avoiding improper denial of service. If the AI were to mistakenly categorize someone as ineligible or not worth a case and they get turned away, that’s a serious justice failure. To counter this, our approach (and our general recommendation) is to set the AI’s threshold so that it errs toward false positives (flagging someone who may not qualify for human review) rather than false negatives (screening out someone who does qualify). In practice, this means any close call gets escalated to a human. 

Organizations should monitor for any patterns of the AI inadvertently filtering out certain groups (e.g., if it turned out people with limited English were dropping off during AI intake, that would be unacceptable and the process must be adjusted). Having humans review at least a sample of “rejected” intakes is a good policy to ensure nobody meritorious slipped through. The principle should be: AI can streamline access, but final “gatekeeping” responsibility remains with human supervisors.
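
One lightweight way to operationalize that principle is to route every borderline case into a human review queue and audit a random sample of clear declines. A brief sketch – the 25% sampling rate is a placeholder policy choice, not a recommendation:

```python
import random

def build_review_queue(intakes: list[dict], sample_rate: float = 0.25) -> list[dict]:
    """Escalate all close calls; audit a random sample of AI 'declines'."""
    queue = [i for i in intakes if i["ai_decision"] == "needs_human_review"]
    declines = [i for i in intakes if i["ai_decision"] == "decline"]
    audit_count = max(1, int(len(declines) * sample_rate)) if declines else 0
    queue.extend(random.sample(declines, audit_count))
    return queue
```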

Bias and Fairness

AI systems can inadvertently perpetuate biases present in their training data. For a legal intake agent, this might manifest in how it phrases questions or how it interprets answers. For example, if a client writes in a way that the AI (trained on generic internet text) associates with lower credibility, it might respond less helpfully. We must actively guard against such bias. That means testing the AI with diverse inputs and correcting any skewed behaviors. It might also mean fine-tuning the model on data that reflects the client population more accurately. 

Ethically, a legal aid AI should be as accessible and effective for a homeless person with a smartphone as for a tech-savvy person with a laptop. Fairness also extends to disability access – e.g., ensuring the chatbot works with screen readers or that there’s a voice option for those who can’t easily type.

Accuracy and Accountability

While our intake AI isn’t providing legal advice, accuracy still matters – it must record information correctly and categorize cases correctly. Any factual errors (like mistyping a date or mixing up who is landlord vs. tenant in the summary) could have real impacts. Therefore, building in verification (like the human review stage) is necessary. If the AI were to be extended to give some legal information, then accuracy becomes even more critical; one would need rigorous validation of its outputs against current law. 

Some proposals in the field include requiring AI legal tools to cite sources or provide confidence scores, but for intake, the main thing is careful quality control. On the accountability side, the organization using the AI must accept responsibility for its operation – meaning if something goes wrong, it’s on the organization, not some nebulous “computer.” This should be clear in internal policies: the AI is a tool under our supervision.

UPL and Ethical Practice

We touched on unauthorized practice of law concerns. Since our intake agent doesn’t give advice, it should not cross UPL lines. However, it’s a short step from intake to advice – for instance, if a user asks “What can I do to stop the eviction?” the AI has to hold the line and not give advice. Ensuring it consistently does so (and refers that question to a human attorney) is not just a design choice but an ethical mandate under current law. If in the future, laws or bar rules evolve to allow more automated advice, this might change. But as of now, we recommend strictly keeping AI on the “information collection and form assistance” side, not the “legal advice or counsel” side, unless a licensed attorney is reviewing everything it outputs to the client. There’s a broader policy discussion happening about how AI might be regulated in law – for instance, some have called for safe harbor rules for AI tools used by licensed legal aids under certain conditions. Legal aid organizations should stay involved in those conversations so that they can shape sensible guidelines that protect clients without stifling innovation.

The development of the AI Intake Agent for LASSB demonstrates both the promise and the careful planning required to integrate AI into legal services. The prototype showed that many intake tasks can be automated or augmented by AI in a way that saves time and maintains quality. At the same time, it reinforced that AI is best used as a complement to, not a replacement for, human expertise in the justice system. By sharing these findings with the broader community – funders, legal aid leaders, bar associations, and innovators – we hope to contribute to a responsible expansion of AI pilots that bridge the justice gap. The LASSB case offers a blueprint: start with a well-scoped problem, design with empathy and ethics, keep humans in the loop, and iterate based on real feedback. Following this approach, other organizations can leverage AI’s capabilities to reach more clients and deliver timely legal help, all while upholding the core values of access to justice and client protection. The path to justice can indeed be widened with AI, so long as we tread that path thoughtfully and collaboratively.


Demand Letter AI

A prototype report on AI-Powered Drafting of Reasonable Accommodation Demand Letters 

AI for Legal Help, Legal Design Lab, 2025

This report provides a write-up of the AI for Housing Accommodation Demand Letters class project, which was one track of the “AI for Legal Help” Policy Lab during the Autumn 2024 and Winter 2025 quarters. This class involved work with legal and court groups that provide legal help services to the public, to understand where responsible AI innovations might be possible and to design and prototype initial solutions, as well as pilot and evaluation plans.

One of the project tracks was on Demand Letters. An interdisciplinary team of Stanford University students partnered with the Legal Aid Society of San Bernardino (LASSB) to address a critical bottleneck in their service delivery: the time-consuming process of drafting reasonable accommodation demand letters for tenants with disabilities. 

This report details the problem identified by LASSB, the proposed AI-powered solution developed by the student team, and recommendations for future development and implementation. 

We share it in the hopes that legal aid and court help center leadership might also be interested in exploring responsible AI development for demand letters, and that funders, researchers, and technologists might collaborate on developing and testing successful solutions for this task.

Thank you to students in this team: Max Bosel, Adam Golomb, Jay Li, Mitra Solomon, and Julia Stroinska. And a big thank you to our LASSB colleagues: Greg Armstrong, Pablo Ramirez, and more.

The Housing Accommodation Demand Letter Task

The Legal Aid Society of San Bernardino (LASSB) is a nonprofit law firm providing free legal services to low-income residents in San Bernardino County, California. Among their clients are tenants with disabilities who often need reasonable accommodation demand letters to request changes from landlords (for example, allowing a service animal in a “no pets” building). 

These demand letters are formal written requests asserting tenants’ rights under laws like the Americans with Disabilities Act (ADA) and Fair Housing Act (FHA). They are crucial for tenants to secure accommodations and avoid eviction, but drafting them properly is time-consuming and requires legal expertise. LASSB faces overwhelming demand for help in this area – its hotline receives more than 100 calls per day from tenants seeking assistance. 

However, LASSB has only a handful of intake paralegals and housing attorneys available, meaning many callers must wait a long time or never get through. In fact, LASSB serves around 9,000–10,000 clients per year via the hotline, yet an estimated 15,000 additional calls never reach help due to capacity limits. Even for clients who do get assistance, drafting a personalized, legally sound letter can take hours of an attorney’s time. With such limited staffing, LASSB’s attorneys are stretched thin, and some eligible clients may end up without a well-crafted demand letter to assert their rights.

LASSB presented their current workflow and questions about AI opportunities in September 2024, and a team of students in AI for Legal Help formed to partner on this task and explore an AI-powered solution. 

The initial question from LASSB was whether we could leverage recent advances in AI to draft high-quality demand letter templates automatically, thereby relieving some burden on staff and improving capacity to serve clients. The goal was to have an AI system gather information from the client and produce a solid first draft letter that an attorney could then quickly review and approve. By doing so, LASSB hoped to streamline the demand-letter workflow – saving attorney time, reducing errors or inconsistencies, and ensuring more clients receive help. 

Importantly, any AI agent would not replace attorney judgment or final sign-off. Rather, it would act as a virtual assistant or co-pilot: handling the routine drafting labor while LASSB staff maintain complete control over the final output. Key objectives set by the partner included improving efficiency, consistency, and accessibility of the service, while remaining legally compliant and user-friendly. In summary, LASSB needed a way to draft reasonable accommodation letters faster without compromising quality. 

After two quarters of work, the class teams proposed a Demand Letter AI system, creating a prototype AI agent that would interview clients about their situation and automatically generate a draft accommodation request letter. This letter would cite the relevant laws and follow LASSB’s format, ready for an attorney’s review. By adopting such a tool, LASSB hopes to minimize the time attorneys spend on repetitive drafting tasks and free them to focus on providing direct counsel and representation. The remainder of this report details the use case rationale, the current vs. envisioned workflow, the technical prototyping process, evaluation approach, and recommendations for next steps in developing this AI-assisted demand letter system.

Why is the Demand Letter Task a good fit for AI?

Reasonable accommodation demand letters for tenants with disabilities were chosen as the focus use case for several reasons. 

The need is undeniably high: as noted, LASSB receives a tremendous volume of housing-related calls, and many involve disabled tenants facing issues like a landlord refusing an exception to a policy (no-pets rules, parking accommodations, unit modifications, etc.). These letters are often the gateway to justice for such clients – a well-crafted letter can persuade a landlord to comply without the tenant ever needing to file a complaint or lawsuit. Demand letters are a high-impact intervention that can prevent evictions and ensure stable housing for vulnerable tenants. Focusing on this use case meant the project could directly improve outcomes for a large number of people, aligning with LASSB’s mission of “justice without barriers – equitable access for all.” 

At the same time, drafting each letter individually is labor-intensive. Attorneys must gather the details of the tenant’s disability and accommodation request, explain the legal basis (e.g. FHA and California law), and compose a polite but firm letter to the landlord. With LASSB’s staff attorneys handling heavy caseloads, these letters sometimes get delayed or delegated to clients themselves to write (with mixed results). Inconsistent quality and lack of time for thorough review are known issues. This use case presented a clear opportunity for AI to help improve the consistency and quality of the letters themselves. 

The task of writing letters is largely document-generation – a pattern that advanced language models are well-suited for. Demand letters follow a relatively standard structure (explain who you are, state the request, cite laws, etc.), and LASSB already uses templates and boilerplate language for some sections. This means an AI could be trained or prompted to follow that format and fill in the specifics for each client. By leveraging an AI to draft the bulk of the text, each letter could be produced much faster, with the model handling the repetitive phrasing and legal citations while the attorney only needs to make corrections or additions. 

Crucially, using AI here could increase LASSB’s capacity. Rather than an attorney spending, say, 2-3 hours composing a letter from scratch, the AI might generate a solid draft in minutes, requiring perhaps 15 minutes of review and editing. The project team estimated that integrating an AI tool into the workflow could save on the order of 1.5–2.5 hours per client in total staff time. Scaled over dozens of cases, those saved hours mean more clients served and shorter wait times for help. This efficiency gain is attractive to funders and legal aid leaders because it stretches scarce resources further. 

AI can help enforce consistency and accuracy. It would use the same approved legal language across all letters, reducing the chance of human error or omissions in the text. For clients, this translates into a more reliable service – they are more likely to receive a well-written letter regardless of which attorney or volunteer is assisting them. 

The reasonable accommodation letter use case was selected because it sits at the sweet spot of high importance and high potential for automation. It addresses a pressing need for LASSB’s clients (ensuring disabled tenants can assert their rights) and plays to AI’s strengths (generating structured documents from templates and data). By starting with this use case, the project aimed to deliver a tangible, impactful tool that could quickly demonstrate value – a prototype AI assistant that materially improves the legal aid workflow for a critical class of cases.


Workflow Vision:

From Current Demand Letter Process to Future AI-Human Collaboration

To understand the impact of the proposed solution, it’s important to compare the current human-driven workflow of creating Demand Letters and the envisioned future workflow where an AI assistant is integrated. Below, we outline the step-by-step process today and how it would change with the AI prototype in place. 

Current Demand Letter Workflow (Status Quo)

When a tenant with a disability encounters an issue with their landlord (for example, the landlord is refusing an accommodation or threatening eviction over a disability-related issue), the tenant must navigate several steps to get a demand letter:

  • Initial Intake Call: The tenant contacts LASSB’s hotline and speaks to an intake call-taker (often a paralegal). The tenant explains their situation and disability, and the intake worker records basic information and performs eligibility screening (checking income, conflict of interest, etc.). If the caller is eligible and the issue is within LASSB’s scope, the case is referred to a housing attorney for follow-up.
  • Attorney Consultation: The tenant then has to repeat their story to a housing attorney (often days later). The attorney conducts a more in-depth interview about the tenant’s disability needs and the accommodation they seek. At this stage, the attorney determines if a reasonable accommodation letter is the appropriate course of action. (If not – for example, if the problem requires a different remedy – the attorney would advise on next steps outside the demand letter process.)
  • Letter Drafting: If a demand letter is warranted, the process for drafting it is currently inconsistent. In some cases, the attorney provides the client with a template or “self-help” packet on how to write a demand letter and asks the client to draft it themselves. In other cases, the attorney or a paralegal might draft the letter on the client’s behalf. With limited time, attorneys often cannot draft every letter from scratch, so the level of assistance varies. Clients may end up writing the first draft on their own, which can lead to incomplete or less effective letters. (One LASSB attorney noted that tenants frequently have to “explain their story at least twice” – to the intake worker and attorney – “and then have to draft/send the demand letter with varying levels of help”.)
  • Review and Delivery: Ideally, if the client drafts the letter, they will bring it back for the attorney to review and approve. Due to time pressures, however, attorney review isn’t always thorough, and sometimes letters go out without a detailed legal polish. Finally, the tenant sends the demand letter to the landlord, either by mail or email (or occasionally LASSB sends it on the client’s behalf). At this point, the process relies on the landlord’s response; LASSB’s involvement usually ends unless further action (like litigation) is needed.

This current workflow places a heavy burden on the tenant and the attorney. The tenant must navigate multiple conversations and may end up essentially drafting their own legal letter. The attorney must spend time either coaching the client through writing or drafting the letter themselves, on top of all their other cases. Important information can slip through the cracks when the client is interviewed multiple times by different people. There is also no consistent tracking of what advice or templates were given to the client, leading to variability in outcomes. Overall, the process can be slow (each step often spreads over days or weeks of delay) and resource-intensive, contributing to the bottleneck in serving clients.


Proposed AI-Assisted Workflow (Future Vision)

In the reimagined process, an AI agent would streamline the stages between intake and letter delivery, working in tandem with LASSB staff.

After a human intake screens the client, the AI Demand Letter Assistant takes over the interview to gather facts and draft the letter. The attorney then reviews the draft and finalizes the letter for the client to send.

  • Post-Intake AI Interview: Once a client has been screened and accepted for services by LASSB’s intake staff, the AI Demand Letter Assistant engages the client in a conversation (via chat or a guided web form; a phone interface could also be possible). The AI introduces itself as a virtual assistant working with LASSB and uses a structured but conversational script to collect all information relevant to the accommodation request. This includes the client’s basic details, details of the disability and needed accommodation, the landlord’s information, and any prior communications or incidents (e.g. if the tenant has asked before or if the landlord has issued notices). The assistant is programmed to use trauma-informed language – it asks questions in a supportive, non-threatening manner and adjusts wording to the client’s comfort, recognizing that relaying one’s disability needs can be sensitive. Throughout the interview, the AI can also perform helpful utilities, such as inserting the current date or formatting addresses correctly, to ensure the data it gathers is ready for a letter.
  • Automatic Letter Generation: After the AI has gathered all the necessary facts from the client, it automatically generates a draft demand letter. The generation is based on LASSB-approved templates and includes the proper formal letter format (date, addresses, RE: line, etc.), a clear statement of the accommodation request, and citations to relevant laws/regulations (like referencing the FHA, ADA, or state law provisions that apply). The AI uses the information provided by the client to fill in key details – for example, describing the tenant’s situation (“Jane Doe, who has an anxiety disorder, requests an exception to the no-pets policy to allow her service dog”) and customizing the legal rationale to that scenario. Because the AI has been trained on example letters and legal guidelines, it can include the correct legal language to strengthen the demand. It also ensures the tone remains polite and professional. At the end of this step, the AI has a complete draft letter ready.
  • Attorney Review & Collaboration: The draft letter, along with a summary of the client’s input or a transcript of the Q&A, is then forwarded to a LASSB housing attorney for review. The attorney remains the ultimate decision-maker – they will read the AI-drafted letter and check it for accuracy, appropriate tone, and effectiveness. If needed, the attorney can edit the letter (either directly or by giving feedback to the AI to regenerate specific sections). The AI could also highlight any uncertainties (for instance, if the client’s explanation was unclear on a point, the draft might flag that for attorney clarification). Importantly, no letter is sent out without attorney approval, ensuring that professional legal judgment is applied. This human-in-the-loop review addresses ethical duties (attorneys must supervise AI work as they would a junior staffer) and maintains quality control. In essence, the AI does the first 90% of the drafting, and the attorney provides the final 10% refinement and sign-off.
  • Delivery and Follow-Up: After the attorney finalizes the content, the letter is ready to be delivered to the landlord. In the future vision, this could be as simple as clicking a button to send the letter via email or printing it for mailing. (The prototype also floated ideas like integrating with DocuSign or generating a PDF that the client can download and sign.) The client then sends the demand letter to the landlord, formally requesting the accommodation. Ideally, this happens much faster than in the current process – potentially the same day as the attorney consultation, since the drafting is near-instant. LASSB envisioned that the AI might even assist in follow-up: for instance, checking back with the client a couple weeks later to ask if the landlord responded, and if not, suggesting next steps. (This follow-up feature was discussed conceptually, though not implemented in the prototype.) In any case, by the end of the workflow, the client has a professionally crafted letter in hand, and they did not have to write it alone.

The benefits of this AI-human collaboration are significant. It eliminates the awkward gap where a client might be left drafting a letter on their own; instead, the client is guided through questions by the AI and sees a letter magically produced from their answers. It also reduces duplicate interviewing – the client tells their full story once to the AI (after intake), rather than explaining it to multiple people in pieces. 

For the attorney, the time required to produce a letter drops dramatically. Rather than spending a couple of hours writing and editing, an attorney might spend 10–20 minutes reviewing the AI’s draft, tweaking a phrase or two, and approving it. The team’s estimates suggest each case could save on the order of 1.5–2.5 hours of staff time under this new workflow. Those savings translate into lower wait times and the ability for LASSB to assist many more clients in a given period with the same staff. In broader terms, more tenants would receive the help they need, fewer calls would be abandoned, and LASSB’s attorneys could devote more attention to complex cases (since straightforward letters are handled in part by the AI). 

The intended impact is “more LASSB clients have their day in court… more fair and equitable access to justice for all”, as the student team put it – in this context meaning more clients are able to assert their rights through demand letters, addressing issues before they escalate. The future vision sees the AI prototype seamlessly embedded into LASSB’s service delivery: after a client is screened by a human, the AI takes on the heavy lifting of information gathering and document drafting, and the human attorney ensures the final product meets the high standards of legal practice. This collaboration could save time, improve consistency, and ultimately empower more tenants with disabilities to get the accommodations they need to live safely and with dignity.


Technical Approach and Prototyping: What We Built and How It Works

With the use case defined, the project team proceeded to design and build a working prototype AI agent for demand letter drafting. This involved an iterative process of technical development, testing, and refinement over two academic quarters. In this section, we describe the technical solution – including early prototypes, the final architecture, and how the system functions under the hood.

Early Prototype and Pivot 

In Autumn 2024, the team’s initial prototype focused on an AI intake interviewing agent (nicknamed “iNtake”) as well as a rudimentary letter generator. They experimented with a voice-based assistant that could talk to clients over the phone. Using tools like Twilio (for telephony and text messaging) and Google’s Dialogflow/Chatbot interfaces, they set up a system where a client could call a number and interact with an AI-driven phone menu. The AI would ask the intake questions in a predefined script and record the answers. 

Behind the scenes, the prototype leveraged a large language model (LLM) – essentially an AI text-generation engine – to handle the conversational aspect. The team used Google’s Gemini 1.5 Flash model (the “gemini-1.5-flash” configuration), which was integrated into the phone chatbot. 

This early system demonstrated some capabilities (it could hold a conversation and hand off to a human if needed), but also revealed significant challenges. The script was over 100 questions long and not trauma-informed – users found it tedious and perhaps impersonal. Additionally, the AI sometimes struggled with the decision-tree logic of intake. 

After several iterations and feedback from instructors and LASSB, the team decided to pivot. They narrowed the scope to concentrate on the Demand Letter Agent – a chatbot that would come after intake to draft the letter. The phone-based intake AI became a separate effort (handled by another team in Winter 2025), while our team focused on the letter generator. 

Final Prototype Design

The Winter 2025 team built upon the fall work to create a functioning AI chat assistant for demand letters. The prototype operates as an interactive chatbot that can be used via a web interface (in testing, it was run on a laptop, but it could be integrated into LASSB’s website or a messaging platform in the future). Here’s how it works in technical terms.

The AI agent was developed using a generative Large Language Model (LLM) – similar to the technology behind GPT-4 or other modern conversational AIs. This model was not trained from scratch by the team (which would require huge data and compute); instead, the team used a pre-existing model and focused on customizing it through prompt engineering and providing domain-specific data. In practical terms, the team created a structured “AI playbook” or prompt script that guides the model step-by-step to perform the task.

Data and Knowledge Integration

One of the first steps was gathering all relevant reference material to inform the AI’s outputs. The team collected LASSB’s historical demand letters (redacted for privacy), which provided examples of well-written accommodation letters. They also pulled in legal sources and guidelines: for instance, the U.S. Department of Justice’s guidance memos on reasonable accommodations, HUD guidelines, trauma-informed interviewing guidelines, and lists of common accommodations and impairments. These documents were used to refine the AI’s knowledge. 

Rather than blindly trusting the base model, the team explicitly incorporated key legal facts – such as definitions of “reasonable accommodation” and the exact language of FHA/FEHA requirements – into the AI’s prompt or as reference text the AI could draw upon. Essentially, the AI was primed with: “Here are the laws and an example demand letter; now follow this format when drafting a new letter.” This helped ensure the output letters would be legally accurate and on-point.
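
As an illustration of that priming step (not the team’s actual code), the reference material can simply be concatenated into the model’s context alongside the drafting instructions. The file names below are hypothetical stand-ins for the corpus described above:

```python
from pathlib import Path

# Hypothetical reference files; the team's actual corpus included redacted
# LASSB letters, DOJ/HUD reasonable-accommodation guidance, and FHA/FEHA text.
REFERENCE_FILES = [
    "references/fha_feha_excerpts.txt",
    "references/hud_doj_guidance_summary.txt",
    "references/example_demand_letter_redacted.txt",
]

def build_grounding_context() -> str:
    """Concatenate approved reference material so the model drafts from
    known-good sources instead of relying only on its training data."""
    sections = []
    for path in REFERENCE_FILES:
        sections.append(f"--- {path} ---\n{Path(path).read_text(encoding='utf-8')}")
    return ("Use only the following reference material when citing law or "
            "mirroring letter structure:\n\n" + "\n\n".join(sections))
```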

Prompt Engineering

The heart of the prototype is a carefully designed prompt/instruction set given to the AI model. The team gave the AI a persona and explicit instructions on how to conduct the conversation and draft the letter. For example, the assistant introduces itself as “Sofia, the Legal Aid Society of San Bernardino’s Virtual Assistant” and explains its role to the client (to help draft a letter). The prompt includes step-by-step instructions for the interview: ask the client’s name, ask what accommodation they need, confirm details, etc., in a logical order (it’s almost like a decision-tree written in natural language form). A snippet of the prompt (from the “Generative AI playbook”) is shown below:

Excerpt from the AI assistant’s instruction script. The agent is given a line-by-line guide to greet the client, collect information (names, addresses, disability details, etc.), and even call a date-time tool to insert the current date for the letter. 

The prompt also explicitly instructs the AI on legal and ethical boundaries. For instance, it was told: “Your goal is to write and generate a demand letter for reasonable accommodations… You do not provide legal advice; you only assist with drafting the letter.” This was crucial to prevent the AI from straying into giving advice or making legal determinations, which must remain the attorney’s domain. By iteratively testing and refining this prompt, the team taught the AI to stay in its lane: ask relevant questions, be polite and empathetic, and focus on producing the letter.
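
To give a flavor of what such a playbook looks like in practice, below is a simplified, paraphrased illustration of a system instruction covering the persona, question order, and boundaries – it is not the team’s verbatim prompt:

```python
SYSTEM_INSTRUCTION = """
You are Sofia, the Legal Aid Society of San Bernardino's virtual assistant.
Your goal is to help draft a reasonable accommodation demand letter.
You do not provide legal advice; you only assist with drafting the letter.
An attorney will review everything before it is sent.

Conduct the interview one question at a time, in this order:
1. The client's full name and mailing address.
2. The landlord's or property manager's name and address.
3. The accommodation being requested, in the client's own words.
4. How the accommodation relates to the client's disability (do not press
   for medical details beyond what the client volunteers).
5. Any prior requests, responses, or notices from the landlord.

Use warm, plain, trauma-informed language, and thank the client for sharing.
If the client asks what they should do legally, explain that an attorney
will advise them, and continue the interview.
When all items are collected, call the date tool for today's date and
present the complete draft letter for the client to review.
"""
```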

Trauma-Informed and Bias-Mitigation Features

A major design consideration was ensuring the AI’s tone and behavior were appropriate for vulnerable clients. The team trained the AI (through examples and instructions) to use empathetic language – e.g., thanking the client for sharing information, acknowledging difficulties – and to avoid any phrasing that might come off as judgmental or overly clinical. The AI was also instructed to use the client’s own words when possible and not to press sensitive details unnecessarily. On the technical side, the model was tested for biases. The team used diverse example scenarios to ensure the AI’s responses wouldn’t differ inappropriately based on the nature of the disability or other client attributes. Regular audits of outputs were done to catch any bias. For example, they made sure the AI did not default to male pronouns for landlords or assume anything stereotypical about a client’s condition. These measures align with best practices to ensure the AI’s output is fair and respects all users.

Automated Tools Integration

The prototype included some clever integrations of simple tools to enhance accuracy. One such tool was a date function. In early tests, the AI sometimes forgot to put the current date on the letter or used a generic placeholder. To fix this, the team connected the AI to a utility that fetches the current date. During the conversation, if the user is ready to draft the letter, the AI will call this date function and insert the actual current date into the letter heading. This ensures the generated letter always shows the actual date of the conversation rather than a hardcoded or placeholder date. Similarly, the AI was guided to properly format addresses and other elements (it asks for each component like city, state, ZIP and then concatenates them in the letter format). These might seem like small details, but they significantly improve the professionalism of the output.
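
The prototype relied on the AI platform’s tool-calling features for this; the platform-agnostic sketch below shows the same idea in simplified form, with the application executing a date “tool” whenever the model requests it. The request/response format here is invented for illustration.

```python
from datetime import date

def get_current_date() -> str:
    """Tool the assistant can call so the letter header never carries a
    stale or invented date."""
    return date.today().strftime("%B %d, %Y")  # e.g., "March 14, 2025"

TOOLS = {"get_current_date": get_current_date}

def handle_model_turn(model_output: dict) -> str:
    """Simplified dispatch loop: if the model asks for a tool, run it and
    return the result; otherwise pass the model's text straight through."""
    if model_output.get("tool_request") in TOOLS:
        return TOOLS[model_output["tool_request"]]()
    return model_output.get("text", "")

# Example: the model signals it wants today's date before drafting the letter.
print(handle_model_turn({"tool_request": "get_current_date"}))
```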

Draft Letter Generation

Once the AI has all the needed info, it composes the letter in real-time. It follows the structure from the prompt and templates: the letter opens with the date and address, a reference line (“RE: Request for Reasonable Accommodation”), a greeting, and an introduction of the client. Then it lays out the request and the justification, citing the laws, and closes with a polite sign-off. The content of the letter is directly based on the client’s answers. For instance, if the client said they have an anxiety disorder and a service dog, the letter will include those details and explain why the dog is needed. The AI’s legal knowledge ensures that it inserts the correct references to the FHA and California Fair Employment and Housing Act (FEHA), explaining that landlords must provide reasonable accommodations unless it’s an undue burden. 
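
In the prototype the model composes the prose itself rather than filling a rigid template, but the structure it is instructed to follow can be pictured roughly as the skeleton below (the field names are illustrative, not LASSB’s actual template):

```python
LETTER_SKELETON = """{today}

{landlord_name}
{landlord_address}

RE: Request for Reasonable Accommodation

Dear {landlord_salutation},

I am writing on behalf of {tenant_name}, a tenant at {tenant_address}, to
request the following reasonable accommodation: {accommodation}.

{justification}

Under the Fair Housing Act and the California Fair Employment and Housing
Act, housing providers must grant reasonable accommodations for persons
with disabilities unless doing so would impose an undue burden. Please
respond in writing within {response_days} days.

Thank you for your attention to this matter.

Sincerely,
{tenant_name}
"""

def render_letter(fields: dict) -> str:
    """Fill the skeleton with the facts gathered during the AI interview."""
    return LETTER_SKELETON.format(**fields)
```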

An example output is shown below:

Sample excerpt from an AI-generated reasonable accommodation letter. In this case, the tenant (Jane Doe) is requesting an exception to a “no pets” policy to allow her service dog. The AI’s draft includes the relevant law citations (FHA and FEHA) and a clear explanation of why the accommodation is necessary. 

As seen in the example above, the AI’s letter closely resembles one an attorney might write. It addresses the landlord respectfully (“Dear Mr. Jones”), states the tenant’s name and address, and the accommodation requested (permission to keep a service animal despite a no-pet policy). It then cites the Fair Housing Act and California law, explaining that these laws require exceptions to no-pet rules as a reasonable accommodation for persons with disabilities. It describes the tenant’s specific circumstances (the service dog helps manage her anxiety, etc.) in a factual and supportive tone. It concludes with a request for a response within a timeframe and a polite thank you. All of this text was generated by the AI based on patterns it learned from training data and the prompt instructions – the team did not manually write any of these sentences for this particular letter, showing the generative power of the AI. The attorney’s role would then be to review this draft. 

In our tests, attorneys found the drafts to be surprisingly comprehensive. They might only need to tweak a phrase or add a specific detail. For example, an attorney might insert a line offering to provide medical documentation if needed, or adjust the deadline given to the landlord. But overall, the AI-generated letters were on point and required only light editing. 

Testing and Iteration

The development of the prototype involved iterative testing and debugging. Early on, the team encountered some issues typical of advanced AI systems and worked to address them.

Getting the agent to perform consistently

Initially, the AI misunderstood its task at times. In the first demos, when asked to draft a letter, the AI would occasionally respond with “I’m sorry, I can’t write a letter for you”, treating it like a prohibited action. This happened because base language models often have safety rules about not producing legal documents. The team resolved this by refining the prompt to clarify that the AI is allowed and expected to draft the letter as part of its role (since an attorney will review it). Once the AI “understood” it had permission to assist, it complied.

Ensuring the agent produced the right output

The AI also sometimes ended the interview without producing the letter. Test runs showed that if the user didn’t explicitly ask for the letter, the AI might stop after gathering info. To fix this, the team adjusted the instructions to explicitly tell the AI that once it has all the information, it should automatically present the draft letter to the client for review. After adding this, the AI reliably output the draft at the end of the conversation.

Un-sticking the agent, caught in a loop

There were issues with the AI getting stuck or repeating itself. For example, in one scenario, the AI began to loop, apologizing and asking the same question multiple times even after the user answered. 

A screenshot from testing shows the AI repeating “Sorry, something went wrong, can you repeat?” in a loop when it hit an unexpected input. These glitches were tricky to debug – the team adjusted the conversation flow and added checks (like if the user already answered, do not ask again), which reduced but did not completely eliminate such looping. We identified that these loops often stemmed from the model’s uncertainty or minor differences in phrasing that weren’t accounted for in the script.

Dealing with fake or inaccurate info

Another issue was occasional hallucinations or extraneous content. For instance, the AI at one point started offering to “email the letter to the landlord” out of nowhere, even though that wasn’t in its instructions (and it had no email capability). This was the model improvising beyond its intended scope. The team addressed this by tightening the prompt instructions, explicitly telling the AI not to do anything with email and to stick to generating the letter text only. After adding such constraints, these hallucinations became rarer.

Getting consistent letter formatting

The formatting of the letter (dates, addresses, signature line) needed fine-tuning. The AI initially had minor formatting quirks (like sometimes missing the landlord’s address or not knowing how to sign off). By providing a template example and explicitly instructing the inclusion of those elements, the final prototype reliably produced a correctly formatted letter with a placeholder for the client’s signature.

Throughout development, whenever an issue was discovered, the team would update the prompt or the data and test again. This iterative loop – test, observe output, refine instructions – is a hallmark of developing AI solutions and was very much present in this project. 

Over time, the outputs improved significantly in quality and reliability. For example, by the end of the Winter quarter, the AI was consistently using the correct current date (thanks to the date tool integration) and writing in a supportive tone (thanks to the trauma-informed training), which were clear improvements from earlier versions. That said, some challenges remained unsolved due to time limits. 

The AI still showed some inconsistent behaviors occasionally – such as repeating a question in a rare case, or failing to recognize an atypical user response (like if a user gave an extremely long-winded answer that confused the model). The team documented these lingering issues so that future developers can target them. They suspected that further fine-tuning of the model or using a more advanced model could help mitigate these quirks. 

In its final state at the end of Winter 2025, the prototype was able to conduct a full simulated interview and generate a reasonable accommodation demand letter that LASSB attorneys felt was about 80–90% ready to send, requiring only minor edits. 

The technical architecture was a single-page web application interfacing with the AI model (running on a cloud AI platform) plus some back-end scripts for the date tool and data storage. It was not yet integrated into LASSB’s production systems, but it provided a compelling proof-of-concept. 

Observers in the final presentation could watch “Sofia” chat with a hypothetical client (e.g., Martin who needed an emotional support animal) and within minutes, produce a letter addressed to the landlord citing the FHA – something that would normally take an attorney a couple of hours. 

Overall, the technical journey of this project was one of rapid prototyping and user-centered adjustment. The team combined off-the-shelf AI technology with domain-specific knowledge to craft a tool tailored for legal aid. They learned how small changes in instructions can greatly affect an AI’s behavior, and they progressively molded the system to align with LASSB’s needs and values. The result is a working prototype of an AI legal assistant that shows real promise in easing the burden of document drafting in a legal aid context.

Evaluation Framework: Testing, Quality Standards, and Lessons Learned

From the outset, the team and LASSB agreed that rigorous evaluation would be critical before any AI tool could be deployed in practice. The project developed an evaluation framework to measure the prototype’s performance and ensure it met both efficiency goals and legal quality standards. Additionally, throughout development the team reflected on broader lessons learned about using AI in a legal aid environment. This section discusses the evaluation criteria, testing methods, and key insights gained.

Quality Standards and Benchmarks

The primary measure of success for the AI-generated letters was that they be indistinguishable (in quality) from letters written by a competent housing attorney. To that end, the team established several concrete quality benchmarks:

  • No “Hallucinations”: The AI draft should contain no fabricated facts, case law, or false statements. All information in the letter must come from the client’s provided data or be generally accepted legal knowledge. For example, the AI should never cite a law that doesn’t exist or insert details about the tenant’s situation that the tenant didn’t actually tell it. Attorneys reviewing the letters specifically check for any such hallucinated content.
  • Legal Accuracy: Any legal assertions in the letter (e.g. quoting the Fair Housing Act’s requirements) must be precisely correct. The letter should not misstate the law or the landlord’s obligations. Including direct quotes or citations from statutes/regulations was one method used to ensure accuracy. LASSB attorneys would verify that the AI correctly references ADA, FHA, FEHA, or other laws as applicable.
  • Proper Structure and Tone: The format of the letter should match what LASSB attorneys expect in a formal demand letter. That means: the letter has a date, addresses for both parties, a clear subject line, an introduction, body paragraphs that state the request and legal basis, and a courteous closing. The tone should be professional – firm but not aggressive, and certainly not rude. One benchmark was that an AI-drafted letter “reads like” an attorney’s letter in terms of formality and clarity. If an attorney would normally include or avoid certain phrases (for instance, saying “Thank you for your attention to this matter” at the end, or avoiding contractions in a formal letter), the AI’s output is expected to do the same.
  • Completeness: The letter should cover all key points necessary to advocate for the client. This includes specifying the accommodation being requested, briefly describing the disability connection, citing the legal right to the accommodation, and possibly mentioning an attached verification if relevant. An incomplete letter (one that, say, only requests but doesn’t cite any law) would not meet the standard. Attorneys reviewing would ensure nothing crucial was missing from the draft.
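
These benchmarks lend themselves to a simple structured review form that attorneys complete for each draft, so quality can be tracked over time. A sketch of how such a rubric might be recorded – the field names are our own framing of the benchmarks above, not an existing LASSB form:

```python
import csv
import os
from dataclasses import dataclass, asdict

@dataclass
class LetterReview:
    case_id: str
    hallucinated_content: bool       # any fabricated facts or citations?
    legal_statements_accurate: bool  # FHA/FEHA/ADA references correct?
    structure_and_tone_ok: bool      # reads like an attorney's letter?
    complete: bool                   # request, disability link, law, closing?
    minutes_to_review: int
    notes: str = ""

def log_review(review: LetterReview, path: str = "letter_reviews.csv") -> None:
    """Append one attorney review so quality trends can be analyzed later."""
    row = asdict(review)
    new_file = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=row.keys())
        if new_file:
            writer.writeheader()
        writer.writerow(row)
```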

In addition to letter quality, efficiency metrics were part of the evaluation. The team intended to log how long the AI-agent conversation took and how long the model took to generate the letter, aiming to show a reduction in total turnaround time compared to the status quo. Another metric was the effect on LASSB’s capacity: for example, could implementing this tool reduce the number of calls that drop off due to long waits? In theory, if attorneys spend less time per client, more calls can be returned. The team proposed tracking number of clients served before and after deploying the AI as a long-term metric of success. 

Evaluation Methods

To assess these criteria, the evaluation plan included several components.

Internal Performance Testing

The team performed timed trials of the AI system. They measured the duration of a full simulated interview and letter draft generation. In later versions, the interview took roughly 10–15 minutes (depending on how much detail the client gives), and the letter was generated almost instantly thereafter (within a few seconds). They compared this to an estimate of human drafting time. These trials demonstrated the raw efficiency gain – a consistent turnaround of under 20 minutes for a draft letter, which is far better than the days or weeks it might take in the normal process. They also tracked if any technical slowdowns occurred (for instance, if the AI had to call external tools like the date function, did that introduce delays? It did not measurably – the date lookup was near-instant).

Expert Review (Quality Control)

LASSB attorneys and subject matter experts were involved in reviewing the AI-generated letters. The team conducted sessions where an attorney would read an AI draft and score it on accuracy, tone, and completeness. The feedback from these reviews was generally positive – attorneys found the drafts surprisingly thorough. They did note small issues (e.g., “we wouldn’t normally use this phrasing” or “the letter should also mention that the client can provide a doctor’s note if needed”). 

These observations were fed back into improving the prompt. The expert review process is something that would continue regularly if the tool is deployed: LASSB could institute, say, a policy that attorneys must double-check every AI-drafted letter and log any errors or required changes. Over time, this can be used to measure whether the AI’s quality is improving (i.e., fewer edits needed).

User Feedback

Another angle was evaluating the system’s usability and acceptance by both LASSB staff and clients. The team gathered informal feedback from users who tried the chatbot demo (including a couple of law students role-playing as clients). They also got input from LASSB’s intake staff on whether they felt such a chatbot would be helpful. In a deployed scenario, the plan is to collect structured feedback via surveys. For example, clients could be asked if they found the virtual interview process easy to understand, and attorneys could be surveyed on their satisfaction with the draft letters. High satisfaction ratings would indicate the system is meeting needs, whereas any patterns of confusion or dissatisfaction would signal where to improve (perhaps the interface or the language the AI uses).

Long-term Monitoring

The evaluation framework emphasizes that evaluation isn’t a one-time event. The team recommended continuous monitoring if the prototype moves to production. This would involve regular check-ins (monthly or quarterly meetings) among stakeholders – the legal aid attorneys, paralegals, technical team, etc. – to review how things are going. They could review statistics (number of letters generated, average time saved) and any incidents (e.g., “the AI produced an incorrect statement in a letter on March 3, we caught it in review”). This ongoing evaluation ensures that any emerging issues (perhaps a new type of accommodation request the AI wasn’t trained on) are caught and addressed. It’s akin to maintenance: the AI tool would be continually refined based on real-world use data to ensure it remains effective and trustworthy.

Risk and Ethical Considerations

Part of the evaluation also involved analyzing potential risks. The team did a thorough risk, ethics, and regulation analysis in their final report to make sure any deployment of the AI would adhere to legal and professional standards. Some key points from that analysis:

Data Privacy & Security

The AI will be handling sensitive client information (details about disabilities, etc.). The team stressed the need for strict privacy safeguards – for instance, if using cloud AI services, ensuring they are HIPAA-compliant or covered by appropriate data agreements. They proposed measures like encryption of stored transcripts and obtaining client consent for using an AI tool. Any integration with LASSB’s case management (LegalServer) would have to follow data protection policies.
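
To make the encryption point concrete, here is a minimal sketch of encrypting a stored transcript at rest using the widely used Python cryptography package (Fernet symmetric encryption). Key management (where the key lives, who can use it, how it is rotated) is deliberately left out and would need to follow LASSB's own security policies.

    from cryptography.fernet import Fernet

    # In production the key would come from a secrets manager with access
    # controls and rotation; generating it inline is only for illustration.
    key = Fernet.generate_key()
    fernet = Fernet(key)

    transcript = "Client: I am requesting an exception to the no-pet policy for my service dog..."
    encrypted = fernet.encrypt(transcript.encode("utf-8"))   # store these bytes at rest

    # Later, an authorized process can recover the plaintext for attorney review.
    decrypted = fernet.decrypt(encrypted).decode("utf-8")
    assert decrypted == transcript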

Bias and Fairness

They cautioned that AI models can inadvertently produce biased outputs if not properly checked. For example, might the AI’s phrasing be less accommodating to a client with a certain type of disability due to training data bias? The mitigation is ongoing bias testing and using a diverse dataset for development. The project incorporated an ethical oversight process to regularly audit letters for any bias or inappropriate language.

Acceptance by Courts/Opposing Parties

A unique consideration for legal documents is whether an AI-drafted letter (or brief) will be treated differently by its recipient. The team noted recent cases of courts being skeptical of lawyers’ use of ChatGPT, emphasizing lawyers’ duty to verify AI outputs. For demand letters (which are not filed in court but sent to landlords), the risk is lower than in litigation, but still LASSB must ensure the letters are accurate to maintain credibility. If a case did go to court, an attorney might need to attest that they supervised the drafting. Essentially, maintaining transparency and trust is important – LASSB might choose to inform clients about the AI-assisted system (to manage expectations) and would certainly ensure any letter that ends up as evidence has been vetted by an attorney.

Professional Responsibility

The team aligned the project with guidance from the American Bar Association and California State Bar on AI in law practice. These guidelines say that using AI is permissible as long as attorneys ensure competence, confidentiality, and no unreasonable fees are charged for it. In practice, that means LASSB attorneys must be trained on how to use the AI tool correctly, must keep client data safe, and must review the AI’s work. The attorney remains ultimately responsible for the content of the letter. The project’s design – always having a human in the loop – was very much informed by these professional standards.

Lessons Learned

Over the course of the project, the team gained valuable insights, both in terms of the technology and the human element of implementing AI in legal services. Some of the key lessons include the following.

AI is an Augmenting Tool, Not a Replacement for Human Expertise

Perhaps the most important realization was that AI cannot replace human empathy or judgment in legal aid. The team initially hoped the AI might handle more of the process autonomously, but they learned that the human touch is irreplaceable for sensitive client interactions. For example, the AI can draft a letter, but it cannot (and should not) decide whether a client should get a letter or what strategic advice to give – that remains with the attorney. Moreover, clients often need empathy and reassurance that an AI cannot provide on its own. As one reflection noted, the AI might be very efficient, “however, we learned that AI cannot replace human empathy, which is why the final draft letter always goes to an attorney for final review and client-centered adjustment.” In practice, the AI assists, and the attorney still personalizes the counsel.

Importance of Partner Collaboration and User-Centered Design

The close collaboration with LASSB staff was crucial. Early on, the team had some misaligned assumptions (e.g., focusing on a technical solution that wasn’t actually practical in LASSB’s context, like the phone intake bot). By frequently communicating with the partner – including weekly check-ins and showing prototype demos – the team was able to pivot and refine the solution to fit what LASSB would actually use. One lesson was to always “keep the end user in mind”. In this case, the end users were both the LASSB attorneys and the clients. Every design decision (from the tone of the chatbot to the format of the output) was run through the filter of “Is this going to work for the people who have to use it?” For instance, the move from a phone interface to a chat interface was influenced by partner feedback that a phone bot might be less practical, whereas a web-based chat that produces a printable letter fits more naturally into their workflow.

Prototype Iteratively and Be Willing to Pivot

The project reinforced the value of an iterative, agile approach. The team did not stick stubbornly to the initial plan when it proved flawed. They gathered data (user feedback, technical performance data) and made a mid-course correction to narrow the project’s scope. This pivot ultimately led to a more successful outcome. The lesson for future projects is to embrace flexibility – it’s better to achieve a smaller goal that truly works than to chase a grand vision that doesn’t materialize. As noted in the team’s retrospective, “Be willing to pivot and challenge assumptions” was key to their progress.

AI Development Requires Cross-Disciplinary Skills

The students came from law and engineering backgrounds, and both skill sets were needed. They had to “upskill to learn what you need” on the fly – for example, law students learned some prompt-engineering and coding; engineering students learned about fair housing law and legal ethics. For legal aid organizations, this is a lesson that implementing AI will likely require new trainings and collaboration between attorneys and tech experts.

AI Output Continues to Improve with Feedback

Another positive lesson was that the AI’s performance did improve significantly with targeted adjustments. Initially, some doubted whether a model could ever draft a decent legal letter. But by the end, the results were quite compelling. This taught the team that small tweaks can yield big gains in AI behavior – you just have to systematically identify what isn’t working (e.g., the AI refusing to write, or using the wrong tone) and address it. It’s an ongoing process of refinement, which doesn’t end when the class ends. The team recognized that deploying an AI tool means committing to monitor and improve it continuously. As they put it, “there is always more that can be done to improve the models – make them more informed, reliable, thorough, ethical, etc.”. This mindset of continuous improvement is itself a key lesson, ensuring that complacency doesn’t set in just because the prototype works in a demo.

Ethical Guardrails Are Essential and Feasible

Initially, there was concern about whether an AI could be used ethically for legal drafting. The project showed that with the right guardrails – human oversight, clear ethical policies, transparency – it is not only possible but can be aligned with professional standards. The lesson is that legal aid organizations can innovate with AI responsibly, as long as they proactively address issues of confidentiality, accuracy, and attorney accountability. LASSB leadership was very interested in the tool but also understandably cautious; seeing the ethical framework helped build their confidence that this could be done in a way that enhances service quality rather than risks it.

In conclusion, the evaluation phase of the project confirmed that the AI prototype can meet high quality standards (with attorney oversight) and significantly improve efficiency. It also surfaced areas to watch – for example, ensuring the AI remains updated and bias-free – which will require ongoing evaluation post-deployment. The lessons learned provide a roadmap for both this project and similar initiatives: keep the technology user-centered, maintain rigorous quality checks, and remember that AI is best used to augment human experts, not replace them. By adhering to these principles, LASSB and other legal aid groups can harness AI’s benefits while upholding their duty to clients and justice.

Next Steps

Future Development, Open Questions, and Recommendations

The successful prototyping of the AI demand letter assistant is just the beginning. Moving forward, there are several steps to be taken before this tool can be fully implemented in production at LASSB. The project team compiled a set of recommendations and priorities for future development, as well as open questions that need to be addressed. Below is an outline of the next steps:

Expand and Refine the Training Data

To improve the AI’s consistency and reliability, the next development team should incorporate additional data sources into the model’s knowledge base. During Winter 2025, the team gathered a trove of relevant documents (DOJ guidance, HUD memos, sample letters, etc.), but not all of this material was fully integrated into the prototype’s prompts.

Organizing and inputting this data will help the AI handle a wider range of scenarios. For example, there may be types of reasonable accommodations (like a request for a wheelchair ramp installation, or an exemption from a parking fee) that were not explicitly tested yet. Feeding the AI examples or templates of those cases will ensure it can draft letters for various accommodation types, not just the service-animal case.

The Winter team has prepared a well-structured archive of resources and notes for the next team, documenting their reasoning and changes made. It includes, for instance, an explanation of why they decided to focus exclusively on accommodation letters (as opposed to tackling both accommodations and modifications in one agent) – knowledge that will help guide future developers so they don’t reinvent the wheel. Leveraging this prepared data and documentation will be a top priority in the next phase.

Improve the AI’s Reliability and Stability

While the prototype is functional, we observed intermittent issues like the AI repeating itself or getting stuck in loops under certain conditions. Addressing these glitches is critical for a production rollout. The recommendation is to conduct deeper testing and debugging of the model’s behavior under various inputs. Future developers might use techniques like adversarial testing – intentionally inputting confusing or complex information to see where the AI breaks – and then adjusting the prompts or model settings accordingly. There are a few specific issues to fix:

  • The agent occasionally repeats the same question or answer multiple times (this looping behavior might be due to how the conversation history is managed or a quirk of the model). This needs to be debugged so the AI moves on in the script and doesn’t frustrate the user.
  • The agent sometimes fails to recognize certain responses – for example, if a user says “Yeah” instead of “Yes,” will it understand? Ensuring the AI can handle different phrasings and a range of user expressions (including when users might go on tangents or express emotion) is important for robustness; a lightweight normalization step, sketched after this list, is one way to start.
  • Rarely, the agent might still hallucinate or provide an odd response (e.g., referring to sending an email when it shouldn’t). Further fine-tuning and possibly using a more advanced model with better instruction-following could reduce these occurrences. Exploring the underlying model’s parameters or switching to a model known for higher reliability (if available through the AI platform LASSB chooses) could be an option.
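
As a first, low-cost step toward handling varied phrasings, the team (or a future team) could normalize common affirmations and negations before the scripted logic branches on them, and let anything ambiguous fall through to the model or a clarifying question. This is only a sketch; the word lists are illustrative and would need to grow with real transcripts.

    import re

    AFFIRMATIONS = {"yes", "yeah", "yep", "yup", "sure", "correct", "si", "sí"}
    NEGATIONS = {"no", "nope", "nah", "not really"}

    def normalize_yes_no(user_text):
        """Map varied yes/no phrasings to 'yes', 'no', or None if unclear."""
        cleaned = re.sub(r"[^\w\s]", "", user_text).strip().lower()
        if cleaned in AFFIRMATIONS or cleaned.startswith("yes"):
            return "yes"
        if cleaned in NEGATIONS:
            return "no"
        return None  # unclear: ask a clarifying question or defer to the model

    # normalize_yes_no("Yeah!") -> "yes"; normalize_yes_no("I think so?") -> None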

One open question is “why” the model exhibits these occasional errors – it’s often not obvious, because AI models are black boxes to some degree. Future work could involve more diagnostics, such as checking the conversation logs in detail or using interpretability tools to see where the model’s attention is going. Understanding the root causes could lead to more systemic fixes. The team noted that sometimes the model’s mistakes had no clear trigger, which is a reminder that continuous monitoring (as described in evaluation) will be needed even post-launch.

Enhance Usability and Human-AI Collaboration Features

The prototype currently produces a letter draft, but in a real-world setting, the workflow can be made even more user-friendly for both clients and attorneys. Several enhancements are recommended:

Editing Interface

Allow the attorney (or even the client, if appropriate) to easily edit the AI-generated letter in the interface. For instance, after the AI presents the draft, there could be an “Edit” button that opens the text in a word processor-like environment. This would save the attorney from having to copy-paste into a separate document. The edits made could even be fed back to the AI (as learning data) to continuously improve it.

Download/Export Options

Integrate a feature to download the letter as a PDF or Word document. LASSB staff indicated they would want the final letter in a standard format for record-keeping and for the client to send. Automating this (the AI agent could fill a PDF template or use a document assembly tool) would streamline the process. One idea is to integrate with LASSB’s existing document system or use a platform like Documate or Gavel (which LASSB uses for other forms) – the AI could output data into those systems to produce a nicely formatted letter on LASSB letterhead.
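
A minimal sketch of the export step, using the python-docx library to produce a Word document the attorney can review and place on LASSB letterhead; the file-naming convention is an assumption, and a Documate/Gavel or PDF-template integration would replace or extend this.

    from docx import Document

    def export_letter_to_docx(letter_text, client_name, out_path=None):
        """Write the AI-drafted letter into a .docx file for attorney review."""
        out_path = out_path or f"demand_letter_{client_name.replace(' ', '_')}.docx"
        doc = Document()
        for paragraph in letter_text.split("\n\n"):  # keep paragraph breaks
            doc.add_paragraph(paragraph.strip())
        doc.save(out_path)
        return out_path

    # export_letter_to_docx(draft, "Jane Doe") -> "demand_letter_Jane_Doe.docx"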

Transcript and Summary for Attorneys

When the AI finishes the interview, it can provide not just the letter but also a concise summary of the client’s situation along with the full interview transcript to the attorney. The summary could be a paragraph saying, e.g., “Client Jane Doe requests an exception to no-pet policy for her service dog. Landlord: ABC Properties. No prior requests made. Client has anxiety disorder managed by dog.”

Such a summary, generated automatically, would allow the reviewing attorney to very quickly grasp the context without reading the entire Q&A transcript. The transcript itself should be saved and accessible (perhaps downloadable as well) so the attorney can refer back to any detail if needed. These features will decrease the need for the attorney to re-interview the client, thus preserving the efficiency gains.
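
Here is a hedged sketch of how the attorney-facing summary could be generated from the saved transcript, assuming an OpenAI-style chat-completions API; the model name, prompt wording, and summary length are assumptions, and whichever provider LASSB selects would be substituted here.

    from openai import OpenAI

    client = OpenAI()  # assumes an API key is configured in the environment

    SUMMARY_PROMPT = (
        "Summarize this intake interview for a reviewing attorney in 3-4 sentences: "
        "client name, the accommodation requested, the disability connection, the "
        "landlord, and any prior requests. Do not add facts not in the transcript."
    )

    def summarize_transcript(transcript, model="gpt-4o-mini"):
        response = client.chat.completions.create(
            model=model,
            messages=[
                {"role": "system", "content": SUMMARY_PROMPT},
                {"role": "user", "content": transcript},
            ],
        )
        return response.choices[0].message.content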

User Interface and Guidance

On the client side, ensure the chat interface is easy to use. Future improvements could include adding progress indicators (to show the client how many questions or sections are left), the ability to go back and change an answer, or even a voice option for clients who have difficulty typing (this ties into accessibility, discussed next). Essentially, polish the UI so that it is client-friendly and accessible.

Integration into LASSB’s Workflow 

In addition to the front-end enhancements, the tool should be integrated with LASSB’s backend systems. A recommendation is to connect the AI assistant to LASSB’s case management software (LegalServer) via API. This way, when a letter is generated, a copy could automatically be saved to the client’s case file in LegalServer. It could also pull basic info (like the client’s name, address) from LegalServer to avoid re-entering data. Another integration point is the hotline system – if in the future the screening AI is deployed, linking the two AIs could be beneficial (for example, intake answers collected by the screening agent could be passed directly to the letter agent, so the client doesn’t repeat information). These integrations, while technical, would ensure the AI tool fits seamlessly into the existing workflow rather than as a stand-alone app.
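
To show the shape of such an integration, here is a sketch of pushing a finalized letter to a case record over a REST API using the Python requests library. Important caveat: the base URL, endpoint path, field names, and token authentication below are placeholders for illustration only; they are not LegalServer's actual API, which would need to be confirmed against LegalServer's documentation and LASSB's administrator.

    import requests

    # Placeholders only: the base URL, endpoint path, field names, and token
    # auth below are NOT LegalServer's real API and would need to be replaced.
    LEGALSERVER_BASE = "https://example-lassb.legalserver.example/api"
    API_TOKEN = "REPLACE_WITH_CREDENTIALS"

    def attach_letter_to_case(case_id, letter_path):
        """Upload a finalized letter so it is saved with the client's case record."""
        with open(letter_path, "rb") as f:
            response = requests.post(
                f"{LEGALSERVER_BASE}/cases/{case_id}/documents",  # placeholder route
                headers={"Authorization": f"Bearer {API_TOKEN}"},
                files={"file": f},
                data={"title": "AI-assisted reasonable accommodation letter"},
                timeout=30,
            )
        response.raise_for_status()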

Broaden Accessibility and Language Support

San Bernardino County has a diverse population, and LASSB serves many clients for whom English is not a first language or who have disabilities that might make a standard chat interface challenging. Therefore, a key next step is to add multilingual capabilities and other accessibility features. The priority is Spanish language support, as a significant portion of LASSB’s client base is Spanish-speaking. This could involve developing a Spanish version of the AI agent – using a bilingual model or translating the prompt and output. The AI should ideally be able to conduct the interview in Spanish and draft the letter in Spanish, which the attorney could then review (noting that the final letter might need to be in English if sent to an English-speaking landlord, but at least the client interaction can be in their language). 
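
One lightweight way to structure this is a per-language configuration that selects the system prompt for the interview while keeping the landlord-facing letter in English by default; the prompt wording below is illustrative, not the project's actual prompt.

    # Illustrative per-language configuration; the prompt wording is a placeholder.
    LANGUAGE_CONFIG = {
        "en": {
            "system_prompt": "You are LASSB's intake assistant. Interview the client in English...",
            "letter_language": "en",
        },
        "es": {
            "system_prompt": "Eres el asistente de admisión de LASSB. Entrevista al cliente en español...",
            # Interview in Spanish, but draft the landlord-facing letter in
            # English by default unless the reviewing attorney decides otherwise.
            "letter_language": "en",
        },
    }

    def get_config(client_language):
        """Fall back to English if the requested language is not yet supported."""
        return LANGUAGE_CONFIG.get(client_language, LANGUAGE_CONFIG["en"])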

In addition, for clients with visual impairments, the interface should be compatible with screen readers (text-to-speech for the questions, etc.), and for those with low literacy or who prefer oral communication, a voice interface could be offered (perhaps reintroducing a refined version of the phone-based system, but integrated with the letter agent’s logic). Essentially, the tool should follow universal design principles so that no client is left out due to the technology format. This may require consulting accessibility experts and doing user testing with clients who have disabilities. 

Plan for Deployment and Pilot Testing

Before a full rollout, the team recommends a controlled pilot phase. In a pilot, a subset of LASSB staff and clients would use the AI tool on actual cases (with close supervision). Data from the pilot – success stories, any problems encountered, time saved metrics – should be collected and evaluated. This will help answer some open questions, such as: 

  • How do clients feel about interacting with an AI for part of their legal help? 
  • Does it change the attorney-client dynamic in any way? 
  • Are there cases where the AI approach doesn’t fit well (for instance, if a client has multiple legal issues intertwined, can the AI handle the nuance or does it confuse things)? 

These practical considerations will surface in a pilot. The pilot can also inform best practices for training staff on using the tool. Perhaps attorneys need a short training session on how to review AI drafts effectively, or intake staff need a script to explain to clients what the AI assistant is when transferring them. Developing guidelines and training materials is part of deployment. Additionally, during the pilot, establishing a feedback loop (maybe a weekly meeting to discuss all AI-drafted letters that week) will help ensure any kinks are worked out before scaling up. 

Address Open Questions and Long-Term Considerations

Some broader questions remain as this project moves forward.

How to Handle Reasonable Modifications

The current prototype focuses on reasonable accommodations (policy exceptions or services). A related need is reasonable modifications (physical changes to property, like installing a ramp). Initially, the team planned to include both, but they narrowed the scope to accommodations for manageability. Eventually, it would be beneficial to expand the AI’s capabilities to draft modification request letters as well, since the legal framework is similar but not identical. This might involve adding a branch in the conversation: if the client is requesting a physical modification, the letter would cite slightly different laws (e.g., California Civil Code related to modifications) and possibly include different information (like who will pay for the modification, etc.). The team left this as a future expansion area. In the interim, LASSB should be aware that the current AI might need additional training/examples before it can reliably handle modification cases.

Ensuring Ongoing Ethical Compliance

As the tool evolves, LASSB will need to regularly review it against ethical guidelines. For instance, if State Bar rules on AI use get updated, the system’s usage might need to be adjusted. Keeping documentation of how the AI works (so it can be explained to courts if needed) will be important. Questions like “Should clients be informed an AI helped draft this letter?” might arise – currently the plan would be to disclose if asked, but since an attorney is reviewing and signing off, the letter is essentially an attorney work product. LASSB might decide internally whether to be explicit about AI assistance or treat it as part of their workflow like using a template.

Maintenance and Ownership 

Who will maintain the AI system long-term? The recommendation is that LASSB identify either an internal team or an external partner (perhaps continuing with Stanford or another tech partner) to assume responsibility for piloting and updates.

AI models and integrations require maintenance – for example, if new housing laws pass, the model/prompt should be updated to include that. If the AI service (API) being used releases a new version that’s better/cheaper, someone should handle the upgrade. Funding might be needed for ongoing API usage costs or server costs. Planning for these practical aspects will ensure the project’s sustainability.

Scaling to Other Use Cases

If the demand letter agent proves successful, it could inspire similar tools for other high-volume legal aid tasks (for instance, generating answers to eviction lawsuits or drafting simple wills). One open question is how easily the approach here can be generalized. The team believes the framework (AI + human review) is generalizable, but each new use case will require its own careful curation of data and prompts. 

The success in the housing domain suggests LASSB and Stanford may collaborate to build AI assistants for other domains in the future (like an Unlawful Detainer Answer generator, etc.). This project can serve as a model for those efforts.

Finally, the team offered some encouraging closing thoughts: the progress so far shows that a tool like this “could significantly improve the situation and workload for staff at LASSB, allowing many more clients to receive legal assistance.” There is optimism that, with further development, the AI assistant can be deployed and start making a difference in the community. However, they also cautioned that “much work remains before this model can reach the deployment phase.”

It will be important for future teams to continue with the same diligent approach – testing, iterating, and addressing the AI’s flaws – rather than rushing to deploy without refinement. The team emphasized a balance of excitement and caution: AI has great potential for legal aid, but it must be implemented thoughtfully. The next steps revolve around deepening the AI’s capabilities, hardening its reliability, improving the user experience, and carefully planning a real-world rollout. By following these recommendations, LASSB can move from a successful prototype to a pilot and eventually to a fully integrated tool that helps their attorneys and clients every day. The vision is that in the near future, a tenant with a disability in San Bernardino can call LASSB and, through a combination of compassionate human lawyers and smart AI assistance, quickly receive a strong demand letter that protects their rights – a true melding of legal expertise and technology to advance access to justice.

With continued effort, collaboration, and care, this prototype AI agent can become an invaluable asset in LASSB’s mission to serve the most vulnerable members of the community. The foundation has been laid; the next steps will bring it to fruition.


Brainstorming new Language Access self help ideas

Brainstorming Potential Solutions in the Design for Justice Class: Language Access (Week 3)

By Sahil Chopra

Having experienced the court firsthand, we returned to the classroom to revisit the tenets of Design Thinking and coalesce our thoughts, before engaging in a productive, rapid-brainstorming session.

Here’s a quick reminder of 5 “tenets” behind the design philosophies that drove our brainstorming:

  1. There is no “one perfect idea”. In fact, it is quite limiting to focus on “quality” ideas, i.e. those that seem practical or reasonable. In this initial phase of brainstorming, you should let your imagination roam free. You might be surprised by the ways you can turn an unreasonable idea into a truly impactful one.
  2. Don’t judge others. You can only be truly collaborative and helpful if you reserve judgement upon others’ ideas. Don’t analyze. Don’t constrain. Don’t judge.
  3. Be concise and specific. Yes, we all want to help provide language access to millions of Californians; but ideas won’t get us all the way there. In order to brainstorm effectively, you have to think “physical”, i.e. what could you make or build in an ideal world. Don’t think in abstractions, but in realities.
  4. Always respond to ideas with the phrase “yes and”. Saying “no” and “yes but” are conversation killers. Even if you don’t totally agree with an idea, embrace it and try to add your own spin to it, building upon it by saying “yes and”.
  5. Go for wild, ambitious, and impossible. Think big! We can always scale back later.

With these principles in mind, we drew upon our observations of the prior week to develop a list of current positives and negatives with language access at the court. We then brainstormed a list of potential successes and pitfalls we might face while trying to improve language access.

Current Positives

  • Empathy: Sitting in on family court trials and observing the interactions between court staff and clients, it was apparent that those who work at the courthouse truly care. They are overwhelmed and understaffed, but they truly believe in the work and are trying their best to service the hundreds of clients that walk through the door each day.
  • Pathfinding: Signage was plentiful, though it could be improved with multilingual cues. The docket system, hosted on the large vertical flat screens, was especially useful for orienting oneself upon entering the courthouse.

Current Negatives

  • Form Accessibility: It’s often difficult to know what pieces of ancillary information are needed to fill out the form, which sections pertain to you personally, etc. There are workshops to help people fill out the forms, but they are understaffed; and the videos shown as part of the divorce workshop we observed weren’t entirely helpful, as they did not actually walk through the forms themselves.
  • Waiting: People line up in the self-help queue starting at 7:00 am, even though the service starts helping individuals at 9:00 am. The wait times are long.
  • Language: Many people who don’t speak English bring translators, but translators must be adults (18 or older); and involving someone else in the legal process means that the translator must also leave work, skip school, etc.

Future Positives (Ideas)

  • Real-Time Translation
  • Human-Oriented Experiences
  • Space Optimization (Court House)
  • Efficiency (Simplify Forms, Reduce Lines, etc.)

Future Negatives (Considerations)

  • Litigation
  • Budget Cuts / Restrictions
  • Buy-In: Unions, Staff, Judges, Clients

With these themes established, we brainstormed the following 10 ideas:

  1. Interactive Forms
    1. Concept: Make forms interactive on a website such that they become “choose-your-own-adventure.” Use simple questions written in the person’s native language to determine which portions of the form are necessary for the individual to actually fill out.
    2. Goal: This should make form-filling a more accessible and personalized experience. Hopefully, this makes the process for filing easier and less intimidating.
  2. Multilingual FAQs
    1. Concept: Update the court’s website with FAQs in various languages. Prospective users could read these FAQs for their specific problem before coming to the court itself, so that they have a better understanding of the court process for their issue before coming in. Similarly, these could be provided to those in the self-help line to read before they are served.
    2. Goal: This will improve understanding of the court processes in order to empower individuals with a sense of control.
  3. Multilingual Court Navigation Instructions
    1. Concept: Create an app or website, with top 5 languages spoken by LEP court users, that explains court layout, functions and services at each office, and language support services.
    2. Goal: The user can find answers to common questions on their phone and use it to navigate the courthouse and its services. This will save headaches about how to get from point A to point B and help court users more easily address their legal needs.
  4. Online Multilingual Workshop Videos
    1. Concept: Provide client with multilingual YouTube videos explaining the mechanics of different common problems (e.g. divorce) that people go to the court to address.
    2. Goal: Right now the videos are only in English and only viewable in the workshop, which poses a double barrier to accessibility. Multilingual YouTube videos may reduce the burden on the workshop staff and provide a better, more informative experience to non-native speakers.
  5. Chunk Workshop Video Into Sections
    1. Concept: Split workshop videos into chunks rather than the current 45-minute video. Also, integrate the form-filling within the video watching experience. Rather than a presentation, the workshop videos should directly help the users fill out the necessary and related components of their paperwork.
    2. Goal: Currently, the videos are an information overload. Many definitions are not listed on the slides, viewers cannot rewind the video in the workshop, and the video does not directly correspond to the sections of the forms that users have to fill out. Eliminate these problems by providing information in nugget-sized portions and tightly coupling the video experience with the forms.
  6. NLT for Court Forms
    1. Concept: Integrate the web forms with Google Translate, or some other legal translation software.
    2. Goal: All forms must be submitted in English according to California State Law. Even if the forms are presented in Spanish, the user must respond in English — which poses a huge barrier without an interpreter. Instead, bring Natural Language Translation (NLT) systems to the user, so this form-filling process becomes much easier.
  7. Symbolic Signage at Court
    1. Concept: Replace English signs in help center with symbol-rich signs that are easier to understand and follow.
    2. Goal: Symbol-rich signs will be able to better direct court users to get the forms they need and access the services they require. This will improve the physical experience of navigating the courthouse.
  8. Brochure Placement
    1. Concept: Redesign help center brochures to be color coded according to languages and then placed in different sections of the room, according to language.
    2. Goal: By offering forms in both languages, court users can identify the right forms and will be able to understand them. They can then write their answers on the corresponding English language forms.
  9. Robotic Assistants
    1. Concept: Create mobile booths in different areas where people could lodge cases in their languages by speaking into a phone line which will then capture the information and translate it into English. The robotic booth will then print the documents which the user can scan and download through the mobile application.
    2. Goal: Reduce trauma and negative attitudes towards the court system by promoting privacy of individuals coming to court.
  10. Real Time Translation Services
    1. Concept: Have tablets and headphones available for rent upon court entrance that guide you in your respective language to where you need to go (with pictures) and act as real-time translators with court actors.
    2. Goal: Facilitate the processes of moving through the court and interacting with court personnel despite language barriers.

With these ideas in mind, we are going to spend next week whittling down these to five favorites, drawing out the ideas, and then interviewing individuals at the courthouse as to what they like and/or dislike about these potential solutions to language access problems.


Observing a county court for language access

Initial Observations at the Santa Clara Family Justice Center (Week 2)
By Sahil Chopra

During our second week of the course, we paid our first visit to the Santa Clara Family Justice Center in order to observe, explore, and immerse ourselves in the court experience. Our day at court was structured around exploring the self-help facilities together before branching out in smaller groups to more intimate parts of the courthouse. My team drove down to the court and arrived at around 8:30 am, just as the self-help waiting room started to fill up. We jotted down a few stray observations before convening with the rest of our class in the lobby at 9:00 am, where our instructors Margaret and Jonty handed out a few Design Review pamphlets for our day at court, wherein we continued to write down our observations and thoughts.

Here are the highlights from our first trip to court. Next week, we shall pool our individual observations and insights, as we brainstorm what potential problems and solutions might be.

Self-Help Desk

Definition:

Many users do not have access to a lawyer, so the court provides a self-help desk, where individuals wait in a queue until court staff call up their ticket number and help them address their problem — whether that be information about the filing process or guidance as to which forms must be filled out in order to proceed with their case. While the self-help desk provides an invaluable service, it is often understaffed. As a result, court users often line up outside the Family Court around 7:00 am, though the center does not open until 8:30 am and does not start processing tickets until about 9:00 am. When it comes to language access, there is not much the self-help desk can provide on its limited budget. If one does not speak English, he/she/they must bring along a translator, a legal adult in the state of California (i.e. 18 years or older), preferably a relative. If they come without a translator, they will ultimately be turned away.

Highlights:

The self-help waiting room feels like a hybrid of the DMV and a doctor’s office. Everyone sits side-by-side, but in their own little world. Entering the room, there are black chairs lining the perimeter, except along the left-hand wall, which is covered with assorted forms. While the wall seemed very well organized, i.e. color-coded, accessible, etc., very few people approached it to pick up flyers. Perhaps the singular placement of all essential forms seemed overwhelming?

Sitting in the crowd, it was easy to spot parents who had brought their teenagers to help them with their paperwork. In a hushed voice, a sixteen-year-old boy read over an assortment of forms, quickly translating them for his mom. Translation services would help decongest the overflowing waiting room by limiting the number of family members that need to be brought along. Additionally, it would be beneficial for both the kids and the parents if the children did not have to take time off school.

Workshop

Definition:

Throughout the week, there are several workshops that the self-help desk hosts, wherein the process for filing a specific motion is discussed and then assistance is provided with form-filling. It just so happened that our visit coincided with a divorce workshop.

After spending some time in the self-help room, a few of us decided to observe the workshop.

Highlights:

While we were sitting in the self-help room, one of the court staff came out and announced who had made it into the workshop and who had not. It seemed a bit impersonal and harsh to be called out by name, especially when everyone knows the issue associated with your use of the court. But maybe that helps normalize the act of getting help?

The informational portion of the workshop consists of a 50-minute, screen-captured PowerPoint presentation with narration. It was interesting that there were more spots for the video portion of the workshop than for the 1:1 assistance portion, even though the latter feels more important to the goal of filing a motion. This discrepancy between maximum capacity and serviceable capacity highlights the need for more staff.

The PowerPoint video described the technical legal terminology and processes surrounding divorce. While informative, the video didn’t seem to be helpful. Within the room, one couple talked over the video — trying to fill out their paperwork, as the video played. Most of the other viewers seemed to pay attention for the first five minutes before sliding into their chairs and waiting out the remainder of the video’s runtime.

The first problem with the video is that it is entirely in English. If you don’t speak English well, you’ve just wasted 50 minutes that could have been spent getting help.

The second problem with the video is that it is too long and lacked participant engagement. It’s important to be precise and informative, especially when dealing with legal matters; but the video consisted of a PowerPoint and a voiceover. There was no color and there were few pictures. Furthermore, it did not actually help with the process of filling out the forms. Without interactivity, the video failed to provide actionable instructions — thus failing its purpose of providing help to individuals who needed assistance in filing for divorce.

The third problem with the video is that it is inaccessible. It cannot be accessed outside the workshop, and even within the workshop it cannot be paused or rewound. Thus, it fails its purpose of being a one-stop reference for all things divorce-related. Additionally, the video was poorly constructed: many of the important facts were spoken but never transcribed on the slides, even though the slides were full of text.

Possible Language Access/Self-Help Solutions

After sitting through the workshop, I think there is a lot of low-hanging fruit here, i.e. small changes that can be made to improve outcomes and scale the program — even in the face of budgetary issues.

Solution 1 (Low Overhead): There are many computers in the workshop room. Instead of making everyone watch the PowerPoint video together, provide every workshop-attendee a pair of headphones, so that they can pause and rewind the video wherever they want.

Solution 2 (Low Overhead): Split the presentation into digestible chunks. After each video section have the workshop-attendees fill out the respective portion of the form. This tight coupling is often used in flipped classrooms and should make the process more self-directed.

Solution 3 (Low Overhead): Post the video and presentation online. Let people view the contents and fill out the form digitally at home.

Solution 4 (High Overhead): Translate the presentation into several key languages, e.g. Spanish, Vietnamese, Korean, Hindi, Mandarin. This is a one-time job but would improve accessibility tremendously.

Miscellaneous Observations

After experiencing the divorce workshop firsthand, we decided to sit in on a few of the court hearings that were open to the public. Before we headed up the stairs to the courtrooms, I stepped away to get some water. In the five minutes that I was gone, my teammates encountered a Latina woman who could not speak English well. She was asking where she could find the police; and it was only after a few exchanges that my teammates realized that she was looking for “something to keep [a person] away”, i.e. a restraining order. They then showed her the route to the appropriate court office, but it was apparent that there needs to be better outreach within local cultural and ethnic communities, covering the purpose of the court, the terminology surrounding it, and the services it can provide. This might help reduce friction for those seeking support, especially non-native speakers. Perhaps outreach at libraries, churches, and grocery stores might help with this problem.

Overall, I was surprised to see how calm and collected the judges were in responding to and guiding the proceedings. It seemed as if they really cared about both parties involved. The empathy demonstrated was quite moving, especially given how messy some of the court cases were.


Identifying A Single Prototype for language access improvement

By Sahil Chopra

(Part of a series of posts documenting the Design for Justice: Language Access class)

Entering the home stretch of the Autumn quarter, we spent today’s class first synthesizing our findings and working on our final pitch to the California Judicial Council, and then selecting one of our prototypes for further development.

To start the synthesis process, we grabbed a whiteboard and divided it into two halves — with one side dedicated to answering “What we heard or saw?” and the other dedicated to answering “What do we do in response?” Starting with the former question, we began jotting down quotes and experiences we had catalogued over the past few weeks from our interviews and observations, before clustering them around common topics. This exercise yielded two incredibly salient themes that we hope to address with our revised prototype:

 

  • Time: People fear the courthouse because it takes an inordinate amount of time, and as a result deprives them of economic and educational opportunities they would otherwise be accumulating, had they not spent hours upon hours and days upon days within the courthouse. One woman we interviewed exclaimed that “divorce right now is almost a full-time job,” while another lamented that the amount of time she had to spend in court affected her kids’ academics, as they had to accompany her so that she could have the proper assistance necessary to fill out the English-language forms.
  • The process of getting proper help seems to take too much time because the self-help desk is understaffed and because court users produce a large number of errors while filling out their paperwork. Many of the people we have interviewed over the past several weeks have mentioned that they often spend several hours waiting to be helped, only to be told that they made a mistake in their documents, and are then sent to the back of the line to seek guidance.
  • As a result, we witness a vicious cycle. The self-help desk is constantly creating its own backlog of requests, ultimately increasing the stress and time allotted per case — for both the clerks and the court users. In response to this feedback, one of our primary goals is to reduce the amount of time that a user has to spend in order to fill out and submit their paperwork. This will give users a more pleasant and accessible court experience, while reducing the stress on the self-help clerks.

  • Language Barriers Are Multifaceted: One thing we did not realize until we began user testing was how multifaceted a problem language barriers actually are. When presenting our “Redesigned-Form” prototype to non-native speakers last week, we set up a scenario in which we asked our interviewees to file for divorce. On the second page of the prototype, we asked our court users to declare whether they wanted a “Divorce”, “Legal Separation”, or “Nullity” from a “Marriage” or “Domestic Partnership”. While it was clear to the user that they were on a page associated with divorce, they were unsure what the differences were between a “divorce”, a “legal separation”, and a “nullity”. Even to a native English speaker these terms can seem foreign, as they are rooted in precise legal terminology; so one apparent aspect of “language access” is to provide court users with simple language that unpacks these precise terms. But the problem of language access extends far beyond legal terminology and words in different languages. There are often significant cultural barriers as well. When interviewing a technologically savvy Uyghur woman, we saw her try opening Google Translate on her phone, typing in the phrase, and having the service produce a Mandarin version of the text. The problem, however, was that the concept did not exist in her culture; so even though she had the translated phrase, the concept did not register. This highlights the fact that language access does not simply involve English-language barriers, but cultural ones as well. We must overcome both in order to provide true access to court systems.

 

With this in mind, we shifted to the other half of the board, answering “What do we do in response?” Here are a few of the ideas from that brainstorm:

  1. Split the current forms into manageable chunks so that we do not overwhelm court users, and narrow the context of any page down to a single topic so that it becomes easier for a non-native speaker to identify the goal of the page, even if they struggle to understand the bulk of it.
  2. Provide native-language instructions and definitions that unpack legalese in layman’s terms and pay attention to cultural differences in their explanations of legal terms.
  3. Add legal advice forums like r/legal-advice to the court website, and provide a platform for non-native speakers to voice their experiences to others within their communities. We heard from many younger court users that they looked to blogs online in order to understand the experience they were about to undertake as a user of the court. These blogs reassured them and provided guidance when they were most confused. It would be cool to provide this type of support on the court website and extend it to non-native speakers.

Moving forward, we are going to further pursue our “Redesigned-Form” prototype, diving deeper into the Divorce Experience to provide a more nuanced prototype experience.


Design for Justice: Language Access — an introduction in week 1

by Sahil Chopra

Language is the medium by which we interact with culture, express our ideas, and maintain our rights. Without “language access”, i.e. the ability to convey one’s thoughts effectively and understand others correctly, one is disempowered altogether. At a societal level this can lead to systemic inequality, whether intentional or not; and one of the places where this is most evident is the court system.

This Autumn, I’m one of the 25 students enrolled in Stanford’s Design for Language Access, a course initiated by the Stanford Legal Design Lab to investigate and advise how state courts may better serve Californians entering the legal system who either do not speak or have limited proficiency in English.

As the Judicial Council of California’s Strategic Plan for Language Access in the California Courts details, 40% of Californians speak non-English languages at home, 200+ languages and dialects are spoken by Californians as a whole, and approximately 20% of Californians have English language limitations. Going to court is always a stressful experience, as the impetus to seek court help is often a difficult circumstance itself. Coupling the weight of the incident with the inability to communicate and properly resolve your issue only magnifies the stress incurred by the individual. Moreover, it may be difficult to properly resolve one’s legal issue and receive proper access to one’s legal rights if one is unable to effectively communicate with lawyers, clerks, and judges within the judicial branch. Thus, “language access”, as the Judicial Council of California titles it, is a critical issue that we must address in order to ensure fair and equitable legal proceedings.

Personally, I have no prior background with judicial systems. I’m a computer scientist by training, completing my BS/MS with concentrations in Artificial Intelligence and Human Computer Interaction – focusing the bulk of my research on cognitive science and natural language processing. But that’s where the diverse experience of my classmates comes in. We are lawyers, teachers, designers, business students, and computer scientists — all hoping to better understand this space and offer a different perspective.

Over the next nine weeks, we shall apply the fundamental principles of “Design Thinking” to first observe and interview individuals going through the court system and then hypothesize, prototype, and test potential strategies that may provide better language access to millions of Californians. Our class will culminate in a list of possible solutions and implementations which the California courts may consider as potential avenues by which the state can improve language access at scale. Additionally, we shall be evaluating a pilot program that the California courts are running in San Jose, where tablets with Google Translate are being employed to help ease communication between non-English-speaking clients and English-speaking court staff.

Stay tuned to learn more week-by-week about our journey to help provide better language access to Californians!


The evolution of an eviction self-help website

by Margaret Hagan, also published at Legal Design and Innovation

Along with Daniel Bernal, I’ve been teaching a Stanford d.school pop-up class, Design for Justice: Eviction. We’ve been working with a team of 10 students and a network of experts, legal aid groups, and courts, to plan out new ways to support people who have received eviction notices.

Design for Justice: Eviction concept board for self-help

The challenge of the class is: what can we provide to people who have just received an eviction summons and complaint in the mail, to help them understand their rights and feel empowered to show up to court?

In the first class, Daniel laid out the background research and concepts. He is working on his PhD with this challenge as his focus, and he got the team up to speed on the current legal landscape and self-help offerings.

From there, our student teams began scoping hypotheses — new insights and concept designs of what could address the challenge. Then, within a month, we vetted these with our network of experts, to get their ranking of importance and viability. And our designers and developers sprinted to create medium-fidelity, working versions of the concepts that were vetted.

This past weekend, we subjected these mid-fidelity prototypes to user testing, with people who have been evicted previously. We’ll be writing up our findings more thoroughly later — but for now we just wanted to show the evolution of one design over a month.

The evolution of a legal self-help website

One of the main vehicles for this self-help will be a website. Here is its journey in sketches and images.

 
October 24th: At a lunchtime test-run workshop, a student/faculty team proposes an online resource for tenants facing eviction
 
Jan. 19th: At a class planning meeting, our teaching team sketches out one of the possible website prototypes that might emerge during our class
 
April 28th: At our proper class, one of the students, David, creates a one-page concept sketch of an interactive self-help website
 
May 3: plotting all the possible functions that could go on a page
 
May 11: Boiling down all the functions to a cleaner flow, simple sketch — the start of some color
 
May 13th: Focusing the details, the messaging of the culled down website
 
May 18: Getting specific (though a little messy) about 3 different variations to test against each other
 
May 19th: Testable live version of website — not perfect in terms of content or visuals, but a skeleton of the functions and flow we’re looking for

We had 3 different versions of the May 19th website — we’ll be streamlining these based on feedback into one higher-fidelity site. We’re digesting all the user feedback we received at our latest testing session to redraft the site. All this is aiming toward a trial of new self-help interventions, likely including a website, that Daniel will run over the summer to see how people engage with and use them in real life.

The evolution will continue, stay tuned!