As new services and tech projects launch to serve the public, two questions come up regularly:
- How do we measure if these new justice innovations do better than the status quo?
- How can we compare the risk of harm that these new services & technologies pose to consumers against the risk posed by a human lawyer, or by no services at all?
This entails diving into the discussion of legal services mistakes, risks, harms, errors, complaints, and problems. Past discussions of these legal service problems tend to be fairly abstract. Many regulators & industry groups focus on consumer protection at a high level: how can we protect people from low-quality, fraudulent, or problematic legal services?
This high-level discussion of legal service problems doesn’t lend itself well to specific measurements. It’s hard to assess whether a given lawyer, justice worker, app, or other service-tech tool is more or less protective of a consumer’s interest.
I’ve been thinking a lot about how we can more systematically and clearly measure the quality level (and risk of harm) of a given legal service. As I’ve been exploring & auditing AI platforms for legal problem-solving, I’ve found that this kind of systematic evaluation is needed to assess the quality issues on these platforms.
Measuring Errors vs Measuring Consequences
As I’ve been reading through work in other areas (particularly health information and medical systems), I’ve found the work of medical & information researchers to be very instructive. See one such article here.
One of the big things I have learned from medical safety analysis has been the importance of separating the Mistake/Error from the Harm/Consequence. Medical domain experts have built 2 sets of resources:
- Taxonomies of provider errors and safety events (where a provider or technology makes a mistake, or has a ‘concerning event’). See some examples of these medical problem events here.
- Taxonomies of user harms and consequences (what the patient experiences after the error event has occurred). See these harms, ranked from most serious to least.
Separating the provider error from the user harm was something of a revelation. Not all errors result in harm, and not all harms have the same severity & level of concern.
As I study AI provision of legal services, what stands out is that AI might make an error, but this does not always result in harm. For example, the AI might tell a person the wrong timeline around eviction lawsuits. The person might screenshot this incorrect AI response and send it to their landlord: “I actually have 60 days to pay back rent before you can sue me. See what ChatGPT says!” The landlord might cave, and give that person 60 days to pay back rent. The user hasn’t experienced harm, even though there was an error. That’s why it’s worthwhile to separate these problems into the Mistake and the Harm.
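To make the distinction concrete, here is a minimal sketch of how the two measurements could be kept apart when reviewing a batch of interactions. The `Outcome` record, the `error_and_harm_rates` function, and the sample data are all hypothetical, made up for illustration; they are not drawn from any real audit.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    """One reviewed interaction (hypothetical record format)."""
    had_error: bool  # did the provider make a mistake?
    had_harm: bool   # did the user experience a bad consequence?


def error_and_harm_rates(outcomes):
    """Return (share of interactions with an error,
    share of those errors that led to harm)."""
    total = len(outcomes)
    errors = [o for o in outcomes if o.had_error]
    harmed = [o for o in errors if o.had_harm]
    error_rate = len(errors) / total if total else 0.0
    harm_given_error = len(harmed) / len(errors) if errors else 0.0
    return error_rate, harm_given_error


# Illustrative data only.
sample = [
    Outcome(had_error=True, had_harm=False),   # like the eviction example: error, no harm
    Outcome(had_error=True, had_harm=True),
    Outcome(had_error=False, had_harm=False),
    Outcome(had_error=False, had_harm=False),
]

rates = error_and_harm_rates(sample)
print(rates)  # (0.5, 0.5)
```

Reporting the two numbers separately avoids treating every error as if it were a harm, which is the point of the Mistake/Harm split.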
Planning out a protocol to measure legal services errors & harms
Here is how I have been developing a mistake-harm protocol to assess legal services (including AI platforms answering people’s questions). This is a first draft, and I invite feedback on it:
Step 1: Categorize what Legal Service Interaction you’re assessing. Does the legal service interaction fit into one of these common categories?
- Provision of info and advice in response to a client’s description of their problem, including statement of law, listing of options, providing plan of steps to take (common in brief services, hotlines, chats, AI)
- Filling in a document or paperwork that will be given to court or other party, including selection of claims/defenses
- Intake, screening about whether the service can help you
- Prep and advocacy in a meeting, hearing, mediation, or similar
- Negotiation, Assessment of options, and Decision advice on key choices
- (Meta) Case Management of the person’s problem, journey through the system
- (Meta) Pricing, billing, and management of charges/payments for the service
Step 2: Categorize what Problem or Mistake has happened in this interaction (with the thought that we’ll have different common problems that happen in these different service interactions above)
Preliminary list of problems/mistakes:
- Provider supplies incorrect (hallucinated, wrong jurisdiction, out of date, etc.) info about the law, procedure, etc.
- Provider supplies correct info, but in a way that the user does not understand well enough to make a wise choice
- User misinterprets the provider’s response
- Provider provides biased information or advice
- User experiences the provider as offensive, lacking in dignity/respect, or hurtful to their identity
- Provider incorrectly shares private data from user
- Provider is unreasonably slow
- Provider charges unreasonable amount for service
Step 3: Identify if any Harm or Consequence occurred because of the problem. Not all of the situations above result in any harm at all, and where harm does occur, there are different degrees of it.
Possible harms that the user or broader community might experience if the problems above occur:
- User does not raise a claim or defense that they are entitled to, and that might have gotten them a better legal judgment/outcome.
- User raises an inapplicable claim, cites an incorrect law, brings in inadmissible evidence – makes a substantive or procedural mistake that might delay their case, increase their costs, or lead to a bad legal judgment.
- User spends $ unnecessarily on a legal service.
- User’s legal process is longer and costlier than needed.
- User brings claim with low likelihood of success, and goes through an unnecessary legal process.
- User’s conflict with other party worsens, and the legal process becomes lengthier, more expensive, more acrimonious, and less likely to improve their (or their family’s) social/financial outcomes.
- User feels legal system is inaccessible. They are less likely to use legal services, court system, or government agency services in future problems.
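The three steps above could be recorded as structured data, so that assessments of different providers can be tallied and compared. Below is a minimal sketch in Python. The enum member names are my own shorthand for the draft categories listed above, and the `Assessment` record and `summarize` function are hypothetical, not an established coding scheme.

```python
from collections import Counter
from dataclasses import dataclass, field
from enum import Enum, auto


class InteractionType(Enum):  # Step 1 (shorthand names are assumptions)
    INFO_AND_ADVICE = auto()
    DOCUMENT_PREP = auto()
    INTAKE_SCREENING = auto()
    HEARING_ADVOCACY = auto()
    NEGOTIATION_DECISION = auto()
    CASE_MANAGEMENT = auto()
    PRICING_BILLING = auto()


class Problem(Enum):  # Step 2
    INCORRECT_INFO = auto()
    NOT_UNDERSTANDABLE = auto()
    USER_MISINTERPRETS = auto()
    BIASED = auto()
    DISRESPECTFUL = auto()
    PRIVACY_BREACH = auto()
    TOO_SLOW = auto()
    OVERCHARGED = auto()


class Harm(Enum):  # Step 3
    MISSED_CLAIM_OR_DEFENSE = auto()
    SUBSTANTIVE_OR_PROCEDURAL_MISTAKE = auto()
    WASTED_MONEY = auto()
    LONGER_COSTLIER_PROCESS = auto()
    LOW_MERIT_CLAIM = auto()
    WORSENED_CONFLICT = auto()
    LOST_TRUST_IN_SYSTEM = auto()


@dataclass
class Assessment:
    """One coded interaction: what it was, what went wrong, what followed."""
    interaction: InteractionType
    problems: list[Problem] = field(default_factory=list)
    harms: list[Harm] = field(default_factory=list)


def summarize(assessments):
    """Count problems and harms, and how many problems led to no harm."""
    problem_counts = Counter(p for a in assessments for p in a.problems)
    harm_counts = Counter(h for a in assessments for h in a.harms)
    problems_without_harm = sum(
        1 for a in assessments if a.problems and not a.harms
    )
    return problem_counts, harm_counts, problems_without_harm


# Illustrative records only; the second and third are invented.
records = [
    # The eviction-timeline example: wrong info, but no harm followed.
    Assessment(InteractionType.INFO_AND_ADVICE, [Problem.INCORRECT_INFO]),
    Assessment(
        InteractionType.DOCUMENT_PREP,
        [Problem.INCORRECT_INFO],
        [Harm.MISSED_CLAIM_OR_DEFENSE],
    ),
    Assessment(InteractionType.INTAKE_SCREENING),  # nothing went wrong
]

problem_counts, harm_counts, problems_without_harm = summarize(records)
print(problem_counts[Problem.INCORRECT_INFO])      # 2
print(harm_counts[Harm.MISSED_CLAIM_OR_DEFENSE])   # 1
print(problems_without_harm)                       # 1
```

Keeping problems and harms as separate lists on each record lets one interaction carry several of each, and lets a problem be logged even when no harm follows. A severity ranking for harms is not modeled here, since the draft list above does not yet define one.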