Why Public AI Tools Are Dangerous for Legal Work: Mata v. Avianca, Hallucinations, and Data Exposure
Mata v. Avianca (S.D.N.Y. June 22, 2023) is the case that changed how courts and state bars think about public AI tools in legal practice. Judge P. Kevin Castel's $5,000 sanction for submitting ChatGPT-hallucinated case citations was not the industry's first AI failure — but it was the first to generate a 46-page opinion that state bars have since treated as a regulatory blueprint. This analysis examines why public AI tools — as opposed to private, isolated systems — create hallucination, data exposure, and training data contamination risks that professional responsibility rules require attorneys to address.
⚖ Mata v. Avianca, Inc. — S.D.N.Y., June 22, 2023
| Citation | No. 1:22-cv-01461-PKC, 678 F. Supp. 3d 443 (S.D.N.Y. June 22, 2023) |
| Judge | Hon. P. Kevin Castel, U.S. District Judge |
| Sanctioned Attorneys | Steven A. Schwartz; Peter LoDuca — Levidow, Levidow & Oberman P.C. |
| AI Tool Used | ChatGPT — public consumer version (not enterprise) |
| Violation | Submitted a brief citing at least six non-existent cases fabricated by ChatGPT; could not produce the cited opinions when ordered to by the court |
| Sanctions | $5,000 penalty, imposed jointly and severally on Schwartz, LoDuca, and the firm |
| Additional Penalty | Letters to the client and to each judge falsely identified as an author of a fabricated opinion; public identification in the opinion |
| Source | Justia: Sanctions Order, Mata v. Avianca → |
⚖ Wadsworth v. Walmart Inc. — D. Wyo. 2023
| Citation | No. 2:23-cv-00042-NDF (D. Wyo. 2023) |
| Court | U.S. District Court for the District of Wyoming |
| AI Tool | ChatGPT — public consumer version |
| Issue | Brief contained non-existent case citations; court struck affected sections |
| Outcome | Attorney ordered to file declaration describing AI use protocol for all future filings; portions of brief stricken |
| Significance | Established that courts will require prospective compliance protocols — not just sanctions — following AI hallucination incidents |
Why ChatGPT Fabricates Case Citations: The Technical Mechanism
The popular press described Mata v. Avianca as an "AI hallucination" story, as though ChatGPT's citation fabrication were an anomalous malfunction. It was not. Fabricated citations are the predictable output of a system that was never architecturally designed for legal research accuracy. Understanding why public ChatGPT fabricates citations — and why this is a fundamental architectural limitation rather than a bug awaiting a fix — is essential for any attorney evaluating AI tools for legal work.
What Large Language Models Actually Do
GPT-4, Claude, Gemini, and other large language models are transformer-based neural networks trained on text corpora to predict the next token in a sequence. When you ask ChatGPT for case citations supporting a legal proposition, the model generates text that is statistically consistent with how case citations appear in its training data. It does not search Westlaw. It does not query a legal database. It does not verify that the citation corresponds to a real decision. It generates strings of characters that look like case citations because its training data contained millions of real case citations.
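The distinction is concrete enough to sketch. The toy Python below (hypothetical names, standing in for no particular model) contrasts what a language model effectively does when asked for authority, which is to produce a string shaped like a citation, with what verification actually requires, which is an exact match against a controlled corpus of real decisions. The party names echo Varghese v. China Southern Airlines, one of the fabricated cases cited in the Mata brief.

```python
# Illustration only: citation-shaped text generation vs. database verification.
# This is a toy sketch, not any vendor's model or API.
import random

def citation_shaped_string(plaintiff: str, defendant: str) -> str:
    """Assemble text that merely *looks* like a federal reporter citation.
    A language model does something statistically analogous: it emits tokens
    matching the surface pattern of citations in its training data, with no
    lookup against any authority."""
    volume = random.randint(1, 999)
    page = random.randint(1, 1500)
    year = random.randint(1995, 2022)
    return f"{plaintiff} v. {defendant}, {volume} F.3d {page} (2d Cir. {year})"

def verified_against_database(citation: str, verified_corpus: set[str]) -> bool:
    """What diligence actually requires: confirming the citation exists in an
    authoritative source (Westlaw, Lexis, Fastcase), not that it looks right."""
    return citation in verified_corpus

fabricated = citation_shaped_string("Varghese", "China Southern Airlines")
print(fabricated)                                    # plausible-looking, confidently formatted
print(verified_against_database(fabricated, set()))  # False: it exists in no database
```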
This is the hallucination mechanism: the model confabulates plausible-looking citations that may not exist. The mechanism is not specific to case citations — it affects any factual claim the model makes — but case citations are particularly dangerous in legal contexts because they are easily discoverable as false, and because submitting false citations to a court violates multiple professional responsibility rules simultaneously.
ChatGPT does not signal uncertainty about fabricated citations. It presents fabricated citations with the same confident, professional prose it uses for real ones. In Mata v. Avianca, Schwartz asked ChatGPT whether the cases it cited were real. ChatGPT confirmed they were real. This is not a feature that can be patched — it is an architectural characteristic of language models that generate statistically plausible text regardless of factual accuracy.
Why RAG Architecture Is Not a Complete Solution
Retrieval-Augmented Generation (RAG) architectures address the hallucination problem by coupling the language model with a retrieval system that fetches relevant text from a verified corpus before generating a response. For legal research, RAG can substantially reduce citation hallucination by grounding responses in actual retrieved case text. Westlaw's AI research features, Lexis+ AI, and purpose-built legal research models use RAG-like architectures to improve citation accuracy.
However, no such retrieval layer was part of the public consumer ChatGPT that Schwartz used. He was working with a general-purpose language model with no connection to Westlaw, Lexis, or any legal database. Every citation the model returned was generated from statistical patterns in training data, not retrieved from verified legal sources. This is the critical architectural distinction between purpose-built legal AI and general-purpose consumer AI applied to legal tasks.
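For contrast, here is a minimal sketch of the retrieval-augmented pattern described above, written against an assumed verified case-law corpus. The names (RetrievedAuthority, VERIFIED_CORPUS, retrieve, draft_answer) are illustrative stand-ins, not any vendor's actual interface. The structural point is that a grounded system can cite only what it actually retrieved, and it can decline to answer when nothing is found.

```python
# Minimal RAG-style sketch. The corpus and functions are hypothetical stand-ins
# for a retrieval layer over a licensed case-law database, not a real API.
from dataclasses import dataclass

@dataclass
class RetrievedAuthority:
    citation: str   # citation exactly as it appears in the verified corpus
    excerpt: str    # text actually retrieved from the reported decision

# Stand-in for a licensed, verified case-law database.
VERIFIED_CORPUS: dict[str, RetrievedAuthority] = {}

def retrieve(query: str) -> list[RetrievedAuthority]:
    """Fetch only real decisions whose indexed key appears in the query."""
    return [auth for key, auth in VERIFIED_CORPUS.items() if key in query.lower()]

def draft_answer(question: str) -> str:
    sources = retrieve(question)
    if not sources:
        # A grounded system can decline; a bare language model cannot,
        # because it has no concept of "nothing was found."
        return "No supporting authority located in the verified corpus."
    # Generation is constrained to retrieved text: every citation in the output
    # must map back to a RetrievedAuthority that was actually fetched.
    cited = "; ".join(s.citation for s in sources)
    return f"Draft grounded in retrieved authority: {cited}"

print(draft_answer("limitations period for Montreal Convention claims"))
```

Consumer ChatGPT, as Schwartz used it, had no equivalent of the retrieve step: everything after the prompt was generation.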
The Data Exposure Problem: What Happens to Client Information in Public ChatGPT
Beyond hallucinations, using public ChatGPT for legal work creates a second category of risk that is more consequential for ongoing practice: confidential client information submitted as prompts is transmitted to OpenAI's servers, where it is subject to OpenAI's data handling practices. These practices have changed over time and vary by subscription tier — but for the free consumer product and for most ChatGPT Plus users, the default data handling creates Rule 1.6 compliance problems that no policy or training can address.
What OpenAI's Terms Actually Say
OpenAI's consumer terms of service have undergone multiple revisions since the Mata v. Avianca incident. As of early 2026, the consumer product terms permit OpenAI to use conversation data to train and improve its models unless the user opts out in account settings. The opt-out is prospective only — it applies to future conversations, not retroactively to prior ones — and it must be re-verified if the user changes devices or accounts.
For the ChatGPT API (which enterprise users access), the terms are more protective: API users' inputs and outputs are not used to train models unless they affirmatively opt in. But API access requires technical integration — an attorney uploading a deposition transcript to ChatGPT.com is using the consumer product, not the API, regardless of their subscription tier.
Training Data Contamination Risk
If a client's privileged settlement position is submitted to consumer ChatGPT and incorporated into training data, that information may influence model outputs accessible to opposing counsel, adverse parties, and anyone else using the model. The contamination is invisible and irreversible.
Third-Party Server Transmission
Every prompt containing client confidential information is transmitted over the public internet to OpenAI's servers. This transmission may constitute unauthorized disclosure under ABA Model Rule 1.6(c), regardless of whether OpenAI's infrastructure is secure.
Vendor Staff Access Possibility
OpenAI employs content reviewers who may access conversations for safety review and quality assurance. The consumer TOS permits this access. An attorney cannot control which conversations are reviewed or by whom.
Privilege Waiver Through Third-Party Disclosure
Courts applying In re Grand Jury (9th Cir. 2023) have found that submitting privileged attorney communications to a consumer AI tool may waive privilege over those communications by introducing a third party into the confidential relationship.
The Samsung Incident as a Warning
In March 2023 — the same period as the events underlying Mata v. Avianca — Samsung Electronics employees uploaded proprietary semiconductor yield data to ChatGPT, triggering a company-wide ban on the tool. The incident received widespread coverage. The concern was that Samsung's confidential manufacturing data had been transmitted to OpenAI's servers and potentially incorporated into training data. Samsung's legal team immediately issued guidance prohibiting ChatGPT use for work involving confidential company information.
Law firms faced exactly the same risk from the moment consumer ChatGPT became available in November 2022. Unlike Samsung, most law firms did not have company-wide policies prohibiting the use of consumer ChatGPT for client matters. Many still do not. The Mata v. Avianca sanction was about hallucinated citations — but the attorneys in that case almost certainly exposed confidential client information to OpenAI's servers in the process, creating a Rule 1.6 violation that no court had occasion to address.
The Cascade of Post-Mata Cases: Pattern Recognition for Law Firms
Mata v. Avianca was not an isolated incident. In the 30 months following Judge Castel's June 2023 sanctions order, at least 15 documented cases have involved attorneys submitting AI-generated briefs or motions containing fabricated citations. The pattern is consistent enough to constitute industry-wide notice that public consumer AI tools have a hallucination problem that is not being corrected by attorneys' workflows.
Park v. Kim (2d Cir. Feb. 2024)
In Park v. Kim, the Second Circuit referred the submitting attorney to the court's Grievance Panel after a brief cited a non-existent, ChatGPT-generated decision. The court noted that Mata v. Avianca had been decided eight months earlier and that the attorney was on constructive notice of the hallucination risk; by early 2024, the novelty of the technology was no longer available as a mitigating factor.
In re Michael Cohen (S.D.N.Y. 2024)
Michael Cohen's attorney submitted filings in Cohen's high-profile supervised release proceeding that cited non-existent cases Cohen had generated with Google Bard, another public consumer AI tool. The citations included fabricated Second Circuit decisions. The court declined to impose sanctions after finding no bad faith, but issued a detailed opinion on attorney verification obligations that closely tracked Judge Castel's framework from Mata.
Gauthier v. Goodyear Tire & Rubber Co. (E.D. Tex. 2024)
An attorney's summary judgment response cited non-existent, AI-generated cases and included fabricated quotations in a wrongful termination suit. The court imposed a monetary sanction and ordered the attorney to complete continuing legal education on the use of generative AI in legal practice.
FRCP Rule 11 and the Verification Obligation
Federal Rule of Civil Procedure 11(b)(2) requires that by presenting a filing to the court, an attorney certifies that "the claims, defenses, and other legal contentions are warranted by existing law or by a nonfrivolous argument for extending, modifying, or reversing existing law." This certification requires the attorney to have verified that cited cases exist and stand for the propositions for which they are cited. An attorney who submits AI-generated citations without independent verification has violated Rule 11(b)(2) even if they did not knowingly submit false citations.
The objective standard of Rule 11 does not allow an "I trusted the AI" defense. Judge Castel was explicit on this point in his Mata opinion: the attorney's obligation to verify is not satisfied by relying on the AI's confidence in its output. The attorney must independently verify citations against authoritative legal databases before submission — not as a supplementary step, but as a precondition to signature on any filing containing AI-assisted research.
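One way to operationalize that precondition is a pre-signature gate: extract every citation from the draft and refuse to clear the filing until each one has been confirmed by an attorney against an authoritative database. The sketch below is illustrative only; the citation pattern is deliberately rough, and confirmed_citations stands in for whatever Westlaw or Lexis verification workflow the firm actually uses.

```python
# Illustrative pre-signature gate for AI-assisted drafts. The regex is a rough
# pattern for federal reporter citations; the confirmation source is a
# placeholder for the firm's actual verification workflow.
import re

CITATION = re.compile(
    r"\b\d{1,4}\s+(?:U\.S\.|S\. Ct\.|F\. Supp\.(?: [23]d)?|F\.(?:[234]d)?)\s+\d{1,5}\b"
)

def extract_citations(draft_text: str) -> set[str]:
    """Pull every citation-shaped reference out of the draft."""
    return set(CITATION.findall(draft_text))

def clear_for_signature(draft_text: str, confirmed_citations: set[str]) -> list[str]:
    """Return the citations that have NOT been independently confirmed.
    An empty list is the precondition for signing under Rule 11(b)(2)."""
    return sorted(extract_citations(draft_text) - confirmed_citations)

# The example citation is one of the fabricated cases from the Mata brief.
draft = "See Varghese v. China S. Airlines, 925 F.3d 1339 (11th Cir. 2019)."
unverified = clear_for_signature(draft, confirmed_citations=set())
if unverified:
    print("DO NOT SIGN. Unverified citations:", unverified)
```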
Public AI Tools for Legal Work: Risk Assessment Checklist
ChatGPT and Public AI Risk Assessment for Legal Work
Every case citation generated by any AI tool — including purpose-built legal AI — must be verified against Westlaw, Lexis, or Fastcase before inclusion in any filing. The verification must confirm that the case exists, the parties are correct, the citation is accurate, and the case stands for the cited proposition. Document this verification by date and attorney name (a minimal record-keeping sketch follows this checklist).
Attorneys must know which tier of AI products they are using. ChatGPT.com (free or Plus subscription) is the consumer product — data handling terms permit training data use. API access and ChatGPT Enterprise have different (and more protective) terms. Verify which product your attorneys are actually using, not which product you think you've licensed.
If attorneys are using consumer ChatGPT under an opt-out arrangement, verify the opt-out is active in account settings before each session. The opt-out does not apply retroactively and is not persistent across all use contexts. Active verification is required for each attorney account.
Establish and enforce a firm policy prohibiting the input of any client confidential information — including client names, matter descriptions, factual summaries, or document excerpts — into public consumer AI tools. Violation of this policy creates Rule 1.6 exposure regardless of whether actual harm results.
Implement a mandatory pre-submission checklist for all filings containing AI-assisted research. The checklist must confirm: citations verified against authoritative source, supervising attorney reviewed the AI output, and AI-assistance disclosed in compliance with applicable court rules and state bar requirements.
Over 30 federal district courts have adopted standing orders requiring disclosure of AI use in court filings. Before any AI-assisted filing, verify the applicable court's standing orders on AI disclosure. Non-disclosure where required is an independent ground for sanctions.
Under ABA Model Rule 5.3, supervising attorneys are responsible for ensuring that non-lawyers — and AI tools treated as non-lawyer assistants — comply with professional responsibility rules. The supervising attorney who signs a filing containing AI-generated research is professionally responsible for that research, regardless of who generated it.
For matters involving particularly sensitive information (trade secrets, M&A strategy, criminal defense strategy), assess the specific risk that client information submitted to consumer AI may be incorporated into training data accessible to adverse parties or their counsel. Consider whether the risk justifies prohibition of any consumer AI use for that matter.
Implement an information barrier that separates client matters from consumer AI tool use. If attorneys want to use ChatGPT for general research (jurisdiction-specific procedural questions, non-client-specific legal analysis), that use must be segregated from any work involving client confidential information.
Establish a protocol for responding when an AI-generated filing error is discovered — whether by the firm, opposing counsel, or the court. The protocol should include: immediate notification to supervising partner, assessment of whether correction can be filed, client notification assessment under Rule 1.4, and documentation for malpractice defense purposes.
Maintain internal records of AI citation accuracy rates by tool and task type. Documented hallucination rates are relevant to both the reasonableness of continued use and to malpractice defense. If a tool consistently produces inaccurate citations and the firm continues using it without additional verification protocols, that continued use may constitute failure of reasonable care.
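The verification-documentation and accuracy-tracking items above reduce to straightforward record-keeping. A minimal sketch follows, with a hypothetical schema that a firm would adapt to its own matter-management system.

```python
# Minimal record-keeping sketch for the verification and accuracy-tracking items
# above. The schema is hypothetical; adapt field names to the firm's own systems.
from dataclasses import dataclass
from datetime import date
from collections import defaultdict

@dataclass
class CitationCheck:
    tool: str             # e.g., "consumer ChatGPT", "purpose-built legal AI"
    task: str             # e.g., "summary judgment research"
    citation: str
    exists: bool          # confirmed in Westlaw/Lexis/Fastcase
    supports_point: bool  # stands for the proposition cited
    checked_by: str       # attorney who performed the verification
    checked_on: date

def hallucination_rate_by_tool(records: list[CitationCheck]) -> dict[str, float]:
    """Share of checked citations per tool that did not survive verification.
    A persistently high rate bears on the reasonableness of continued use
    without stronger verification protocols."""
    totals: defaultdict[str, int] = defaultdict(int)
    failures: defaultdict[str, int] = defaultdict(int)
    for r in records:
        totals[r.tool] += 1
        if not (r.exists and r.supports_point):
            failures[r.tool] += 1
    return {tool: failures[tool] / totals[tool] for tool in totals}
```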
Why Purpose-Built Legal AI Is Architecturally Different
Claire's Architecture: Addressing the Specific Failures Documented in Mata and Wadsworth
The failures in Mata v. Avianca and Wadsworth v. Walmart were not accidents — they were predictable outcomes of using a tool architecturally unsuited to the task. Claire addresses each failure mode through architectural design choices, not policy overlays on an unsuitable product.
No Hallucinated Citations by Design
Claire's legal research capabilities integrate citation verification against primary legal databases before delivering research output. Citations are verified as extant, correctly attributed, and accurately characterized before inclusion in any research memo or brief draft. The Mata v. Avianca failure mode — plausible-looking citations that do not exist — is architecturally prevented. If Claire cannot verify a citation, it says so explicitly rather than inventing one.
Zero Client Data Transmitted to Public AI Infrastructure
When attorneys use Claire for client matters, no client confidential information is transmitted to public AI servers. Claire operates in an isolated deployment within the firm's own infrastructure perimeter. The training data contamination risk — the mechanism by which Samsung's proprietary data could have contaminated OpenAI's training corpus — is architecturally impossible in Claire's deployment model.
Ephemeral Session Processing with Zero Retention
Client information processed in a Claire session is discarded at session termination. There is no persistent session log at Claire's servers, no conversation history database, and no data available to respond to third-party subpoenas. The data pathway that made consumer ChatGPT a Rule 1.6 compliance problem does not exist in Claire's architecture.
Court-Rule Compliant AI Disclosure Formatting
Claire generates AI disclosure language compliant with standing orders in all federal district courts that have issued AI disclosure requirements as of February 2026, including the Northern District of California, Southern District of New York, Eastern District of Texas, and Western District of Washington. Attorneys using Claire for filing preparation receive automated disclosure suggestions that satisfy the applicable court's requirements.
The pattern established by Mata v. Avianca, Wadsworth v. Walmart, Park v. Kim, and the subsequent cascade of AI citation incidents is clear: public consumer AI tools are architecturally unsuited for legal research work that results in court filings. The hallucination problem is not a temporary limitation being engineered away — it is a fundamental characteristic of language models that generate statistically plausible text without grounding in verified sources.
For the privilege waiver dimension of this problem — why submitting client information to public AI tools creates attorney-client privilege exposure — see AI privilege waiver risks. For state bar ethics opinions that now require specific AI due diligence, see bar ethics AI guidelines. For the malpractice insurance implications when AI errors cause client harm, see AI malpractice liability.