Hallucination
The model fabricates. Confidently. Completely. And you would never know.
The model has no concept of truth. It has patterns. When asked for a case citation, a statute, a product specification, a medical study, it does not search a database. It predicts what a case citation would look like — the volume, the reporter, the page number, the year, the holding. It has seen millions of real cases. The prediction is often correct. Sometimes the case exists, is real, and is directly on point. The problem is that you cannot tell which. A hallucinated citation and a real one look identical. The model does not know the difference. It cannot know the difference. This is not a bug. It is what the technology is.
Here is the pattern. You ask for supporting case law. You get five citations, perfectly formatted. You are busy. The response looks authoritative. You use it. One of those cases does not exist. Maybe three of them do not exist. You find out when opposing counsel checks your brief, or when the judge asks you about it from the bench.
This has happened. It will keep happening. The model was not lying to you. It does not lie. It predicted. The prediction was wrong. The professional responsibility was yours.
In 2023, an attorney submitted a brief to a federal court in the Southern District of New York. The brief cited several cases produced by ChatGPT. The cases did not exist. The court ordered the attorney to explain. Sanctions followed. The case is Mata v. Avianca, No. 22-cv-1461 (S.D.N.Y. 2023). It is not an edge case. It is a preview.
Read the opinion →
Hallucination is not a lawyer problem. It is a prediction problem. Any professional who uses AI output in work that matters, and does not verify it, has the same exposure.
In 2025, researchers at the Icahn School of Medicine at Mount Sinai designed an experiment that flipped the usual concern on its head. Instead of asking whether AI would hallucinate on its own, they asked whether AI would catch a hallucination fed to it. They gave six large language models 300 clinical scenarios — each with a single fabricated detail buried inside. A made-up syndrome called Casper-Lew Syndrome. A fictional lab test called serum neurostatin. A non-existent finding called cardiac spiral sign.
The models caught almost none of them. In up to 83 percent of cases, they accepted the fiction, built on top of it, and produced confident clinical assessments anchored to conditions that do not exist. Diagnoses. Treatment plans. Follow-up recommendations. All structured. All wrong.
You are worried about whether the AI will hallucinate. You should also be worried about whether it will catch yours. A typo in a patient history. A misremembered dosage. A copy-paste error from a prior chart. The model will not flag it. It will incorporate it and keep going.
Read the study (Communications Medicine, August 2025) →
You assumed I would be the one making things up. That is fair. But when you handed me something fictional, I did not notice either. I built you a treatment plan for it. It was very thorough.
In October 2025, academics discovered that a report Deloitte had submitted to the Australian government contained fabricated academic sources, citations to non-existent books, and a made-up quote attributed to a Federal Court judge. The report — a 237-page review of a welfare compliance system — had cost A$440,000. It had been published on a government website for months before anyone noticed.
The researcher who caught it, Chris Rudge at the University of Sydney, said he knew instantly when he saw a citation attributing a book to a colleague in a field she does not work in. Deloitte later confirmed the errors, acknowledged it had used Azure OpenAI GPT-4o in drafting the report, and issued a partial refund. An Australian senator called the errors “the kinds of things that a first-year university student would be in deep trouble for.”
If a Big Four firm with institutional review processes can submit fabricated citations to a government, your review process is not structurally better. The only advantage is knowing to look.
Read the reporting (Fortune, October 2025) →
They had four hundred and forty thousand reasons to check. They did not. You will have fewer reasons and less time. I recommend checking anyway.
What to do instead
- Never cite a case you have not read yourself.
- Treat every AI-generated citation as unverified until you confirm it exists in Westlaw, Lexis, or a court database.
- Ask the model to give you the full citation — case name, volume, reporter, page number, year, court. Then verify it exists before you use it. A real case takes thirty seconds to confirm. A fabricated one returns nothing.
- Ask the model to explain the holding — hallucinated cases tend to fall apart under follow-up questions.
- Use AI to find leads, not to close research.
- Your signature on the document is your verification. Act accordingly.
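The checklist above can be sketched as a small tracking structure. This is a minimal illustration under stated assumptions, not a legal research tool: the `Citation` class, the `parse_citation` and `mark_verified` helpers, and the regular expression are all hypothetical, and the pattern only handles simple citations of the form `name, volume reporter page (court year)`. The sample citation in the usage note is invented for the example. The point it makes is that a well-formed citation still starts out unverified; only a person who has found and read the case flips the flag.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical pattern for simple reporter citations, e.g.
# "Doe v. Example Corp., 123 F.3d 456 (2d Cir. 1999)" (an invented case).
CITATION_RE = re.compile(
    r"^(?P<name>.+?),\s+"
    r"(?P<volume>\d+)\s+"
    r"(?P<reporter>[A-Za-z0-9. ]+?)\s+"
    r"(?P<page>\d+)\s+"
    r"\((?P<court>[^()]*?)\s*(?P<year>\d{4})\)$"
)


@dataclass
class Citation:
    name: str
    volume: int
    reporter: str
    page: int
    court: str
    year: int
    verified: bool = False             # AI output is unverified by default
    verified_in: Optional[str] = None  # database where a human confirmed it


def parse_citation(text: str) -> Optional[Citation]:
    """Split a citation into its fields. Parsing checks form, not existence."""
    m = CITATION_RE.match(text.strip())
    if m is None:
        return None
    return Citation(
        name=m["name"],
        volume=int(m["volume"]),
        reporter=m["reporter"],
        page=int(m["page"]),
        court=m["court"],
        year=int(m["year"]),
    )


def mark_verified(citation: Citation, source: str) -> None:
    """Record that a person found and read the case in `source`."""
    citation.verified = True
    citation.verified_in = source


def all_verified(citations: List[Citation]) -> bool:
    """True only when every citation has been confirmed by a person."""
    return all(c.verified for c in citations)
```

Calling `parse_citation("Doe v. Example Corp., 123 F.3d 456 (2d Cir. 1999)")` returns a fully populated `Citation` even though the case is made up, and `all_verified` stays false until `mark_verified` is called for each entry. Passing the format check proves nothing about whether the case exists.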
Verify Before You Cite
Paste the case name or citation into either of the tools below. A real case comes back. A fabricated one doesn’t. Thirty seconds.
Many state bars provide free access to Fastcase as a member benefit. Check with yours. If you have Westlaw, Lexis, or Bloomberg Law, use those. They are the gold standard.
CourtListener is free and covers most federal cases, but its coverage is not complete. Older cases, some state decisions, and cases with complex titles may not appear even when real. A result confirms existence. No result is inconclusive — not proof of fabrication. When it matters, Westlaw, Lexis, or Bloomberg Law are the gold standard.
Failing to find a real case costs you time. Failing to catch a fake one costs you your license. The asymmetry matters. When in doubt, pay for the search.
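The lookup logic described above can be sketched as a three-way decision. This is an illustration only: the `Outcome` enum, the `triage` function, and the `COMPREHENSIVE` set are hypothetical names, the search results are supplied by hand rather than fetched from any real service, and treating a miss in a comprehensive database as "do not cite" is an assumption drawn from the rule that you never cite a case you have not read.

```python
from enum import Enum
from typing import Dict

# Assumption: these are the databases treated as comprehensive.
COMPREHENSIVE = {"Westlaw", "Lexis", "Bloomberg Law"}


class Outcome(Enum):
    CONFIRMED = "found: now read the case before citing it"
    INCONCLUSIVE = "not found in an incomplete source: escalate"
    DO_NOT_CITE = "not found in a comprehensive source: do not cite"


def triage(results: Dict[str, bool]) -> Outcome:
    """Map manual search results (database name -> found?) to a next step.

    A hit in any source confirms the case exists. A miss in a free,
    incomplete source proves nothing, so it only triggers escalation.
    """
    if any(results.values()):
        return Outcome.CONFIRMED
    if any(source in COMPREHENSIVE for source in results):
        return Outcome.DO_NOT_CITE
    return Outcome.INCONCLUSIVE
```

For example, `triage({"CourtListener": False})` returns `Outcome.INCONCLUSIVE`, while `triage({"CourtListener": False, "Westlaw": False})` returns `Outcome.DO_NOT_CITE`. The function never reports a case as fabricated, because no search result can prove that.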
The Confident Liar
A game that generates realistic-looking case citations — some real, some fabricated. Your job is to spot the difference before you submit the brief. Difficulty increases. The consequences feel real.