We’ve all seen headlines about lawyers submitting briefs to the court, only to discover that their filings contain references, quotes and citations that were completely fabricated. This is the current state of using generative artificial intelligence (GenAI) tools like ChatGPT, Claude and others to write entire legal briefs and conduct legal research. It would be one thing if this were the result of inexperienced lawyers looking for shortcuts to lighten a heavy workload, but instead we are seeing it from very experienced lawyers who should know better.
By now, we know that GenAI hallucinates. These tools, which ingest trillions of words from the Internet, are designed to be interactive, confident and helpful. So confident and helpful that they will go out of their way to make sure you are happy with the answers they provide. Sometimes, to meet that goal, they make things up. This is the hallucination problem. It is very real. It is very serious. And it is most likely very temporary.
I talked with three legal technology professionals from UK-based law firm Travers Smith for a recent episode of The Geek in Review podcast. Oliver Bethell, Shawn Curran and Sam Lansley understood both the power and the problems of GenAI as soon as ChatGPT began making headlines in November 2022. They moved in two directions at once. First, they blocked ChatGPT within the firm, preventing anyone from unintentionally sharing sensitive data that OpenAI could use to train its models. At the same time, they began developing ways to take advantage of GenAI’s abilities to assist in the day-to-day work of reviewing documents, drafting letters and suggesting efficient ways to get non-client work done.
To be clear, even today, they do not use GenAI as a solution for client work. It is too risky. For now.
The trio of legal technologists is approaching GenAI within the law firm by combining a secure, internal AI chatbot with an online community where the firm can share ideas, bounce issues off one another and create safety protocols to prevent the sharing of proprietary, confidential, privileged or personally identifiable information. And this is where the team does something that most law firms would not even think of doing: they are sharing the tool with the world.
Oliver Bethell said that his team has “a longstanding history, that we’re really proud of, of open-sourcing technologies.” Bethell branded the AI chatbot YCNBot, with “YCN” short for “Your Company Name.” Within Travers Smith, it is called TSBot. As other firms install the open-source bot in their own environments, they can name it after their own firms. Open-sourcing products is routine in the larger tech community, but it is rare in the legal industry.
While this approach may be designed to “keep honest people honest,” it still does not address the issue of GenAI hallucinations. However, Shawn Curran thinks he may have an eventual answer to that.
Currently, legal information providers are addressing GenAI hallucinations with a technique called Retrieval Augmented Generation (RAG), which grounds the model’s answers in documents retrieved from a trusted source. While this reduces the chances of the GenAI hallucinating, it does not bring that chance down to zero. It also slows results to a painful crawl. Asking basic legal research questions of systems that apply RAG techniques can mean waiting three to five minutes for an answer, and that answer may not be exactly what you thought you requested. Unlike ChatGPT, which lets you refine your initial request (known as a “prompt”) with follow-up questions, these tools force you to start over and wait another three to five minutes for the next result. In an age when we are used to Google returning millions of results in less than a second, legal GenAI research tools can seem almost unusable because of the slowness.
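To make the RAG pattern concrete, here is a minimal sketch in Python. The keyword retriever and the llm_complete() helper are hypothetical stand-ins, not any legal research vendor’s actual API; real systems typically use vector search over embeddings rather than simple term overlap.

```python
# A minimal sketch of Retrieval Augmented Generation (RAG).
# retrieve_passages() and llm_complete() are illustrative stand-ins,
# not any vendor's actual API.

def retrieve_passages(query: str, index: dict[str, str], top_k: int = 3) -> list[str]:
    """Naive keyword retrieval: rank stored passages by term overlap with the query."""
    terms = set(query.lower().split())
    scored = sorted(
        index.items(),
        key=lambda item: len(terms & set(item[1].lower().split())),
        reverse=True,
    )
    return [text for _, text in scored[:top_k]]

def llm_complete(prompt: str) -> str:
    """Stand-in for a real LLM API call; returns a placeholder string."""
    return f"[model response grounded in a prompt of {len(prompt)} characters]"

def answer_with_rag(query: str, index: dict[str, str]) -> str:
    """Ground the model in retrieved passages instead of letting it free-associate."""
    passages = retrieve_passages(query, index)
    prompt = (
        "Answer ONLY from the passages below. If the answer is not there, say so.\n\n"
        + "\n\n".join(passages)
        + f"\n\nQuestion: {query}"
    )
    return llm_complete(prompt)
```

The retrieval step and the extra-long grounded prompt are also where much of the latency described above comes from: every question triggers a fresh search plus a large model call.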
Curran thinks he may have an answer that addresses not only the hallucination issue but perhaps also the slowness and interactivity problems that plague current GenAI legal tools. He calls the idea multi-tokenization.
To oversimplify the concept, tokens are the units into which a large language model (LLM) breaks the text it is trained on. While the training data may contain trillions of words scraped from the Internet, that massive amount of text is split into tokens, and through this structure the AI learns to probabilistically guess the next token in its response. It is what gives GenAI both its creativity and its ability to hallucinate. Curran, Lansley and Bethell wrote about the problem with this structure in their recent paper, “Hallucination Is the Last Thing You Need.”
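As a toy illustration of next-token prediction, the word-level bigram model below simply guesses the most frequent continuation seen in a tiny corpus. Real LLMs use subword tokenizers and neural networks, so this is a deliberate simplification of the idea, not how any production model works.

```python
# Toy illustration of tokenization and next-token prediction.
# Real LLMs use subword tokenizers (e.g., byte-pair encoding) and neural
# networks; this word-level bigram counter only shows the underlying idea.
from collections import Counter, defaultdict

corpus = "the court held that the court may dismiss the claim".split()

# Count how often each token follows each other token.
next_counts: defaultdict[str, Counter] = defaultdict(Counter)
for current, following in zip(corpus, corpus[1:]):
    next_counts[current][following] += 1

def predict_next(token: str) -> str:
    """Return the most probable next token observed in the corpus."""
    candidates = next_counts.get(token)
    return candidates.most_common(1)[0][0] if candidates else "<unk>"

print(predict_next("the"))  # -> "court" (the most frequent continuation)
```

Because the model always picks a *probable* continuation rather than a *verified* one, a fluent but invented quote is a perfectly natural output, which is the hallucination problem in miniature.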
In their research, they ran multiple tests of how well GenAI tools could return correct quotes from common law cases. The sobering result was that only about 1 in 20 responses contained the correct quotes from the source material. For legal research, a 5 percent accuracy rate is simply unacceptable.
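The paper’s exact methodology is not reproduced here, but a quote-fidelity test of this kind can be scored with something as simple as a verbatim-match check; the harness below is purely hypothetical.

```python
# Hypothetical harness for scoring quote fidelity: a generated quote counts
# as correct only if it appears word-for-word in the source judgment.
# This is an illustration, not the Travers Smith paper's actual methodology.

def _normalize(text: str) -> str:
    """Collapse runs of whitespace so line breaks do not spoil a match."""
    return " ".join(text.split())

def quote_is_verbatim(generated_quote: str, source_text: str) -> bool:
    """True only if the model's quote appears verbatim in the source."""
    return _normalize(generated_quote) in _normalize(source_text)

def accuracy(results: list[tuple[str, str]]) -> float:
    """results holds (generated_quote, source_text) pairs; returns the fraction verbatim."""
    if not results:
        return 0.0
    hits = sum(quote_is_verbatim(quote, source) for quote, source in results)
    return hits / len(results)
```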
This is where Curran came up with an idea: Instead of breaking the source material into tokens, where the AI produces what it predicts to be the next word, the source material should be broken into multi-tokens, where known quotes or sections of text are kept intact. Quotes or citations that appear over and over again in legal texts would be multi-tokenized. The result is a multi-token-trained AI that produces far more accurate output than its conventionally tokenized counterpart.
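The paper describes multi-tokenization at a high level; the sketch below is one simplified reading of it, in which frequently cited passages are registered as single, indivisible tokens so they are reproduced whole rather than regenerated word by word. The registry and the token IDs are invented for illustration.

```python
# One simplified reading of multi-tokenization: register frequently cited
# passages as single, indivisible tokens so the model emits them whole.
# KNOWN_QUOTES and the token IDs are invented for illustration.

KNOWN_QUOTES = {
    "<QUOTE_001>": "the neighbour principle requires reasonable care",
}

def multi_tokenize(text: str) -> list[str]:
    """Replace known passages with a single multi-token before word-level splitting."""
    for token_id, passage in KNOWN_QUOTES.items():
        text = text.replace(passage, token_id)
    return text.split()

def detokenize(tokens: list[str]) -> str:
    """Expand each multi-token back into its exact source passage."""
    return " ".join(KNOWN_QUOTES.get(token, token) for token in tokens)

tokens = multi_tokenize("as held, the neighbour principle requires reasonable care applies")
print(tokens)              # ['as', 'held,', '<QUOTE_001>', 'applies']
print(detokenize(tokens))  # the quoted passage comes back verbatim, by construction
```

Because the quoted passage is a single unit, the model can no longer drift mid-quote, which is where the accuracy gain would come from.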
This may not make the GenAI hallucination problem go away completely, but it is potentially a big step in the right direction. It would also speed up results and allow a more chat-like experience for users.
While multi-tokenized GenAI tools are still at the theoretical stage, we have all experienced the staggering speed of change in GenAI tools over the past year. Whether through multi-tokenization or some other advancement, the hallucination problem will eventually be reduced to an acceptable risk for the legal industry. Just as it did with email, databases, Google and the cloud, the legal industry will adopt and adapt to new technologies that become commonplace in all industries. Unlike with those previous changes, however, the legal industry will not have years or decades to contemplate how to implement GenAI.
Now is the time to look at how teams like the one at Travers Smith are approaching the inevitable disruption that GenAI will bring to the legal industry. Be aware of what is going on with GenAI. Protect your information as well as your clients’. Create teams or collaborate with peers to share best practices. And work with reputable legal information providers on how to safely and securely test GenAI features.
While there are serious problems and risks associated with today’s GenAI in the legal industry, those risks will quickly fade and be replaced by the risk of being left behind as others adopt the advancements that GenAI tools bring to the practice of law.
Greg Lambert is the chief knowledge services officer at Jackson Walker, a founder of “3 Geeks and a Law Blog” and the host of the podcast “The Geek in Review.”