We recently ran a survey of legal professionals to test something we’d been hearing anecdotally from our customers for a while: evaluating whether AI tools will actually deliver on their promises is genuinely difficult. The results confirmed it. Three out of four general counsel and legal leaders we surveyed agree that assessing the performance of legal AI tools is very challenging, and over half of respondents have been asked to do exactly that.
For legal teams already stretched thin, this is an obstacle to making good technology decisions.
So what makes evaluating AI tools so hard? And what should legal leaders actually be looking for?
Most respondents to the survey noted that their frustrations with evaluating AI vendors fell into three major areas:
Vendors promise too much: One respondent told us, “Many companies overstate the AI capabilities. The ideas are there and they may be starting down the road to development, but the reality is not there.” We also heard that many vendors show polished demos but haven’t really dug into the actual use cases legal teams need. “With most companies, you really need a proof of concept to attempt to actually evaluate their product,” one respondent said. “Their usefulness in a demo or on a website just doesn't show how they would work for your use case.”
Every vendor sounds the same: If every vendor says they can do the same thing, how can anyone differentiate one from another? One survey respondent said, “The accuracy of AI is hard to define. Results vary dramatically based on prompt quality, document structure, data cleanliness, and user expertise. And, after a while, I get AI vendor merge where they all seem to offer the same software functions.”
Verifying accuracy is difficult: Over a fifth of respondents mentioned this. Lawyers, rightly, are very worried about accuracy and hallucinations, and don’t want to do the manual work of checking and cross-checking AI output. As one senior counsel put it: “Sometimes these products do not include the right information when trying to really narrow down a specific law or case. Sometimes I've found fake cases.”
We know that attorneys are under pressure. In-house legal teams are leaner than ever and contract volumes are growing. Our customers tell us that increasingly, their leadership expects AI to be part of the solution. But that means that the stakes of adopting the wrong tool are very high.
The data reflects this tension. According to the ABA's 2024 Legal Technology Survey, 74.7% of attorneys identified accuracy as their top concern with AI implementation. And a Paragon Legal study found that over a third of legal professionals have relied on AI-generated outputs they don't fully trust.
Choosing the wrong AI legal tool isn’t just a waste of budget. In the worst-case scenario, it could introduce real legal risk. And when something goes wrong, and the person who championed the tool also has to explain the errors, the stakes become personal, not just professional.
No wonder lawyers are reacting strongly to a crowded AI legal tech market full of vendors making claims that may or may not be relevant to real-life use cases. The cost of failure is very high.
In a market where every vendor claims to have “AI,” the true differentiators are concrete: the most capable tools in this space collapse implementation timelines, surface patterns across an entire contract base, and keep playbooks automated and evergreen.
Quora’s approach to the problem shows what this looks like in practice; it was both comprehensive and well-suited to their particular needs. They identified seven criteria that mattered to them as they considered how an AI tool would fit into their workflow: everything from the UI, to AI features, to customer support capabilities, to security.
Adrie Christensen, Legal Operations Lead at Quora, noted that the process involved defining clear success criteria with her general counsel, which they organized into a detailed scorecard for consistent vendor evaluation.
This evaluation framework gave them a shared baseline for what was important to them as a business, and the clarity and specificity to adopt technology that served their needs and integrated into their existing ways of working.
For a starting point in developing your own framework, take a look at our whitepaper, The State of Legal AI: How to Futureproof your Tech Stack. It contains a simple decision-making framework for you to use and customize when choosing AI solutions, as well as a checklist of the capabilities you should expect from your AI tool.
Schedule a demo today.
