Good intentions don’t scale. And neither does AI without strong information architecture.

Good data hygiene matters more now than it used to, yet today's workflows haven't caught up. Many organizations still rely on people not only to assess legal work and provide highly critical legal judgement through a hectic 10-hour workday, but also to go back afterward, classify their documents, and enter critical details into metadata fields. In reality, this just does not happen. That disconnect between expectations and reality will prove to be the difference between AI slop and AI success. AI can only reason about a document if it understands what it actually is.

That understanding doesn't come from the text alone; it comes from the context around it, the description of what a file is, who it involves, and how it connects to everything else. When that context is not connected, the document still exists but AI is searching in the dark. Now imagine what happens when there are thousands of documents collected over decades with no context. It leaves you with years of organizational intelligence your AI tools simply cannot see clearly, regardless of how capable the tools become.

As a Product Manager at iManage, this is the challenge I spend my time on. Organizations have created a wealth of intelligence and organizational IP in their existing content. We know that tagging this content manually is simply not scalable, so the gap between what organizations know and what their AI tools can effectively use continues to widen. The opportunity in front of us is to give the AI tools you already have something solid to work from. It's the argument my colleagues and I made when we first took this work public at ConnectLive, and it's only become truer as AI has moved deeper into everyday workflows.

Why unstructured content quietly breaks AI

The cost of that scale problem is measurable. According to the iManage Knowledge Work Benchmark Report 2026, which surveyed more than 3,000 decision-makers across professional services firms in 26 countries, 86 percent of leaders say they're confident their AI tools are working and that their people can find the knowledge they need. Yet those same professionals spend an average of 37 minutes a day searching for documents. That gap between confidence and reality is where AI quietly underperforms.

This doesn’t happen overnight. It happens quietly and gradually. The context deficit accumulates: searches come up short, work gets duplicated, and institutional knowledge thins out. Who do we blame for this? Really, it’s no one’s fault. Good intentions just don't scale.

The way to fix it is to take the human bottleneck out of the equation. iManage AI Enrichment classifies and enriches content the moment it's saved, populating the document profile with metadata that's immediately usable in search, in Ask iManage, and across connected tools. The principle I keep coming back to is IA before AI: information architecture before artificial intelligence. Get that right, and everything else is built on solid ground.

What richer context looks like in practice

Consider a common request: find every agreement from the last five years governed by EU or UK law. Run that against an unenriched library and you might get 423 results: technically relevant, practically useless at that volume. Run the same request with enrichment in place and it resolves into structured filters for document class, party, date range, and jurisdiction, returning, say, seven reviewed agreements. Same product, same query, and a far shorter, less expensive, and more precise path to the right answer.
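To make the idea of structured filters concrete, here is a minimal sketch in Python. It is an illustration only, not the iManage API: the `DocumentProfile` class, its field names, the `find_agreements` helper, and the sample documents are all hypothetical. It simply shows how a request becomes a set of exact filters once each document carries metadata for class, parties, date, and governing law.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DocumentProfile:
    """Hypothetical enriched profile; field names are illustrative only."""
    title: str
    doc_class: str
    parties: tuple[str, ...]
    effective_date: date
    governing_law: str


def find_agreements(library, jurisdictions, since):
    """Return agreements governed by one of `jurisdictions`, dated on or after `since`."""
    return [
        doc for doc in library
        if doc.doc_class == "Agreement"
        and doc.governing_law in jurisdictions
        and doc.effective_date >= since
    ]


# Made-up sample library for illustration.
library = [
    DocumentProfile("Master Services Agreement", "Agreement",
                    ("Acme Ltd", "Globex GmbH"), date(2023, 3, 1), "UK"),
    DocumentProfile("Supply Agreement", "Agreement",
                    ("Acme Ltd", "Initech Inc"), date(2022, 6, 15), "New York"),
    DocumentProfile("Licence Agreement", "Agreement",
                    ("Acme Ltd", "Umbrella SA"), date(2015, 1, 10), "EU"),
    DocumentProfile("Advice memo on UK law", "Memo",
                    ("Acme Ltd",), date(2024, 2, 2), "UK"),
]

matches = find_agreements(library, {"EU", "UK"}, since=date(2021, 1, 1))
print([doc.title for doc in matches])  # ['Master Services Agreement']
```

Note that the memo mentions UK law in its title, so a plain keyword search would likely surface it. The filter excludes it because its document class is not an agreement, which is the kind of noise that metadata removes.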

The point generalizes well beyond search. When content remains connected with context, agents can find what matters without being hand-fed a list, answers stay grounded in the firm's own knowledge rather than generic patterns, and governance applies automatically every time a document is saved.

The engineering choice behind the accuracy

My colleague Michael Eichsteadt, Vice President of Applied AI at iManage, explains the reasoning behind building the enrichment models small, fine-tuned, and trained in-house on legal text rather than reaching for large, hosted ones. On complex extraction tasks, our internal benchmarking against commercial test data puts large language models in the 80 percent accuracy range; our fine-tuned approach scores north of 90 percent on the same data. Models trained specifically on legal text are also less prone to prompt drift, so the same prompt returns the same result.

Michael also offers a test that any buyer can apply to any AI vendor, iManage included: ask how large the test set was, who labeled the ground truth, what the hardest entity type is, and where inference runs. "If they're not on the table, they're being suppressed," he says of standard accuracy metrics. It's a standard I'd hold us to as readily as anyone else.
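Michael's questions map onto numbers a vendor should be able to produce. The sketch below is a hypothetical illustration, not a description of how iManage benchmarks its models: the records are made up, and `accuracy_by_entity_type` is an assumed helper name. It scores extractions against labeled ground truth, broken out by entity type, so the test set size and the hardest entity type are both visible.

```python
from collections import defaultdict

# Each record: (entity_type, predicted_value, ground_truth_value).
# Made-up data; a real test set would be far larger and labeled by reviewers.
results = [
    ("party", "Acme Ltd", "Acme Ltd"),
    ("party", "Globex", "Globex GmbH"),
    ("governing_law", "UK", "UK"),
    ("governing_law", "EU", "EU"),
    ("effective_date", "2023-03-01", "2023-03-01"),
    ("effective_date", "2022-06-15", "2022-06-15"),
]


def accuracy_by_entity_type(records):
    """Return exact-match accuracy for each entity type."""
    counts = defaultdict(lambda: [0, 0])  # entity_type -> [correct, total]
    for entity_type, predicted, truth in records:
        counts[entity_type][0] += predicted == truth
        counts[entity_type][1] += 1
    return {etype: correct / total for etype, (correct, total) in counts.items()}


scores = accuracy_by_entity_type(results)
hardest = min(scores, key=scores.get)  # lowest-scoring entity type

print(f"test set size: {len(results)}")
print(f"accuracy by type: {scores}")
print(f"hardest entity type: {hardest}")
```

A single headline accuracy figure would average these scores together and hide the weak spot, which is why the per-type breakdown is worth asking for.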

That last question, where inference runs, drives a deliberate decision. All AI Enrichment models run inside iManage infrastructure with zero data egress, processing more than 3 million documents an hour across six global regions.

Built on customer feedback, not in a lab

Kristin McCoy, Senior Product Manager, describes how customer input has reshaped what we build. Firms told us party-rule accuracy wasn't good enough, so we rebuilt the extraction pipeline with legal fine-tuning, and it now scores 24 points ahead of a leading general model on that entity type. Firms told us they couldn't send client documents to third-party APIs, so we moved all inference in-house. "Great AI isn't built in a lab," Kristin says. "It's shaped by the people who use it every day." That's been true at every step of this product.

The enrichment pipeline keeps expanding the same way: from English-language agreements to new jurisdictions, and from agreements into litigation, each step shaped by the firms that use it. We never train our models on customer data; the validation programs simply test how the models perform against a firm's own document types and edge cases, and that's what tells us where to go next.

The order is the whole point. When AI falls short, the model is rarely the reason. More often, it lacks the context it needs. Fix the foundation, and every tool you connect gets better at once. With iManage MCP Server, the open Model Context Protocol connection layer available to iManage Cloud customers, every AI tool you choose draws on that same governed, enriched knowledge. So, before you ask which AI to bring to your firm, get the ground ready first. That's the work that makes your investment in AI deliver real results, and it's where I'd tell any organization to start.