Summary:
Many companies see little impact from AI tools because generic models don’t align with how teams actually work. In this piece, we offer two key concepts that can help bridge that gap. These ideas help explain why traditional AI often falls short and how organizations can design AI systems that adapt to real workflows, real users, and real context.
At a Fortune 500 retail company, leadership provided a team responsible for drafting supplier negotiation contracts with an AI tool meant to streamline their work. The tool was powered by a widely used large language model (LLM), and leadership expected it to speed up the team's work by summarizing documents, answering content questions, comparing contracts, and more.
Despite high expectations, however, the team's output didn't change. Although the tool could generate generic text (for example, a rough draft of a contract), the team then had to customize that text for each supplier, manually incorporating critical details such as supplier information, terms, order history, and other nuances. As such, the tool had minimal impact on reducing the team's workload.
This story reflects a familiar pattern: AI tools failing to live up to their promises. In a recent survey we conducted of 30 companies across industries (including the company with the contracts team above), respondents reported that generic AI tools often failed to help users complete the specific tasks their unique workflows require. Even when AI tools were tailored to broad domains such as finance or HR, they failed to add sufficient value because they still weren't specialized enough: They didn't translate to the specific needs and norms of an organization, team, or process. Users regularly found that AI tools and models "did not work" or "were too generic."
This gap between generic AI capabilities and the specific, evolving needs of teams points to a deeper challenge: Today’s tools aren’t built to understand how work actually gets done. In this piece, we offer two key concepts that can help bridge that gap: work graphs (digital maps of how teams work) and reverse mechanistic localization (tailoring AI models to teams). These ideas help explain why traditional AI often falls short and how organizations can design AI systems that adapt to real workflows, real users, and real context.
Why the AI Tool Wasn’t Creating Value
Consider this problem in the context of a particular process executed by the contracts team. Traditionally, when drafting a new contract for a supplier, the team would:
Log into multiple systems
Retrieve, analyze, and vet supplier details
Review quotes from the supplier and negotiation terms from similar (or related) suppliers
Examine any relevant order history from the supplier
Manually draft a contract incorporating all these elements
When the contracts team first adopted the AI tool, their process began with generating a basic, boilerplate contract. From there, they spent considerable time manually refining it. The AI tool's output was based largely on publicly available content from the internet and lacked the nuanced, context-specific insights their contracts required. While the team found the AI tool interesting, it did little to reduce their effort.
This is an example of AI’s “productivity paradox” in action: incredible technology that, without deep contextual adaptation, fails to translate into tangible productivity improvements—a phenomenon reminiscent of economist Robert Solow’s observation that “you can see the computer age everywhere but in the productivity statistics.”
Powerful AI models excel because they are trained on vast, generic datasets—but their universality is a double-edged sword. While these models can perform a wide range of tasks, they often miss the unique context of specific workflows and team requirements, which leads to missed productivity gains.
Context Is Everything
The AI was failing for the same reason that a new hire might struggle to deliver right away: It didn’t know how the team got its work done, where to find information, or what—exactly—it was meant to do with it. It lacked the proper context to really contribute.
While the company was experimenting with this AI, it was also working on what would become a solution to that context problem by deploying tools to map its processes and create a “work graph”: a real-time, dynamic view of how teams execute workflows across systems. These maps captured more than just tasks—they revealed how decisions were made, what data was referenced, and which systems were involved. This was critical because, although two teams may use the same tools, how they work is often very different.
The data required to draft each contract wasn't centralized or static; it varied by supplier and was scattered across multiple systems. Team members had to locate, interpret, verify, and synthesize this information to build accurate contracts. As they did, their actions (navigating systems, reviewing data, making decisions) were automatically captured in aggregate in the work graph.
And here's where the real opportunity emerged: The work graph, which by then included two months of vetted, context-rich activity, could be used to train the AI tool. Because it captured everything the team deemed important, it gave the tool real-time, human-validated context, enabling it to work in a way that aligned with how the team actually worked. With this input, the AI tool produced a significantly more complete first draft, reducing iterations and accelerating the path to a final, usable contract. This approach cut the team's manual drafting effort per contract by more than half. The team still reviewed and verified the AI-generated output, but it required far fewer iterations and much less rework. As a result, the team's overall throughput in generating contracts increased by nearly 30%.
Making this work, however, required more than the right data; it demanded the right approach. While traditional automation often aims to replace human work, the goal here was to customize AI to work better for teams by reflecting how they operate. To do this, we applied an approach called reverse mechanistic localization (RML). To understand how an AI works, it's common to reverse-engineer it from a human perspective. RML flips that idea on its head: It reverse-engineers how humans work, deeply analyzing a team's real workflows, decisions, and context, and uses that analysis to tailor the AI to better serve the team. It's a model of collaboration, not substitution.
You can think of RML as doing for AI what customization does for software platforms. But while customizing software often means changing what the user sees, RML changes what the AI understands. It’s deeper, more contextual, and ultimately more powerful.
The following steps were involved in implementing RML for the contracts team.
1. Mapping the work graph.
We started by capturing each step and human-machine interaction that members of the contracts team took in doing their work: how they retrieved information about a supplier, how they performed cross-checks, how they manually integrated diverse sources of data into Excel files. This capture included both explicit actions (e.g., reading a supplier profile) and the tacit decision-making patterns that underpin the workflow (e.g., checking whether the supplier has a poor credit rating). This granular, high-fidelity data is the essence of a team's local context: the cues they rely on, the information they deem most important, and how they adapt in different scenarios.
In any organization, this data provides the granular specifics of how a team really works. Feeding it into the AI tool transforms a generic model into a highly specialized tool that understands the local language of work. Leaders should therefore think about investing in gathering this data and using it as a source for further transformations, including providing context to AI models.
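For readers who want a concrete picture, here is a minimal sketch of how captured work-graph events might be represented in code. It assumes a simple event log; the names and fields (WorkEvent, WorkGraph, and so on) are illustrative, not the schema of any particular product.

```python
from dataclasses import dataclass, field


@dataclass
class WorkEvent:
    """One captured human-machine interaction (illustrative schema)."""
    actor: str     # who performed the step, e.g., "analyst_17"
    action: str    # what they did, e.g., "retrieve_supplier_profile"
    system: str    # where it happened, e.g., "erp", "crm", "excel"
    context: dict  # data referenced, e.g., {"supplier_id": "S-204"}


@dataclass
class WorkGraph:
    """Nodes are captured events; edges record which step followed which."""
    events: list = field(default_factory=list)
    edges: list = field(default_factory=list)

    def add_event(self, event: WorkEvent) -> int:
        """Append an event and link it to the preceding step."""
        self.events.append(event)
        if len(self.events) > 1:
            self.edges.append((len(self.events) - 2, len(self.events) - 1))
        return len(self.events) - 1


# Example: two steps of the contract-drafting workflow.
graph = WorkGraph()
graph.add_event(WorkEvent("analyst_17", "retrieve_supplier_profile",
                          "erp", {"supplier_id": "S-204"}))
graph.add_event(WorkEvent("analyst_17", "check_credit_rating",
                          "crm", {"supplier_id": "S-204", "rating": "poor"}))
```

Even this toy structure shows why the capture matters: the edges preserve the order of steps, and the context fields preserve what the team actually looked at, which is exactly what a generic model lacks.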
2. Fine-tuning with the work graph.
Once we mapped the contracts team's work graph, we used its detailed insights as context to fine-tune the model powering the AI tool. That required feeding the model work patterns and data (e.g., supplier information). This is the key step: integrating a team's local context into the AI tool. By incorporating the specific work patterns and contextual cues from the team's daily operations, the AI tool could generate a first draft of the contract that already included supplier details, the nuances around credit ratings, and other specifics, yielding a much more complete draft.
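As a rough sketch of what this data preparation can look like, the snippet below pairs the context a team member consulted with the contract language they ultimately wrote, producing one supervised training example per record. The record fields and the JSONL output format are assumptions for illustration; the actual pipeline would depend on the model provider.

```python
import json


def to_training_example(record: dict) -> dict:
    """Pair the context the team actually consulted with the clause
    they ultimately wrote, forming one fine-tuning example."""
    prompt = (
        "Draft a supplier contract clause.\n"
        f"Supplier: {record['supplier']}\n"
        f"Credit rating: {record['credit_rating']}\n"
        f"Prior terms: {record['prior_terms']}"
    )
    return {"prompt": prompt, "completion": record["final_clause"]}


# One illustrative record distilled from the work graph.
records = [
    {"supplier": "S-204", "credit_rating": "poor",
     "prior_terms": "net-60, 2% volume rebate",
     "final_clause": "Payment due net-30, reflecting elevated credit risk."},
]

# Write examples in the JSONL format many fine-tuning APIs accept.
with open("contracts_finetune.jsonl", "w") as f:
    for record in records:
        f.write(json.dumps(to_training_example(record)) + "\n")
```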
3. Continuous refinement.
Organizations continuously evolve. Processes change, new technologies are brought in, and situations and priorities shift (e.g., teams start working with new suppliers in new geographies). As such, companies need to continuously update the work graph and feed emerging patterns back into the model to keep AI tools up to date.
For instance, the contracts team periodically provides feedback on the quality of the contracts generated using RML (e.g., a contract does not accurately reflect the implications of a supplier's poor credit rating). Incorporating this kind of feedback, a technique known as reinforcement learning from human feedback (RLHF), further fine-tunes and refines the model in the AI tool. As a result, the AI tool continues to adapt to the team's needs, ensuring sustained accuracy over time.
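One simple way to operationalize that feedback loop is to log each correction as a preference pair that a later tuning pass can consume. The sketch below is a minimal illustration under that assumption; the reward-modeling and policy-optimization stages that complete RLHF are omitted.

```python
import json

# One feedback item stored as a preference pair: the model's draft
# ("rejected") versus the reviewer's correction ("chosen").
feedback = {
    "prompt": "Draft payment terms for supplier S-204 (credit rating: poor).",
    "rejected": "Standard net-60 payment terms apply.",
    "chosen": "Net-30 payment terms apply, reflecting the supplier's "
              "credit risk.",
    "reviewer": "analyst_17",
}

# Append to a running log that periodic fine-tuning can draw from.
with open("feedback_log.jsonl", "a") as f:
    f.write(json.dumps(feedback) + "\n")
```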
Enterprise teams operate on tribal knowledge: implicit knowledge of how each team executes work and of the specific solutions to challenges the team encounters in doing it. By excavating this knowledge and fine-tuning models on it, we can produce more accurate, contextualized models that serve teams better.
An exciting use of AI models is to have them act as "agents," i.e., operate autonomously to execute work patterns. The challenge, however, is that agents are likely to suffer from the same universality problem, since they are powered by the same generalized models. For agents to be successful, they need to operate and execute precisely within a team's context. RML is therefore critical: It enables agents to learn from teams and thus serve them more accurately.
What Can CXOs Do?
Generic models, while impressive in breadth, often fail to capture the nuanced local context that drives real efficiency and accuracy. CXOs must recognize that AI is not a “set it and forget it” technology. Instead, its value is unlocked when the system is aligned with the specific work patterns and decision-making processes of the organization.
By investing in a tailored approach, companies can significantly reduce error rates, cut operational costs, and ultimately achieve a much higher ROI from their AI initiatives. In today’s competitive landscape, neglecting to integrate this layer of contextual insight means leaving money—and strategic advantage—on the table.
In short, if your AI strategy relies solely on off-the-shelf solutions, you risk missing out on a transformation that drives true productivity and risk reduction. A complete AI strategy, therefore, must include continuous refinement through localized insights to ensure that technology investments deliver both immediate and long-term value.
Copyright 2025 Harvard Business School Publishing Corporation. Distributed by The New York Times Syndicate.