The 90-Day Agentic Readiness Sprint (and Why Your SI Can't Do It)
Why your SI can't move at the speed this moment demands.
Let me describe a scenario that’s playing out right now in manufacturing enterprises across the globe.
The board has seen the Gartner projections. 40% of enterprise applications embedding AI agents by end of 2026. The CEO comes back from a conference where every keynote featured autonomous procurement agents, predictive maintenance systems that self-heal, and quality agents that catch defects before they propagate. The mandate lands on the CIO’s desk: show agentic AI progress by Q3.
The CIO calls their SI — the firm that’s handled their last three digital transformation programmes. The SI sends a team. Discovery takes six weeks. Scoping takes four more. The statement of work goes through procurement, legal review, MSA negotiations. Twelve weeks in, and no one has looked at the actual data.
When the engagement finally starts, the SI staffs it from a bench of generalists who’ve done ERP implementations but haven’t spent time inside MRO data catalogs or manufacturing event architectures. They produce an assessment: 200 slides, a maturity matrix, a roadmap with 18-month timelines, and a recommendation to “establish an AI Center of Excellence.”
The CIO now has a document. What they don’t have is any running code, any proof that their data can support an agent, or any clarity on which of their 12 production systems is the actual bottleneck.
Sixteen weeks gone. No working proof point. And the board wants progress by Q3.
I’ve watched this pattern repeat enough times to know it’s not a people problem — it’s a structural one. The SI model was designed for large, multi-year transformation programmes. Agentic readiness is a different kind of problem. It’s bounded, it’s diagnostic, and it rewards speed and domain depth over scale and process.
Why the traditional model stalls
Three structural characteristics of the SI delivery model create friction with the pace agentic readiness demands.
The staffing model is built for scale, not depth. Large SIs staff engagements from resource pools optimised for utilisation rates. The consultant who shows up to assess your SAP master data quality may have done three SAP implementations — but they may never have spent time inside a manufacturing parts catalog understanding why “BRG HSG ASSY 6205-2RS” and “BEARING HOUSING, DEEP GROOVE, 25MM” are the same physical object. Domain depth in data governance isn’t something you can staff from a bench. It’s earned over years of working inside specific data environments. When the assessment team lacks this depth, they produce findings that are technically correct but operationally useless — “data quality needs improvement” without the specificity to tell you which material groups have the highest duplication rates, which vendor master records are creating cross-reference failures, or which classification gaps will break an agent’s ability to reason about part interchangeability.
The delivery timeline is gated by process, not complexity. The actual diagnostic work — assessing API readiness, evaluating event architecture maturity, profiling master data quality — can be scoped and executed in weeks. What stretches it to months is the SI’s internal machinery: engagement governance, risk review, staffing allocation, status cadences, steering committees. Each of these exists for good reason in a multi-year programme. In a bounded diagnostic engagement, they’re overhead that consumes the calendar without advancing the answer.
The deliverable is a document, not a proof. The traditional SI assessment produces a report. Findings, recommendations, a roadmap, a business case. What it doesn’t produce is evidence that the recommended architecture actually works against the client’s real data. A proof of concept — a working agent that queries governed data through a properly built integration layer and returns a correct result — answers more questions in 30 minutes of demonstration than 200 slides of recommendations.
None of this is a criticism of the people inside SIs. Many are excellent engineers and consultants. The constraint is the model — the way engagements are structured, staffed, priced, and governed. That model works well for what it was designed for. It wasn’t designed for this.
What a 90-day sprint looks like
The alternative is a bounded, engineering-led engagement designed to answer one question: is your enterprise ready for agentic AI, and what specifically needs to change?
Here’s how we think about it, structured across the three layers from the previous articles.
Weeks 1-3: The diagnostic.
This is assessment, but with a critical difference — it’s data-first, not interview-first. Instead of spending three weeks in stakeholder workshops mapping processes on whiteboards, you spend them inside the actual systems.
Profile the master data. How many active material master records? What’s the duplication rate? Which material groups have the most inconsistency? Where are the cross-reference gaps between material, vendor, and equipment masters? What classification standards are in use, and how consistently are they applied?
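In practice, this first pass is a script, not a workshop. Here’s a minimal sketch of the kind of profiling run we mean, assuming a flat extract of the material master; the column names and the duplicate-key heuristic are illustrative, and the real logic would come from the client’s own data dictionary.

```python
# Sketch of a first-pass master data profile. Column names (MATERIAL,
# DESCRIPTION, MATERIAL_GROUP, VENDOR_REF) are illustrative assumptions,
# not a real SAP table layout.
import re
import pandas as pd

df = pd.read_csv("material_master_extract.csv", dtype=str)

def normalise(desc: str) -> str:
    """Crude description normalisation so near-duplicates collide on one key."""
    desc = re.sub(r"[^A-Z0-9 ]", " ", str(desc).upper())
    return " ".join(sorted(desc.split()))

df["dup_key"] = df["DESCRIPTION"].map(normalise)

# Duplication rate per material group: records sharing a normalised key
# beyond the first are counted as duplicates.
dup_rate = (
    df.groupby("MATERIAL_GROUP")["dup_key"]
      .agg(lambda keys: 1 - keys.nunique() / len(keys))
      .sort_values(ascending=False)
)
print(dup_rate.head(10))

# Cross-reference gaps: materials with no linked vendor record.
missing_vendor = df["VENDOR_REF"].isna().mean()
print(f"Materials with no vendor cross-reference: {missing_vendor:.1%}")
```

Even a rough pass like this turns “data quality needs improvement” into numbers you can rank and act on.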
Map the integration surface. Which systems does an agent workflow need to touch? What are the real API characteristics of each — not the documentation, the production behaviour? What’s the actual latency under load? Where are the batch-only systems with no real-time interface?
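The probe itself can be small. A sketch like the one below, with a placeholder endpoint and token, measures what the documentation won’t tell you: real latency percentiles and failure rates under repeated calls.

```python
# Minimal probe for characterising an integration endpoint's real behaviour.
# The URL and auth header are placeholders; point it at a read-only endpoint
# (e.g. an SAP OData entity set or a Maximo REST resource) in an agreed window.
import statistics
import time
import requests

ENDPOINT = "https://erp.example.internal/odata/MaterialStock?$top=50"  # hypothetical
HEADERS = {"Authorization": "Bearer <token>"}                          # hypothetical

latencies, failures = [], 0
for _ in range(50):
    start = time.perf_counter()
    try:
        resp = requests.get(ENDPOINT, headers=HEADERS, timeout=10)
        resp.raise_for_status()
        latencies.append(time.perf_counter() - start)
    except requests.RequestException:
        failures += 1
    time.sleep(0.5)  # be polite to a production system

if len(latencies) > 1:
    print(f"p50: {statistics.median(latencies) * 1000:.0f} ms")
    print(f"p95: {statistics.quantiles(latencies, n=20)[18] * 1000:.0f} ms")
print(f"failures: {failures}/50")
```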
Assess the event architecture. What business events are currently captured? In what format? Where do they live? What’s queryable, and what’s buried in unstructured application logs? What’s the gap between what exists and what an agent would need to reason effectively?
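A rough but useful gauge is to sample the existing log exports and count how many lines already carry the minimum structure an agent could query. The file layout and the required fields in this sketch are assumptions for illustration.

```python
# Rough gauge of how much of the current event trail is actually queryable.
# Assumes a directory of exported application log files; the JSON-vs-free-text
# split stands in for "structured business event" vs "unstructured log line".
import json
from pathlib import Path

structured, unstructured = 0, 0
for log_file in Path("log_exports").glob("*.log"):
    for line in log_file.read_text(errors="ignore").splitlines():
        try:
            event = json.loads(line)
            # Count it as a usable business event only if it carries the
            # minimum an agent needs: what happened, to what, and when.
            if {"event_type", "entity_id", "timestamp"} <= event.keys():
                structured += 1
            else:
                unstructured += 1
        except (json.JSONDecodeError, AttributeError):
            unstructured += 1

total = structured + unstructured
print(f"queryable business events: {structured}/{total}"
      f" ({structured / max(total, 1):.1%})")
```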
At the end of three weeks, you have a diagnostic that’s grounded in data, not opinion. Not “data quality needs improvement” — but “your pump components material group has a 34% duplication rate, your valve catalog has inconsistent UOM entries affecting 12% of records, and your vendor master has 847 active vendors with no material cross-references.”
Weeks 4-8: The architecture.
Based on the diagnostic, design the target state for each layer. This isn’t a theoretical architecture deck — it’s a solution architecture specific to the client’s systems, data landscape, and agent use cases.
For Layer 1: design the integration surface. Which systems get event bridges? Where does the typed integration layer sit? What’s the authentication and error-handling pattern? How do you handle the systems that are batch-only — do you build a polling adapter, a change-data-capture stream, or a lightweight middleware layer?
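To make “typed integration layer” concrete, here’s a minimal sketch of the shape we aim for: the agent never touches raw ERP payloads, only validated objects with known fields. The endpoint, field names, and auth scheme are illustrative, not a real SAP or Maximo contract.

```python
# Sketch of a typed integration client. Route, parameters, and payload keys
# are hypothetical; the pattern is what matters: validate once at the edge,
# hand the agent a known-good object.
from dataclasses import dataclass
import requests

@dataclass(frozen=True)
class StockPosition:
    material_id: str
    plant: str
    quantity_on_hand: float
    unit_of_measure: str

class ERPStockClient:
    def __init__(self, base_url: str, token: str, timeout: float = 5.0):
        self.base_url = base_url
        self.session = requests.Session()
        self.session.headers["Authorization"] = f"Bearer {token}"
        self.timeout = timeout

    def stock_for(self, material_id: str, plant: str) -> StockPosition:
        resp = self.session.get(
            f"{self.base_url}/stock",                        # hypothetical route
            params={"material": material_id, "plant": plant},
            timeout=self.timeout,
        )
        resp.raise_for_status()                              # surface ERP errors early
        payload = resp.json()
        # Validation happens here, once, instead of inside every agent prompt.
        return StockPosition(
            material_id=payload["material"],
            plant=payload["plant"],
            quantity_on_hand=float(payload["qty"]),
            unit_of_measure=payload["uom"],
        )
```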
For Layer 2: design the event schema and the two-tier storage architecture. What domains get captured first? What’s the event structure — the intent/context/outcome pattern from Article 2? Where does the operational store live, and how does it stream to the analytical layer? What’s the retention and query pattern for each tier?
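One way to express the intent/context/outcome pattern is as a small, explicit event type. The field names below are illustrative; the point is that every business event records what was attempted, under which conditions, and what the system actually did.

```python
# Illustrative event shape for the intent/context/outcome pattern.
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
from typing import Any
import json

@dataclass
class BusinessEvent:
    event_type: str                      # e.g. "purchase_requisition_created"
    actor: str                           # user, system, or agent that acted
    intent: dict[str, Any]               # what was being attempted and why
    context: dict[str, Any]              # state of the world at decision time
    outcome: dict[str, Any]              # what the system actually recorded
    occurred_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

event = BusinessEvent(
    event_type="purchase_requisition_created",
    actor="planner:jsmith",
    intent={"goal": "replenish_safety_stock", "material": "PUMP-SEAL-0042"},
    context={"on_hand": 3, "reorder_point": 10, "open_pos": 0},
    outcome={"requisition_id": "PR-88123", "quantity": 20, "status": "approved"},
)

# Serialised once, the same record feeds both the operational store
# (recent, low-latency lookups) and the analytical tier (history at scale).
print(json.dumps(asdict(event), indent=2))
```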
For Layer 3: design the governance pipeline. Where does the governance layer sit between requestors and the ERP? What’s the duplicate detection logic for the highest-impact material groups? What classification dictionaries are needed? Where does AI-assisted classification add value, and where does it need human-in-the-loop validation?
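The duplicate-detection step, for example, can start as a normalisation pass plus fuzzy matching against the governed catalog, tuned per material group. The abbreviation map and threshold in this sketch are placeholders for what would come out of the client’s own classification dictionaries.

```python
# Minimal sketch of a duplicate check sitting between a requestor and the ERP:
# a new material description is compared against the governed catalog before a
# create request is allowed through. Abbreviation map and threshold are assumptions.
import re
from difflib import SequenceMatcher

ABBREVIATIONS = {"BRG": "BEARING", "HSG": "HOUSING", "ASSY": "ASSEMBLY"}

def normalise(description: str) -> str:
    tokens = re.sub(r"[^A-Z0-9 ]", " ", description.upper()).split()
    return " ".join(sorted(ABBREVIATIONS.get(t, t) for t in tokens))

def find_duplicates(new_desc: str, catalog: dict[str, str], threshold: float = 0.75):
    """Return existing material numbers whose descriptions resemble new_desc."""
    target = normalise(new_desc)
    hits = []
    for material_id, existing_desc in catalog.items():
        score = SequenceMatcher(None, target, normalise(existing_desc)).ratio()
        if score >= threshold:
            hits.append((material_id, score))
    return sorted(hits, key=lambda h: -h[1])

catalog = {"MAT-10417": "BRG HSG ASSY 6205-2RS"}
print(find_duplicates("BEARING HOUSING ASSEMBLY, DEEP GROOVE, 6205-2RS", catalog))
```

Anything above the threshold goes to a human reviewer rather than straight into the ERP; that is where the human-in-the-loop validation earns its keep.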
The output is a solution architecture document that an engineering team can build from. Not a 200-slide recommendation deck — a technical specification with data models, API contracts, event schemas, and governance rules.
Weeks 9-12: The proof of concept.
This is the part the SI model almost never reaches within the first engagement. Build a working PoC that demonstrates the architecture against real data.
Pick one agent use case — procurement is usually the best starting point because it touches all three layers (API to ERP for stock checks and PO creation, event stream for purchase history and vendor performance, governed master data for part identification and vendor cross-referencing). Build a thin slice: the integration layer for two or three systems, a starter event store with three months of historical data, and a governance pipeline for one material group.
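The thin slice doesn’t need to be elaborate to be convincing. Stripped to its shape, the procurement check looks something like the sketch below; the three lookups are stubbed here, but in the PoC they hit the typed integration layer, the event store, and the governed master data built in weeks 4 through 8.

```python
# Illustrative shape of the PoC's thin slice: one procurement question answered
# by touching all three layers. The lookups are stubs; the reorder heuristic is
# a placeholder for whatever policy the client actually runs.
from dataclasses import dataclass

@dataclass
class Recommendation:
    material_id: str
    action: str
    rationale: str

def check_stock(material_id: str) -> int:
    return 3          # stub: Layer 1 call through the typed integration layer

def recent_consumption(material_id: str) -> list[int]:
    return [4, 6, 5]  # stub: Layer 2 query over monthly consumption events

def reorder_point(material_id: str) -> int:
    return 10         # stub: Layer 3 governed master data attribute

def recommend(material_id: str) -> Recommendation:
    on_hand = check_stock(material_id)
    monthly = recent_consumption(material_id)
    rop = reorder_point(material_id)
    if on_hand < rop:
        qty = max(sum(monthly) // len(monthly) * 2, rop - on_hand)
        return Recommendation(
            material_id,
            action=f"raise purchase requisition for {qty} units",
            rationale=(f"on-hand {on_hand} below reorder point {rop}; "
                       f"avg monthly consumption {sum(monthly) / len(monthly):.1f}"),
        )
    return Recommendation(material_id, "no action",
                          f"on-hand {on_hand} >= reorder point {rop}")

print(recommend("PUMP-SEAL-0042"))
```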
Then demonstrate it. Show the agent querying governed data through the integration layer. Show it reasoning against structured business events. Show it producing a correct procurement recommendation that a human operator would agree with.
A working PoC changes the conversation. The CIO goes to the board not with a roadmap and a maturity matrix, but with a live demonstration of an agent reasoning correctly against their own data. That’s the difference between “we’re planning for AI” and “we’re building it.”
The operating model behind the sprint
The sprint works because of three structural choices that differ from the SI model.
Small team, deep domain. A sprint team is 3-5 people, not 15-20, but each person brings specific depth: someone who’s spent years inside ERP data environments and knows what manufacturing master data actually looks like in production; someone who’s built event architectures and understands the operational/analytical tier split; someone who can wire up an integration layer across SAP OData, Maximo REST, and whatever else the client runs. You don’t need a programme manager, a change management lead, and a steering committee for a 12-week bounded engagement. You need engineers who know the domain.
Engineering-led, not process-led. The diagnostic is code-driven: profiling scripts against real data, API probes against production endpoints, event architecture assessment against actual log stores. The architecture is built to be built from — not presented and shelved. The PoC is running code, not a simulation. Every week produces artifacts that advance the technical answer, not artifacts that advance the engagement governance.
The client’s data, from day one. No synthetic datasets. No “representative samples.” The sprint works against the client’s actual production data — anonymised where necessary, but real in structure, scale, and messiness. This is what reveals the true state of readiness, and it’s what makes the PoC credible. An agent that works against sanitised test data proves nothing. An agent that works against production data with 34% material group duplication proves the architecture can handle reality.
What this means for the enterprise
The agentic AI moment is real, but the window for thoughtful preparation is narrower than most enterprises realise. The organisations that get the data infrastructure right in 2026 will be the ones deploying agents at scale in 2027. The ones that spend 2026 in assessment cycles and steering committees will still be producing roadmaps while their competitors are producing results.
The $2.52 trillion in global AI spending that Gartner projects for 2026 has to go somewhere. Right now, most of it is going into models, platforms, and compute. The enterprises that redirect a meaningful fraction toward data infrastructure — governed master data, structured event streams, modernised integration surfaces — will be the ones that convert that spending into operational advantage.
The sprint model isn’t the only way to get there. But the principles behind it apply regardless of who does the work: start with the data, not the model. Diagnose with code, not workshops. Prove with working systems, not slide decks. And move at the speed the moment demands.
The plumbing isn’t glamorous. It never is. But it’s the difference between an agent that reasons and an agent that hallucinates against your own data.
Fix the plumbing. Deploy the agent. See what happens.

