Why We Spent 20 Years Protecting Databases from Analytics (and Why AI Just Broke That Truce)

8:53am on a Monday

I want to take you back roughly twenty years.

It is 8:53am on a Monday morning in central London and I am sitting at my desk staring at the screen. My coffee is untouched. My palms are sweating.

The ETL job is still running.

If that phrase means nothing to you, count yourself lucky: in the mid-2000s it was enough to strike fear into the heart of any production DBA. At the time I was the database lead for a SaaS platform serving customers across the UK, most of whom would start logging in from 9am.

The weekend ETL process (Extract, Transform, Load) pulled data out of the operational database, reshaped it and pushed it into the enterprise data warehouse. It ran every weekend and it was supposed to finish long before Monday morning.
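The shape of that weekend job can be sketched in a few lines. This is a toy illustration of the extract/transform/load pattern, not the actual system: the table names and values are invented, and real pipelines of that era used vendor ETL tools and ran for hours.

```python
import sqlite3

oltp = sqlite3.connect(":memory:")   # stand-in for the operational database
dwh = sqlite3.connect(":memory:")    # stand-in for the data warehouse

oltp.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
oltp.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(1, "UK", 100.0), (2, "UK", 50.0), (3, "FR", 75.0)])

# Extract: pull rows out of the system of record.
rows = oltp.execute("SELECT region, amount FROM orders").fetchall()

# Transform: reshape transactional rows into the warehouse's aggregate form.
totals = {}
for region, amount in rows:
    totals[region] = totals.get(region, 0.0) + amount

# Load: push the reshaped data into the warehouse.
dwh.execute("CREATE TABLE daily_sales (region TEXT, total REAL)")
dwh.executemany("INSERT INTO daily_sales VALUES (?, ?)", totals.items())
```

The extract step is exactly why these jobs hurt: it scans the operational tables while they are also serving live traffic.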

It had not.

CPU pinned. Redo logs churning. User sessions beginning to queue.

In a few minutes customers would start calling support. Support would start calling me. And somewhere upstairs, the CEO would notice that his dashboard was still showing last week’s numbers.

I had two options. Let it run and hope it finished before the system buckled under real user traffic. Or kill it and spend the next hour watching rollback potentially take even longer, while guaranteeing that the warehouse would be stale until the following weekend.

How did we end up building systems where that was a normal Monday morning dilemma?


Why We Built the Wall

For the better part of two decades, the industry answered that question in one consistent way: keep analytics away from operational systems.

Transactional databases were designed to process orders, update accounts and record events predictably. Analytical workloads were different. They were heavy, exploratory and often poorly constrained. They scanned large portions of data, built aggregates, joined everything to everything and consumed CPU and I/O in bursts that were difficult to forecast.

Putting the two together in the same system was a recipe for contention.

So we separated them.

We built ETL pipelines. We built data warehouses. Later, we built data lakes and lakehouses. We introduced replication, change data capture and streaming. Each innovation was, in its own way, an attempt to preserve the integrity of the system of record while still making data available for analysis.

This was not fashion. It was defensive architecture.

The separation protected revenue-generating systems from analytical curiosity. It provided workload isolation. It gave operations teams a fighting chance of keeping Monday morning uneventful.


It Was Never Just About Performance

Enterprise environments rarely have a single system of record. A CRM system holds customer interactions. An ordering platform tracks transactions. Billing lives somewhere else. Supply chain somewhere else again.

The warehouse became not just a safety valve for analytics, but a unifying layer. It was the place where disparate operational systems could be reconciled into something coherent.

For years, this model worked.

Dashboards were allowed to be slightly stale. Reports could reflect yesterday’s state. Humans tolerate delay. In fact, they often prefer it. Analysis takes time, and decisions are rarely made in milliseconds.

The truce held because the consumer was human.


AI Changes the Consumer

AI agents change that.

An AI agent does not log in at 9am. It does not wait for a dashboard refresh. It does not tolerate yesterday’s numbers if it is expected to act on what is true right now.

Inference is not reporting. It is decision execution at machine speed.

If an agent is recommending a next action, approving a transaction, adjusting a price or triggering a workflow, the freshness of the underlying data becomes materially important. Close enough is no longer good enough. Staleness is no longer cosmetic. It alters outcomes.

The architectural assumption that analytics can safely run on a delayed copy of operational data begins to fracture.

This does not mean warehouses were a mistake. It does not mean lakehouses are obsolete. It does not mean streaming pipelines were misguided.

It means they were optimised for a different consumer.

For years, we optimised for human analysis. Now we are increasingly optimising for machine-driven action.

That is a different problem.


The Balance of Trade-Offs

For two decades, the answer was clear: keep them apart.

Protect the system of record. Move the data. Analyse it somewhere else. Accept a little delay in exchange for stability and control.

That architecture was forged in moments exactly like that Monday morning at 8:53am: CPU pinned, redo logs churning, business users about to log in.

AI does not invalidate that history. It simply changes the balance of trade-offs.

The truce between operational databases and analytics was built for a world where humans consumed insight.

We are now entering a world where machines consume state.

And that changes the conversation.

AI Doesn’t Read Dashboards… and That Changes Everything for Databases

A bank executive opens a fraud dashboard in Microsoft Power BI.

Losses by region, chargeback ratios, transaction velocity trends and a heatmap of anomalous activity. The numbers refresh within minutes. Data flows out of the system of record, is reshaped and aggregated, then presented for interpretation.

This is contemporary analytics: fast and operationally impressive. But it remains interpretive. It explains what is happening, while intervention occurs elsewhere – inside a fraud model embedded in the execution path, deciding in milliseconds whether money moves or an account is frozen.

Reporting systems describe what has already occurred. Even when refreshed every few minutes, they are retrospective. Inference systems are anticipatory. They evaluate the present in order to shape what happens next.

For two decades, enterprise data platforms were built around a deliberate separation between systems of record and analytical platforms. The system of record handled revenue-generating transactions; analytics operated on copies refreshed hourly or even every few minutes. Latency narrowed, but the boundary remained.

AI systems do not consume summaries, however fresh. They make decisions inside the transaction itself. A real-time fraud model does not want a recently refreshed extract; it requires the authoritative state of the business at the moment of decision. When automation replaces interpretation, data freshness becomes a decision integrity requirement. That shift changes the role of the database entirely.


Snapshots ≠ State

The difference is not batch versus real time. It is snapshot versus canonical state.

A snapshot is a materialised representation of state at a prior point in time – even if that point is only moments earlier. It may be refreshed frequently or continuously streamed, but it remains a copy. The system of record contains the canonical state of the enterprise – balances, limits, flags and relationships – reflecting the legal and financial truth when a transaction commits.

In fraud detection, that distinction is decisive. A dashboard can tolerate slight delay because its purpose is explanation. A model embedded in the execution path cannot. It must evaluate current balance, velocity and account status, not a recently materialised representation.
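To make the distinction concrete, here is a deliberately simplified sketch in which the same fraud rule gives different answers depending on whether it reads a snapshot or the canonical table. The table names, rule and values are invented for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance REAL, frozen INTEGER)")
db.execute("CREATE TABLE accounts_snapshot (id INTEGER PRIMARY KEY, balance REAL, frozen INTEGER)")
db.execute("INSERT INTO accounts VALUES (1, 500.0, 0)")

# Materialise the snapshot, then the canonical state moves on:
# a fraud alert freezes the account *after* the snapshot was taken.
db.execute("INSERT INTO accounts_snapshot SELECT * FROM accounts")
db.execute("UPDATE accounts SET frozen = 1 WHERE id = 1")
db.commit()

def approve(table, account_id, amount):
    # The same simple rule either way: approve if the account is not
    # frozen and the balance covers the payment.
    balance, frozen = db.execute(
        f"SELECT balance, frozen FROM {table} WHERE id = ?", (account_id,)
    ).fetchone()
    return frozen == 0 and balance >= amount

stale_decision = approve("accounts_snapshot", 1, 100.0)  # approves: snapshot predates the freeze
live_decision = approve("accounts", 1, 100.0)            # declines: canonical state says frozen
```

Both reads are fast and both are "recent"; only one reflects the legal and financial truth at the moment of decision.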

For years, we increased the distance between analytics and the system of record to protect transactional stability. That separation reduced risk in a world where insight followed action.

Automation reverses that order. Insight now precedes action. Once decisions are automated, the gap between a copy of the data and the authoritative source becomes consequential.


When Almost Right Is Still Wrong

If a fraud dashboard is slightly stale, an analyst may adjust a threshold next week. When a fraud model evaluates incomplete or delayed state, the error is executed immediately and repeated at scale.

False declines can lock customers out within minutes. False approvals can leak substantial losses before discrepancies surface in reporting. Automation compresses time and amplifies mistakes because there is no interpretive buffer.

Real-time intervention is inevitable. Competitive pressure and regulatory scrutiny demand it. But once decisions are automated, tolerance for architectural distance shrinks. A delay harmless in reporting can be material in a decision stream. A dataset “close enough” for analytics may be insufficient for automated intervention.

The risk is not that dashboards are wrong; it is that forward-looking systems may act on something almost right.


Databases in the Age of Intervention

When fraud detection becomes automated intervention rather than retrospective analysis, the requirements on the data platform change. Freshness is defined at the moment of decision, not by refresh intervals.

Replication patterns take on new significance. Asynchronous copies and downstream materialisations were designed to protect the system of record. They optimise scale and isolation, but every layer introduces potential lag or divergence. For reporting, that trade-off is acceptable. For automated decisions in revenue-generating workflows, it becomes risk.
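One way teams reason about this trade-off is to make the staleness budget explicit in the decision path, rather than inheriting whatever lag the replica happens to have. The sketch below is a hypothetical routing pattern; the names and threshold are invented, and real systems would derive lag from replication metadata.

```python
from datetime import datetime, timedelta, timezone

LAG_BUDGET = timedelta(seconds=2)  # assumed tolerance for this decision stream

def choose_source(replica_last_applied, now, budget=LAG_BUDGET):
    """Route a decision-time read to the replica only if its last applied
    change is within the staleness budget; otherwise fall back to the
    system of record."""
    lag = now - replica_last_applied
    return "replica" if lag <= budget else "primary"

now = datetime(2025, 1, 1, 9, 0, 0, tzinfo=timezone.utc)

# A replica one second behind is fine for this decision...
assert choose_source(now - timedelta(seconds=1), now) == "replica"
# ...but thirty seconds of divergence forces the read back to the primary.
assert choose_source(now - timedelta(seconds=30), now) == "primary"
```

The point is not the mechanism but the ownership: once decisions are automated, "how stale is acceptable" becomes a stated parameter of the workload, not an accident of the pipeline.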

Workload separation also looks different. When analytics is retrospective, distance protects performance. When inference is embedded in operational workflows, proximity to live transactional state matters. The challenge is enabling safe, predictable access without compromising correctness.

Fraud detection is simply the clearest example. Dynamic pricing, credit approvals, supply chain routing and clinical triage all follow the same pattern. The model is not generating a report about what happened; it is evaluating the present to influence what happens next.

For decades, enterprise architecture assumed intelligence followed events. As AI systems become anticipatory and automated, intelligence precedes action. The database is no longer simply the foundation of record-keeping.

It becomes part of how the future is decided – whether we are comfortable with that or not.

Inferencing Is a Database Problem Disguised as an AI Problem

I have a habit of becoming interested in technology trends only once they collide with reality. Flash memory wasn’t interesting to me because it was new – it was interesting because it broke long-held assumptions about how databases behaved under load.

Cloud computing wasn’t interesting to me because infrastructure became someone else’s problem. It became interesting when database owners started making uncomfortable compromises just to get revenue-affecting systems to run acceptably in the cloud. Compute was routinely overprovisioned to compensate for storage performance, leading to large bills for resources that were mostly idle. At the same time, “modernisation” began to feel less like an architectural necessity and more like a convenient justification for expensive consultancy services.

And now, just when I thought flashdba had nothing left to say, AI is following the same path.


We’ve Seen This Movie Before

For the last couple of years, most of the attention has been on training. Bigger models, more parameters, more GPUs, massive share prices. That focus made sense because training is visible, centralised and easy to reason about in isolation. But as inferencing starts to move up into the enterprise, something changes.

In the enterprise, inferencing stops being an interesting AI capability and starts becoming part of real business workflows. It gets embedded into customer interactions, operational decisions and automated processes that run continuously, not just when someone pastes a prompt into a chat window. At that point, the constraints change dramatically.

Enterprise inferencing is no longer about what a model knows. It is about what the business knows right now. And that is where things begin to feel very familiar to anyone responsible for systems of record.

Because once inferencing depends on real-time access to authoritative operational data, the centre of gravity shifts away from models and back towards databases. Latency matters. Consistency matters. Concurrency matters. Security boundaries matter. Above all, correctness matters.

This is the point at which inferencing stops looking like an AI problem and starts looking like what it actually is: a database problem, wearing an AI costume.


Inferencing Changes Once It Becomes Operational

While inferencing remains something that sits at the edge of the enterprise, its demands are relatively modest: a delayed response is tolerable… slightly stale data is acceptable. If an answer is occasionally wrong, the consequences are usually limited to a poor user experience rather than a failed business process.

That changes quickly once inferencing becomes operational. When it is embedded directly into business workflows, inferencing is no longer advisory… it becomes participatory. It influences decisions, triggers actions and – increasingly – operates in the same execution path as the systems of record themselves. At that point, inferencing stops consuming convenient snapshots of data and starts demanding access to live context data.

What is Live Context?

By live context, I don’t mean training data, feature stores or yesterday’s replica. I mean current, authoritative operational data, accessed at the point a decision is being made. Data that reflects what is happening in the business right now, not what was true at some earlier point in time. This context is usually scoped to a specific customer, transaction or event and must be retrieved under the same consistency, security and governance constraints as the underlying system of record. In other words, a relational database. Your relational database.

Live context gravitates towards RDBMS systems of record. It does not appear spontaneously – it is created at the moment a business state changes. When an order is placed, a payment is authorised, an entitlement is updated or a limit is breached, that change becomes real only when the transaction is committed to the RDBMS. Until then, it is provisional.

Analytical platforms can consume that state later, but they do not create it. Feature stores, caches and replicas can approximate it, but they do so after the fact. The only place where the current state of the business definitively exists is inside the operational production databases that process and commit transactions.

As inferencing becomes dependent on live context, it is therefore pulled towards those databases. Not because they are designed for AI workloads, and certainly not because this is desirable, but because they are the source of truth. If an inference is expected to reflect what is true right now, it must, in some form, depend on the same data paths that make the business run.

This is where the tension becomes unavoidable.


Inferencing Is Now A Database Problem

Once inferencing becomes dependent on live context, it inherits the constraints of the systems that provide that context. Performance, concurrency, availability, security and correctness are no longer secondary considerations. They become defining characteristics of whether inferencing can be trusted to operate inside business-critical workflows at all.

This is why enterprise AI initiatives are unlikely to succeed or fail based on model accuracy alone. They will succeed or fail based on how well inferencing workloads coexist with production databases that were never designed, built or costed with AI in mind. At that point, inferencing stops being an AI problem to be delegated elsewhere and becomes a database concern that must be understood, designed for and owned accordingly.