Every capacity model ever built for an enterprise database rests on a quiet assumption.
It was never written down. It was never debated in architecture reviews. Nobody decided to rely on it. It was simply always true.
Users are human. And humans are slow.
Not slow in a pejorative sense. Slow in the precise, measurable sense that matters to systems design. A human reads a screen. A human considers a result. A human decides what to ask next. Between every request and the next one, there is a gap – sometimes seconds, sometimes minutes – in which the database does nothing on that user’s behalf.
That gap has a name in queueing theory: think time. And for the entire history of enterprise computing, think time was not a performance metric anyone optimised. It was simply part of the background.
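In closed queueing models, think time appears directly in the interactive response-time law: a population of N users, each alternating between thinking for Z seconds and waiting R seconds for a response, can offer the system at most N / (Z + R) requests per second. A minimal sketch, with illustrative numbers:

```python
def offered_throughput(n_users: int, think_time_s: float, response_time_s: float) -> float:
    """Interactive response-time law for a closed system: each user
    cycles through Z seconds of thinking plus R seconds waiting, so
    N users can offer at most N / (Z + R) requests per second."""
    return n_users / (think_time_s + response_time_s)

# 500 human users pausing ~30 s between requests offer only ~16.6 req/s.
# Think time, not the database, is what caps the arrival rate.
human_rate = offered_throughput(500, think_time_s=30.0, response_time_s=0.2)
```

The point of the law is that as long as Z dominates R, the arrival rate is set by the humans, not by the database – which is exactly the regulation the rest of this article is about losing.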
It turns out that assumption was doing an enormous amount of work.
What Think Time Was Actually Doing
Think time was not merely latency between requests.
It was implicit backpressure. It was distributed flow control. It was, in effect, the admission control mechanism that enterprise databases have always depended on – the only one that required no configuration, no tuning and no operational overhead, because it was provided automatically by the humans using the system.
Those gaps are why connection pools drain and refill rather than exhaust continuously. They are why buffer caches hold useful state between requests. They are why redo logs catch up. They are, in short, why enterprise databases behave predictably under load.
Remove think time, and you do not get the same system running faster. You get a system whose stability model no longer describes the workload it is receiving.
AI agents do not have think time. And this is the foundational problem – not a configuration issue, not a capacity shortfall, but a structural mismatch between the assumptions encoded in every database ever built and the behaviour of the workloads now arriving.
Three Ways Agents Violate the Assumptions
The shift is not simply a matter of agents running faster than humans. It is a change in the shape of the workload that violates the design assumptions of enterprise databases in three distinct ways. The first is already familiar to anyone who has read earlier articles in this series. The second and third are where the genuinely novel risk resides.
Compression
When an agent handles a task, think time collapses toward zero. The arrival rate assumptions baked into every capacity model – the peaks, the breathing room, the quiet periods that allow connection pools to recover and buffer caches to stabilise – are calibrated for human behaviour. They are wrong for agents.
This is the easiest of the three to see, and it is not new ground: earlier articles in this series have covered the amplification effects that follow when agents remove the natural pacing humans impose on production systems. The point here is simply that compression is the first and most visible way agents change load shape – and the least of the three problems.
Enterprise databases were designed for demand with gaps. Agents remove the gaps.
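The scale of the change falls out of simple closed-loop arithmetic: each client completes one request every (gap + response time) seconds, so N clients offer N / (gap + response) requests per second. The numbers below are illustrative, not measurements:

```python
# Closed-loop arrival rate from a fixed population of N clients,
# each cycling through a gap between requests plus the response time.
n_clients, response_s = 200, 0.2

human_gap_s = 30.0   # a user reads the screen before asking again
agent_gap_s = 0.0    # an agent issues the next request immediately

human_rate = n_clients / (human_gap_s + response_s)   # ~6.6 req/s
agent_rate = n_clients / (agent_gap_s + response_s)   # 1000 req/s
amplification = agent_rate / human_rate               # ~150x from the same population
```

Nothing about the population changed. Removing the gaps alone multiplies the offered load by two orders of magnitude.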
Expansion
A human performing a business task issues a bounded number of database requests. The queries are written by a developer, tested, tuned and deployed. One intent produces a known, finite set of database operations.
An AI agent performing the same task does not execute a predefined sequence. It reasons. The database is not answering a question – it is a node in a search tree the agent is traversing. The agent retrieves data, evaluates what it found, decides what additional context it needs and queries again. Each result redefines the next query. The scope of the interaction is determined dynamically, by the reasoning process itself, with no upper bound established in advance.
The result is fan-out: a single high-level task producing a cascade of database interactions that no query governor was configured for and no capacity model includes.
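The shape of that fan-out can be sketched in a few lines. This is a hypothetical skeleton, not a real framework's API – `llm_decide_next` and `run_query` are stand-ins – but it captures the structural point: the number of queries is decided by the reasoning loop at run time, not by a query plan fixed at deployment.

```python
def handle_task(task: str, run_query, llm_decide_next, max_steps: int = 50):
    """One high-level task; the number of database round trips is
    unknown in advance, because each result reshapes the next query."""
    context, queries_issued = [], 0
    next_query = llm_decide_next(task, context)
    while next_query is not None and queries_issued < max_steps:
        result = run_query(next_query)    # one more database round trip
        context.append(result)            # the result feeds the next decision
        queries_issued += 1
        next_query = llm_decide_next(task, context)
    return context, queries_issued
```

Note that `max_steps` is the only bound in the sketch, and it is imposed by the caller; remove it and the loop has no upper bound at all – precisely the property no query governor was configured for.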
Recursion and Amplification
The third violation is where the consequences are most severe, and where the existing body of database operational knowledge offers the least guidance.
Enterprise databases are built around an assumption that transactions are independent. A transaction arrives, executes, commits or rolls back and leaves. This is not a minor implementation detail. Transaction independence is the foundational assumption on which locking strategies, concurrency controls, resource scheduling and every capacity model ever written for these systems depend.
AI agents with feedback loops break this assumption.
Anyone who has watched a system spiral under retry pressure will recognise the shape of what follows.
An agent that queries a result and then uses that result to determine its next query creates a dependency chain between transactions that the database has no way to model. The database is no longer serving requests from an external actor. It is participating in a feedback loop it does not control, cannot observe and was never designed to influence.
An orchestration framework can spawn sub-agents in response to intermediate results. The resulting transaction volume bears no relationship to the original request.
An agent that retries a failed step, reformulates the question and tries again does not produce the kind of error pattern that timeouts and governors were built to handle. It produces a closed-loop system whose behaviour under stress is the opposite of what open-loop capacity models predict.
Under human load, adding capacity stabilises a system. Under agent-generated feedback loops, adding capacity can accelerate the instability – more concurrency means more simultaneous feedback cycles, which means harder spikes when those loops interact.
The conventional response to a system under stress makes the problem worse, not better. In the limit, it does not stabilise the system. It accelerates its failure.
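A toy fixed-point calculation makes the closed-loop failure mode concrete. The failure curve below is assumed for illustration, not measured from any real system: retried failures re-enter the queue, so the load the database actually sees is the base rate inflated by the retry fraction – and the failure probability itself rises with that load.

```python
def effective_load(base_rate: float, fail_prob_at, iters: int = 100) -> float:
    """Fixed-point sketch of retry amplification: the database sees
    base_rate / (1 - p), where the failure probability p is itself
    a function of the load the database is seeing."""
    load = base_rate
    for _ in range(iters):
        p = min(fail_prob_at(load), 0.99)   # cap to keep the sketch finite
        load = base_rate / (1.0 - p)
    return load

# Assumed failure curve: negligible below capacity, climbing past it.
def curve(load: float) -> float:
    return max(0.0, min(0.9, (load - 80.0) / 100.0))

stable  = effective_load(50.0, curve)   # settles at the base rate
runaway = effective_load(95.0, curve)   # retries feed the overload they cause
```

Below capacity the fixed point sits at the base rate; slightly above it, the loop converges to a load many times larger than anything the original requests justified. An open-loop model, which assumes arrivals are independent of service outcomes, cannot predict the second regime at all.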
When Capacity Planning Stops Working
This is the point where capacity planning stops working as a discipline – and, if you are responsible for keeping these systems stable, the point where the usual playbook stops helping.
The problem is not that the standard responses – add capacity, tune queries, raise connection limits – are wrong in isolation. It is that they were designed for a different kind of system. They treat load as a volume problem. The pressures described here are shape problems.
Connection pool exhaustion under agentic load does not reflect a sizing error. It reflects the collapse of the arrival rate assumptions the pool was built on. Query cost explosions do not reflect insufficient compute. They reflect fan-out patterns that no query governor was asked to anticipate. Contention under agent feedback loops does not reflect a locking configuration problem. It reflects the violation of transaction independence – the assumption every concurrency control in the system was written against.
These are not operational incidents. They are the system reporting that its model of its own workload is no longer valid – not merely inaccurate, but no longer corresponding to the workload it actually receives.
Adding capacity treats the symptom. It does not address the diagnosis.
Where This Goes Next
The pressures described here are already visible in organisations that have moved beyond early AI demonstrations into production deployments. They are not the future. They are arriving now.
They are arriving first at the SaaS and application layer – the middleware that sits between AI agents and the systems of record. That layer is absorbing some of the initial shock. It provides insulation: rate limiting, query mediation and caching. It imposes the pacing that agents themselves do not have.
But middleware does not eliminate these pressures. It defers them.
Agents are moving toward the systems of record directly – bypassing the dashboards, APIs and middleware layers that once imposed human pacing on systems built for human-paced demand. When the insulation goes, what remains is machine-paced reasoning operating against infrastructure whose stability model was written for a different kind of user.
When that insulation is gone, the assumptions examined in this article will no longer be theoretical. They will be production incidents.
This article is part of the Databases in the Age of AI series.