
Why Prompt Injection Is Not the Biggest Risk in Your AI Stack

Prompt Injection Is Well Understood

Prompt injection attacks involve malicious instructions embedded in inputs or retrieved content that cause an LLM to deviate from its intended behaviour. The attack is well documented, widely discussed, and the subject of extensive tooling investment. OWASP’s LLM Top 10 lists it first. Security vendors have built entire product lines around detecting and blocking it.

This attention is warranted. Prompt injection is a real risk. But the volume of attention it receives has created an asymmetry: enterprise security teams are investing heavily in the threat that is easiest to discuss while underinvesting in the threat that is hardest to see. Governing AI agents end to end is the subject of our CISO guide to enterprise AI agents.

The Harder Problem: What the Agent Can Reach

Consider an internal AI assistant deployed for a financial services firm. It has access to the firm’s document management system, CRM, email, and internal knowledge base. A user with standard access permissions asks a question about client onboarding procedures.

The LLM processes the query legitimately. No prompt injection occurs. The guardrails pass. The output scanning finds nothing concerning. And yet the response includes non-public information about a specific client relationship. The tool call that fetched the onboarding procedures was scoped too broadly, so it also pulled that record from the CRM.

No attack occurred. The model behaved exactly as designed. The exposure happened because the access boundary was drawn too wide, and nothing at the retrieval layer enforced a narrower one.

This is the category of risk that most enterprise LLM security stacks cannot see. It does not trigger injection detectors. It does not produce anomalous outputs that scanning tools would flag. It is not a failure of the model. It is a failure of access governance.

The Three Retrieval Risks That Get the Least Attention

Data over-retrieval is the most common form. An agent with broad tool access retrieves more information than the task requires, and that information enters the context window where it can influence subsequent responses or be inadvertently surfaced to the user.
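One way to limit over-retrieval is to filter tool results against a per-tool scope before anything reaches the context window. The sketch below is illustrative only: the tool name, the source labels, and the `TOOL_SCOPES` mapping are hypothetical and not taken from any particular framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    source: str  # e.g. "knowledge_base" or "crm" (hypothetical labels)
    text: str


# Hypothetical mapping from tool name to the data sources it may return.
TOOL_SCOPES = {
    "lookup_onboarding_procedures": {"knowledge_base"},
}


def scope_results(tool_name: str, results: list[Document]) -> tuple[list[Document], list[Document]]:
    """Split results into those inside the tool's scope and those outside it."""
    allowed_sources = TOOL_SCOPES.get(tool_name, set())  # unknown tools get nothing
    kept = [d for d in results if d.source in allowed_sources]
    dropped = [d for d in results if d.source not in allowed_sources]
    return kept, dropped


results = [
    Document("knowledge_base", "Onboarding checklist"),
    Document("crm", "Notes on a specific client relationship"),
]
kept, dropped = scope_results("lookup_onboarding_procedures", results)
# Only `kept` would be passed on to the model; `dropped` can be logged for review.
```

Defaulting unknown tools to an empty scope means a newly added tool returns nothing until someone decides what it should be able to reach.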

Context accumulation happens in long agentic sessions. An agent that makes dozens of tool calls across a session progressively builds up a picture of the organisation’s internal state that no single query would have produced. Individually, each retrieval looks legitimate. Cumulatively, the agent has effectively constructed a detailed internal operations manual from fragmented queries.
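Because each retrieval looks legitimate on its own, accumulation only becomes visible when access is measured per session. A minimal sketch, assuming a simple budget on distinct records per session; the class name, the threshold, and the record identifiers are hypothetical, and a real deployment would likely weight records by sensitivity rather than count them equally.

```python
from collections import defaultdict


class SessionAccessMonitor:
    """Tracks which distinct records each session has retrieved."""

    def __init__(self, max_distinct_records: int = 3):
        self.max_distinct_records = max_distinct_records
        self._seen = defaultdict(set)  # session_id -> set of record ids

    def record(self, session_id: str, record_id: str) -> bool:
        """Log one retrieval; return True if the session is now over budget."""
        self._seen[session_id].add(record_id)
        return len(self._seen[session_id]) > self.max_distinct_records


monitor = SessionAccessMonitor(max_distinct_records=3)
# Repeated access to record "a" does not count twice; the fourth distinct record trips the flag.
flags = [monitor.record("session-1", rid) for rid in ["a", "b", "c", "a", "d"]]
```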

Cross-client contamination occurs in multi-tenant deployments or in systems where an agent serves multiple users. An agent with access to client A’s data that subsequently serves client B can carry context across session boundaries if session isolation is not enforced at the retrieval layer.
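Session isolation at the retrieval layer can be sketched as two rules: every record carries a tenant label, and each session owns a context that only its own tenant's records can enter. The `Record` and `Session` types below are hypothetical and stand in for whatever the retrieval system actually stores.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Record:
    tenant_id: str
    text: str


@dataclass
class Session:
    tenant_id: str
    context: list[Record] = field(default_factory=list)  # a fresh list per session


def retrieve_for_session(session: Session, candidates: list[Record]) -> list[Record]:
    """Admit only the session tenant's records into that session's context."""
    admitted = [r for r in candidates if r.tenant_id == session.tenant_id]
    session.context.extend(admitted)
    return admitted


index = [Record("client-a", "A's account notes"), Record("client-b", "B's account notes")]
session_a = Session(tenant_id="client-a")
session_b = Session(tenant_id="client-b")
retrieve_for_session(session_a, index)
retrieve_for_session(session_b, index)
```

The filter runs before anything is appended to the context, so a record from client A never exists in client B's session, even transiently.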

What Effective LLM Security Actually Requires

Prompt injection defences and output scanning remain necessary components of an enterprise LLM security stack. But they are not sufficient for organisations running agentic AI deployments with access to sensitive data systems.

Effective security at the access layer requires identity assignment for AI agents, so that each agent has a defined role with associated access permissions rather than inheriting the permissions of the user or system that spawned it. It requires sensitivity classification for data sources, so that access policies can be enforced based on the nature of the data being requested rather than just the identity of the requester. And it requires runtime monitoring at the retrieval layer, so that anomalous access patterns are detected and risk events are raised before exposure becomes a breach.
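The three requirements can be combined into a single check at the retrieval layer. The following is a minimal sketch under stated assumptions: the role name, the sensitivity levels, the policy tables, and the in-memory `risk_events` list are all hypothetical placeholders for a real identity provider, data catalogue, and alerting pipeline.

```python
from dataclasses import dataclass
from enum import IntEnum


class Sensitivity(IntEnum):
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2


@dataclass(frozen=True)
class AgentIdentity:
    agent_id: str
    role: str  # the agent's own role, not the permissions of the user who invoked it


# Hypothetical policy: the highest sensitivity each agent role may read.
ROLE_CLEARANCE = {"onboarding_assistant": Sensitivity.INTERNAL}

# Hypothetical classification of data sources.
SOURCE_SENSITIVITY = {
    "knowledge_base": Sensitivity.INTERNAL,
    "crm": Sensitivity.CONFIDENTIAL,
}

risk_events: list[dict] = []  # stand-in for a real monitoring sink


def authorize(agent: AgentIdentity, source: str) -> bool:
    """Allow the retrieval only if the agent's clearance covers the source."""
    clearance = ROLE_CLEARANCE.get(agent.role, Sensitivity.PUBLIC)
    # Unclassified sources are treated as the most sensitive level.
    required = SOURCE_SENSITIVITY.get(source, Sensitivity.CONFIDENTIAL)
    allowed = clearance >= required
    if not allowed:
        risk_events.append(
            {"agent_id": agent.agent_id, "source": source, "required": required.name}
        )
    return allowed


agent = AgentIdentity(agent_id="agent-7", role="onboarding_assistant")
kb_allowed = authorize(agent, "knowledge_base")
crm_allowed = authorize(agent, "crm")
```

In this sketch the onboarding assistant can read the knowledge base, while its attempt to read the CRM is denied and recorded as a risk event before any data is returned.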

The organisations that will manage enterprise AI risk most effectively in 2026 are those that have moved their security thinking from the output layer to the access layer. The question is no longer only what the model says. It is what the model is allowed to see.

Frequently Asked Questions

Is prompt injection a real risk?

Yes. Prompt injection is a real and well-documented risk involving malicious instructions embedded in inputs or retrieved content that cause an LLM to deviate from its intended behaviour. OWASP’s LLM Top 10 lists it first.

Why is prompt injection not the biggest risk?

The volume of attention prompt injection receives has created an asymmetry: enterprise security teams invest heavily in the threat that is easiest to discuss while underinvesting in the harder-to-see threat of what the agent can reach at the retrieval layer.

What are the three retrieval risks that get the least attention?

Data over-retrieval, where an agent retrieves more information than the task requires; context accumulation, where a long agentic session progressively builds up a picture of internal state from individually legitimate retrievals; and cross-client contamination, where context carries across session boundaries if session isolation is not enforced.

What does effective access-layer security require?

It requires identity assignment for AI agents so each has a defined role rather than inheriting user or system permissions, sensitivity classification for data sources so policies are enforced on the nature of the data, and runtime monitoring at the retrieval layer so anomalous access patterns are detected before exposure becomes a breach.
