The agent–data security gap nobody is talking about

Most AI safety attention is on the model. The next class of incidents will be at the model–database boundary, and the industry is unprepared.

Dipankar Sarkar · Updated April 15, 2026 · security, industry, opinion

Most of the visible AI safety work is upstream of the model: prompt filtering, output classification, jailbreak detection. These are real categories. They miss the place where the next set of incidents will happen.

The next class of agent incidents will not be the model saying something embarrassing. It will be the model acting on something it shouldn’t, against your database. And the industry is structurally unprepared.

Why upstream tooling doesn’t help

A jailbreak detector watches for prompts that try to manipulate model behavior. An output classifier watches the model’s natural-language response for unsafe content. Both treat the model as a function from text to text.

A production agent is not a function from text to text. It’s an actor with side effects: tool calls that read and write your data. The tool call is where harm happens. The natural-language wrapper around it is downstream.

When an agent leaks another tenant’s invoices, the leak is in the tool result, not in the model’s prose. The classifier passes the prose. The user reads the prose. The leak is real.
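A minimal sketch of that gap (all names here are invented for illustration): the tool call crosses tenants, the model wraps the result in perfectly innocuous prose, and a content classifier over the prose passes it.

```python
# Illustrative only: the classifier sees the model's *language*, not its *data*.

def fetch_invoices(tenant_id: str, db: dict) -> list[dict]:
    # BUG (deliberate): no tenant filter -- returns every tenant's invoices.
    return [row for rows in db.values() for row in rows]

def classify_output(text: str) -> bool:
    # A stand-in output classifier: flags unsafe language, nothing else.
    return "ssn" not in text.lower()

db = {
    "tenant-a": [{"invoice": "A-1", "amount": 120}],
    "tenant-b": [{"invoice": "B-7", "amount": 980}],
}

rows = fetch_invoices("tenant-a", db)      # the tool call: the leak happens here
prose = f"You have {len(rows)} invoices."  # the model wraps it in harmless text
```

`classify_output(prose)` returns `True`: the prose is clean, and the cross-tenant rows are already in the response.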

Why “give the agent a Postgres role” doesn’t cover it

The standard infra answer to “constrain what the agent can do” is “give it a least-privilege database role.” This is necessary; it isn’t sufficient.

A read-only role can leak across tenants. A read-only role can scan an entire table. A read-only role can read PII. Database privileges are about which operations are allowed, not which rows, which columns, or which combinations are allowed for a given context. Row-level security gets closer, but RLS is hard to configure correctly across hundreds of tables and dozens of policies.

You need policy that lives at the application layer, where it can see request context (which tenant is acting, with what role, on whose behalf), apply field-level visibility, and enforce per-call budgets.
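A sketch of what an application-layer check can do that a database role cannot (every name here is hypothetical, not an OrmAI API): scope rows to the acting tenant, strip fields the caller's role may not see, and cap rows per call.

```python
from dataclasses import dataclass

@dataclass
class Context:
    tenant_id: str
    role: str  # e.g. "support" or "admin"

# Which fields each role may see (illustrative).
VISIBLE_FIELDS = {
    "support": {"invoice", "amount"},
    "admin": {"invoice", "amount", "email"},
}
MAX_ROWS_PER_CALL = 100  # per-call budget

def guarded_query(ctx: Context, table: list[dict]) -> list[dict]:
    rows = [r for r in table if r["tenant_id"] == ctx.tenant_id]  # tenant scoping
    rows = rows[:MAX_ROWS_PER_CALL]                               # per-call budget
    allowed = VISIBLE_FIELDS[ctx.role]                            # field visibility
    return [{k: v for k, v in r.items() if k in allowed} for r in rows]

table = [
    {"tenant_id": "tenant-a", "invoice": "A-1", "amount": 120, "email": "a@example.com"},
    {"tenant_id": "tenant-b", "invoice": "B-7", "amount": 980, "email": "b@example.com"},
]
```

With this in place, `guarded_query(Context("tenant-a", "support"), table)` returns only tenant-a's rows with the email field stripped; a Postgres GRANT cannot express any of the three checks.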

Why “it’s just function calling” doesn’t cover it

The other common response: “We use function calling, not raw SQL. The model only sees the tools we wrote.” Better. Still incomplete.

Hand-rolled function tools have all the policy gaps we describe in the comparison: tenant scoping is re-derived per tool, redaction drifts, audit logs diverge. The function-calling interface is a syntactic constraint, not a policy. The first time you have ten tools and want to add a rule that cuts across all of them (“never expose customer.email to agents below role X”), you’re touching ten files.

A policy layer is the abstraction that makes those rules declarative.

Why the existing security industry isn’t filling the gap

A few reasons.

1. The vocabulary is borrowed from the wrong field

Most “AI safety” vendors come from the content moderation world. Their tools speak in terms of “harmful content,” “PII detection,” “red-teaming.” These are the right tools for the right problems. The agent–data problem is structural authorization. Different field, different tools.

2. The incidents aren’t yet visible enough

When a chatbot gets jailbroken into saying something embarrassing, it ends up on Twitter. When a chatbot returns rows from another tenant, the user doesn’t know it happened. By the time the security team finds it (if they ever do), it’s three months later, in a SOC 2 audit. There’s no public corpus of “agent leaked data” incidents the way there is of “agent said something bad.”

This will change. Either through enforcement (regulators noticing), through publicized incidents (a chatbot leaks something a user does recognize), or through procurement filters (enterprises requiring evidence of agent–data controls).

3. The infrastructure layer hasn’t shipped yet

There are now mature primitives for prompt filtering, RAG, evals, observability. The primitives for agent authorization are less developed. OrmAI is one. There will be more. The lack of mature tooling is itself a sign that this is the next gap.

What enterprises will start asking for

Procurement at large enterprises is already asking the second-order questions. Patterns we’ve seen in the last six months:

  • “Can you produce, on request, a list of every database operation your AI features performed for our tenant in the last 90 days?”
  • “What controls prevent the AI from accessing fields outside the policy you’ve described?”
  • “If a researcher reports a prompt-injection attack, what is your detection-and-response timeline?”
  • “Provide an SBOM for your AI stack including the model, the agent framework, and the data-access layer.”

If you can’t answer these for your agent the way you can for your application code, you have a procurement-stage problem coming.
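One way to make the first question answerable, sketched here with an illustrative schema (not a prescribed format): log every tool-mediated database operation to an append-only table keyed by tenant, so the 90-day report is a single query rather than a script.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE agent_audit (
        ts        TEXT NOT NULL,   -- ISO-8601 timestamp
        tenant_id TEXT NOT NULL,
        actor     TEXT NOT NULL,   -- who the agent was acting for
        operation TEXT NOT NULL,   -- e.g. 'read', 'update'
        target    TEXT NOT NULL    -- table or tool name
    )
""")
conn.execute(
    "INSERT INTO agent_audit VALUES "
    "('2026-04-01T12:00:00Z', 'tenant-a', 'agent:billing', 'read', 'invoices')"
)

# "Every database operation for our tenant since <date>": one query, not a script.
rows = conn.execute(
    """
    SELECT ts, actor, operation, target
    FROM agent_audit
    WHERE tenant_id = ? AND ts >= ?
    ORDER BY ts
    """,
    ("tenant-a", "2026-01-15T00:00:00Z"),
).fetchall()
```

The table design matters less than the invariant: no tool call reaches the database without writing a row here first.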

What “good” looks like

A useful litmus test for any agent–data layer:

  • Can a non-engineer security reviewer understand the policy? If the answer is “you need to read the source,” the abstraction is wrong.
  • Is there a single artifact that says “this is what the agent can do”? Not a wiki page; an executable file.
  • Can the audit log answer “show me everything the agent did for user X” in one query? Not “we’d have to run a script.”
  • Can you change the rules in one place and have it propagate? Or do you have to touch every tool definition?

These are the same tests we apply to authorization in normal applications. They’ve been the standard for two decades. The agent layer should hold the same bar.

Where to start

For teams building their first agent that touches a real database:

  1. Start with the Why page to map the threat surface.
  2. Build a minimal policy file before you build the first tool. Enforce it from day one.
  3. Pick one of: OrmAI, a similar policy layer, or a deliberately built in-house equivalent. Don’t ship without one.
  4. Wire the audit log into your existing observability before launch. Not after.
  5. Run the Spider-style probe on your own surface. You’ll find leaks.

For teams already in production: the production checklist is the audit you should run on yourself this quarter.

The agent–data layer is going to be the next big audit category. The teams that get ahead of it will have a far easier sales conversation a year from now.


Found a typo or want to suggest a topic? Email [email protected].