Capability tokens for AI: a primer
An old idea from operating systems is becoming load-bearing for agent security. Here's what capability tokens are, and why your agent toolkit should think in them.
In the 1970s, a small group of operating-system researchers proposed a model of access control called capability-based security. It largely lost to access-control lists (ACLs) in mainstream OS design, but the idea kept popping up in distributed systems, language runtimes, and crypto. In 2026, it is becoming useful again — this time for AI agents.
This primer explains what capability tokens are, why they fit the agent problem, and what to look for in any agent toolkit that claims to use them (including OrmAI’s, which does).
What a capability token is
A capability is an unforgeable token that grants a specific authority to its holder. Holding the token is sufficient to do the action; not holding it makes the action impossible.
Compare to ACLs (the dominant model): in an ACL world, the system asks “is this principal on the list for this resource?” In a capability world, the system asks “did the caller present a token granting this action on this resource?”
The difference is small in code and large in design philosophy:
- ACLs are tied to identity. Authority lives in a list attached to the resource and is checked at the time of action.
- Capabilities are tied to possession. Authority is bundled with the token, which can be passed, attenuated, and revoked.
For AI agents, this matters because the agent’s “identity” is fuzzy. The agent acts on behalf of a user, in a session, in a context, with a specific goal. Trying to encode all of that into an ACL principal is awkward. Encoding it as a capability token the agent holds for the duration of the call is natural.
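The contrast can be made concrete with a toy sketch. All names here are illustrative, not any particular library's API:

```python
from dataclasses import dataclass

# ACL world: authority lives in a list attached to the resource,
# keyed by principal identity.
ACL = {"report.csv": {"alice", "bob"}}

def acl_allows(principal: str, resource: str) -> bool:
    return principal in ACL.get(resource, set())

# Capability world: authority lives in an unforgeable token the
# caller holds. The check inspects the token, not an identity list.
@dataclass(frozen=True)
class Capability:
    resource: str
    action: str

def cap_allows(token: Capability, resource: str, action: str) -> bool:
    return token.resource == resource and token.action == action

cap = Capability(resource="report.csv", action="read")
assert acl_allows("alice", "report.csv")          # alice is on the list
assert cap_allows(cap, "report.csv", "read")      # the token grants this
assert not cap_allows(cap, "report.csv", "write") # the token does not
```

The two checks look nearly identical in code; the design difference is where the authority lives and who can hand it to whom.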
Why this fits the agent problem
A typical agent tool call goes through these layers:
1. The user asks the LLM something.
2. The LLM decides to call a tool.
3. The application sends the tool call (with arguments) to a backend handler.
4. The handler authorizes, executes, and returns.
In a capability model, the token enters between steps 3 and 4. Before executing anything, the handler mints a token that says “this caller may, in this session, perform this operation, against this set of resources, with these constraints.” The handler then attempts the call, presenting the token to whatever resource it’s accessing.
Here’s the key property: the token is constructed outside the model’s view. The model can’t forge it, narrow it incorrectly, or extend it. It just causes calls to be made, and the application wraps each call in the right token before execution.
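A minimal sketch of that wrapping step, with hypothetical names (`Token`, `dispatch`) rather than any real library's API:

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass(frozen=True)
class Token:
    session_id: str
    operation: str
    tenant: int

def dispatch(tool: Callable[[dict, Token], Any], args: dict,
             session_id: str, tenant: int) -> Any:
    # The application, not the model, mints the token. The model only
    # chose the tool and its arguments; authority is attached here,
    # outside the model's view.
    token = Token(session_id=session_id, operation=tool.__name__, tenant=tenant)
    return tool(args, token)

def list_orders(args: dict, token: Token) -> str:
    # The tool sees only the authority the token carries.
    return f"orders for tenant {token.tenant}"

result = dispatch(list_orders, {}, session_id="s-1", tenant=42)
# result == "orders for tenant 42"
```

The point of the sketch: there is no code path by which the model's output reaches `Token(...)` directly, so a prompt-injected tool call still executes under the authority the application chose.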
OrmAI’s RunContext is, conceptually, a capability token. It carries:
- The principal (user ID, role).
- The tenant scope.
- The trace ID.
- A reference to the database session.
- A snapshot of the policy, with budgets attached.
When the agent calls a tool, the tool runs against the RunContext. Anything not in the token, the tool can’t do. The token can’t be widened from inside the call.
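To make the shape concrete, here is a minimal sketch of what such a token might look like. This is not OrmAI's actual class definition, just an illustration of the bullet list above, using a frozen dataclass so the token cannot be mutated from inside a tool:

```python
from dataclasses import dataclass
from typing import Any, Mapping

@dataclass(frozen=True)  # frozen: no widening from inside the call
class RunContextSketch:
    user_id: str               # the principal
    role: str
    tenant_id: int             # the tenant scope
    trace_id: str
    db_session: Any            # opaque reference to the database session
    policy: Mapping[str, Any]  # snapshot of policy and budgets at mint time

ctx = RunContextSketch(
    user_id="u-17", role="analyst", tenant_id=42,
    trace_id="tr-9f2", db_session=object(),
    policy={"max_rows": 1000, "read_only": True},
)
# ctx.tenant_id = 1  would raise FrozenInstanceError: the token is immutable
```

Immutability is what backs the claim that the token “can’t be widened from inside the call”: a tool can read the context, but any attempt to reassign a field fails at runtime.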
Attenuation
A defining property of capabilities is that they can be attenuated — narrowed at any point by anyone holding them.
Imagine your agent has a RunContext for tenant 42 with full read access. It needs to call a sub-tool that should only see one customer. You attenuate:
```python
narrow = ctx.narrow_to(customer_id=7)
sub_tool.execute(args, narrow)
```
narrow is a new token, derived from ctx, that grants strictly less. The sub-tool can do exactly what the parent could, restricted further to customer 7. If the sub-tool tries to widen back, it can’t — the token doesn’t carry that authority.
This is the right shape for agent sub-loops, multi-step workflows, and tool composition. Each step holds exactly the token it needs; mistakes in one step can’t compound into wider authority.
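One way to enforce the “strictly less” guarantee is to have narrowing derive a new immutable token and refuse anything that isn't a subset of the current scope. A minimal sketch, with illustrative names rather than OrmAI's implementation:

```python
from dataclasses import dataclass, replace
from typing import Optional

@dataclass(frozen=True)
class Ctx:
    tenant_id: int
    customer_id: Optional[int] = None  # None = all customers in the tenant

    def narrow_to(self, customer_id: int) -> "Ctx":
        # Derive a strictly narrower token. If this token is already
        # scoped to one customer, refuse to re-scope to a different one:
        # attenuation may only shrink authority, never move or widen it.
        if self.customer_id is not None and self.customer_id != customer_id:
            raise PermissionError("cannot widen or re-scope a narrowed token")
        return replace(self, customer_id=customer_id)

parent = Ctx(tenant_id=42)                   # full read access in tenant 42
narrow = parent.narrow_to(customer_id=7)     # customer 7 only
# narrow.narrow_to(customer_id=8) would raise PermissionError
```

Because each derived token keeps the parent's constraints and can only add more, a chain of sub-tool calls monotonically loses authority, which is exactly the property the article describes.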
Revocation
In a pure capability model, revocation is hard — once a token is out, you can’t recall it. In practical systems, you build revocation in: tokens have short lifetimes, holders re-validate periodically, or there’s a revocation list.
OrmAI handles revocation via the audit + budget store: every active session is tracked, and an admin tool can mark a session as revoked, after which any further tool calls with that RunContext are denied. It’s not pure-capability theology, but it works.
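The combination of short lifetimes and a revocation check can be sketched in a few lines. Here `revoked_sessions` stands in for whatever durable store tracks sessions (the article says OrmAI uses its audit + budget store); all names are illustrative:

```python
import time
from dataclasses import dataclass

revoked_sessions: set[str] = set()  # backed by a durable store in practice

@dataclass(frozen=True)
class Session:
    session_id: str
    expires_at: float  # Unix timestamp

def check(token: Session) -> None:
    # Re-validate on every tool call: the expiry bounds the blast
    # radius of a leaked token; the revocation set handles active recall.
    if time.time() > token.expires_at:
        raise PermissionError("token expired")
    if token.session_id in revoked_sessions:
        raise PermissionError("session revoked")

tok = Session(session_id="s-1", expires_at=time.time() + 300)
check(tok)                   # live and not revoked: passes
revoked_sessions.add("s-1")  # admin revokes the session
# check(tok) would now raise PermissionError("session revoked")
```

Note that this re-validation on every call is what the “pure” capability model lacks: it trades a little indirection (a lookup per call) for the ability to recall a token that is already in circulation.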
Why this is more than a vocabulary trick
You could implement everything described here with ACLs. The reason capabilities help is that they push the authorization decision to the token-construction site, which is closer to the request, the user, and the policy. ACLs push the decision to the resource, which is closer to the data and farther from intent.
For agents, the intent (which user, which session, which tool, with what constraints) is exactly what we need to record and enforce. Capability tokens are the data structure for that intent.
It’s also a vocabulary that scales nicely for thinking about composed agent systems:
- Tool composition. When tool A calls tool B, B receives an attenuated version of A’s capability.
- Cross-service. When your agent calls another team’s service, you pass a capability that grants exactly the right thing — narrower than your own.
- Time-bound. Tokens carry expiry. Long-lived sessions get refreshed; short calls don’t.
- Auditable. Every capability use is a log entry. Provenance is built in.
What to look for in an agent toolkit
If you’re evaluating OrmAI or any other agent–data layer through this lens, ask:
- Is there a single object that represents the authority of a call? Not five flags scattered across the call path.
- Can you narrow it without re-implementing policy? Should be one method.
- Is forgery structurally impossible? The agent should not be able to construct a token; the application should.
- Does every tool call require one? Or can someone slip a tool call through without a context?
- Does the audit log record the token used? So you can answer “what authority was held when X happened?”
OrmAI passes all five. Other libraries pass some. The pattern is becoming a basic engineering test for agent infrastructure.
Further reading
- Mark Miller’s thesis on capability theory is the classic.
- Norm Hardy’s KeyKOS papers are good background.
- For the modern web take, Macaroons (Birgisson et al.) introduce contextual caveats — basically attenuation for HTTP.