OrmAI

Guide

Audit logs you'll actually trust

What to log, where to put it, and how to query it when the security team asks 'what did the agent do?'

Dipankar Sarkar · Updated April 15, 2026 · audit · compliance · soc2 · observability

Most agent systems have logging. Few have audit logs that hold up under questioning. This guide is about the latter.

The questions an audit log must answer

Imagine your security team walks into your office on a Tuesday afternoon and asks:

  1. What did the agent do for tenant 42 between 09:00 and 12:00 today?
  2. Has the agent ever read field X on model Y?
  3. Did it write anything? With what justification?
  4. Was anything denied by policy? Why?
  5. Show me every operation associated with the user who reported the incident.

If you can’t answer these questions in under five minutes, your audit log is decorative. OrmAI generates audit logs designed to answer these questions in under five seconds.

What gets logged automatically

Every tool call OrmAI executes produces one audit row. The shape:

{
  "id": "01J9X4K7...",
  "ts": "2026-04-15T13:42:11.118Z",
  "principal": {"user_id": "agent-1", "role": "customer_chat", "agent_id": "asst_...", "session_id": "..."},
  "tenant_id": 42,
  "trace_id": "abc-123",
  "tool": "db.query",
  "model": "Order",
  "input_sanitized": {"where": {"status": "pending"}, "limit": 50},
  "policy_decision": {"allowed": true, "redacted_fields": ["customer.email"], "tenant_injected": true},
  "execution_ms": 12,
  "row_count": 3,
  "outcome": "success",
  "error": null
}

The same row shape works for reads, writes, denies, and timeouts; the outcome field distinguishes them.
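If downstream tooling consumes these rows, a typed view keeps field access honest. A minimal sketch, assuming Python 3.9+ (the field names mirror the JSON above; the class names are mine, not part of OrmAI):

```python
from typing import Any, Optional, TypedDict


class Principal(TypedDict):
    user_id: str
    role: str
    agent_id: str
    session_id: str


class AuditRow(TypedDict):
    id: str
    ts: str
    principal: Principal
    tenant_id: int
    trace_id: str
    tool: str
    model: str
    input_sanitized: dict[str, Any]
    policy_decision: dict[str, Any]
    execution_ms: int
    row_count: int
    outcome: str          # "success" | "denied" | "error" | "timeout"
    error: Optional[str]  # populated only when outcome is not "success"
```

A type checker will then flag typos like row["outcom"] in any consumer code.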

Where to store the log

Three good options. Pick the one that matches your operational maturity.

Option 1: A dedicated table in your application database

The simplest. OrmAI ships an SQLAlchemy model OrmaiAuditLog you can mount on your existing engine.

from ormai.audit import SqlAuditSink, OrmaiAuditLog

# Create the audit table alongside your existing schema (no-op if it exists)
OrmaiAuditLog.__table__.create(engine, checkfirst=True)

policy = (
    PolicyBuilder(DEFAULT_PROD)
    # ... your existing policy configuration ...
    .audit_sink(SqlAuditSink(engine))
    .build()
)

Pros: zero infrastructure, queryable with SQL, transactional with the data being audited.

Cons: shares performance budget with your operational DB. Consider a separate audit DB once volume grows past ~1k events/sec.

Option 2: Structured logs to your existing logging pipeline

from ormai.audit import JsonLogSink
import logging

policy = (
    PolicyBuilder(DEFAULT_PROD)
    # ...
    .audit_sink(JsonLogSink(logging.getLogger("ormai.audit")))
    .build()
)

The audit row goes through the same pipeline as your application logs (Datadog, Splunk, Loki, CloudWatch). Pros: zero new infrastructure. Cons: log-based queries are slower; retention is set by your log policy, which may not match audit retention requirements.

Option 3: Dedicated event stream + warehouse

For larger systems, ship audit events to Kafka / Kinesis / Pub/Sub, land them in your warehouse (Snowflake / BigQuery), and query from there.

from ormai.audit import KafkaSink
policy = (
    PolicyBuilder(DEFAULT_PROD)
    # ...
    .audit_sink(KafkaSink(brokers=..., topic="ormai-audit"))
    .build()
)

Pros: survives your application DB outage; warehouse query speed; long retention is cheap. Cons: more moving parts.

We recommend Option 1 for small/medium deployments and Option 3 for anything compliance-heavy.

What “sanitized inputs” means

The input_sanitized field is the input to the tool after the policy compiler has stripped or masked it. So a query that included tenant_id: 7 (which the policy ignored) shows up as the policy’s view, not the agent’s view. There’s a separate input_raw_hash field — a hash of the raw input — so you can detect “the agent tried to forge tenant_id” patterns without storing the forged value.

This matters because audit logs are themselves data and shouldn’t leak more than necessary.
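For the hash to be useful, the raw input has to be serialized canonically first; otherwise the same payload hashed at two different times may not match. A sketch of the idea (the function name is mine; OrmAI populates input_raw_hash itself):

```python
import hashlib
import json


def raw_input_hash(raw_input: dict) -> str:
    """SHA-256 over canonical JSON: sorted keys, no whitespace, stable
    stringification. Lets you ask "did the agent ever send exactly this
    input?" without ever storing the (possibly forged) values."""
    canonical = json.dumps(raw_input, sort_keys=True, separators=(",", ":"), default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

To check for a forged tenant_id, hash the suspected payload and compare it against stored input_raw_hash values; a match proves the attempt without exposing the value.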

Trace correlation

OrmAI accepts a trace_id from your request context and writes it to every audit row. Wire it from your existing tracing layer:

from opentelemetry import trace

@app.post("/agent/tool")
async def call_tool(body: dict, ...):
    span = trace.get_current_span()
    trace_id = format(span.get_span_context().trace_id, "032x")
    ctx = RunContext.create(tenant_id=..., trace_id=trace_id, db=session)
    return await toolset.execute(body["name"], body["arguments"], ctx)

Now an audit row links to a trace, which links to your front-end request, which links to the LLM completion that triggered the tool call. End-to-end: “user asked X → LLM thought Y → agent called tool Z → policy decision W → result R.”

Querying the log

For Option 1, the basic queries:

-- Q1: everything for tenant 42 in a window
SELECT ts, tool, model, outcome, error
FROM ormai_audit_log
WHERE tenant_id = 42 AND ts BETWEEN '2026-04-15 09:00' AND '2026-04-15 12:00'
ORDER BY ts;

-- Q2: has the agent ever read field X on model Y?
SELECT count(*) FROM ormai_audit_log
WHERE tool = 'db.query'
  AND model = 'Customer'
  AND policy_decision->'redacted_fields' @> '["customer.email"]';

-- Q3: every write with reason
SELECT ts, principal, model, input_sanitized->>'reason' AS reason, row_count
FROM ormai_audit_log
WHERE tool IN ('db.create', 'db.update', 'db.delete') AND outcome = 'success';

-- Q4: every denial and the reason
SELECT ts, tool, model, policy_decision->>'reason'
FROM ormai_audit_log WHERE outcome = 'denied';

These are the queries SOC 2 audit evidence is built from. Save them as views.

Tamper resistance

Audit logs are evidence. Make them hard to alter.

  • Grant the audit DB user INSERT only: no UPDATE, no DELETE (or DELETE only for the retention job).
  • For high-stakes systems, periodically hash-chain the rows: each row includes the hash of the previous row. If anyone tampers, the chain breaks. OrmAI ships a HashChainSink that wraps any other sink.
  • Ship audit rows to an append-only store (S3 with object lock, etc.) for long-term retention.
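The hash-chain idea is small enough to sketch outside any sink: each row's hash covers the previous row's hash, so editing, deleting, or reordering any row invalidates every hash after it. An illustrative sketch, not the HashChainSink implementation:

```python
import hashlib
import json

GENESIS = "0" * 64  # fixed starting hash for an empty chain


def chain_hash(prev_hash: str, row: dict) -> str:
    """Hash of this row, bound to the hash of the previous row."""
    payload = prev_hash + json.dumps(row, sort_keys=True, separators=(",", ":"), default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(rows: list) -> list:
    """Hash every row in order, threading each hash into the next."""
    hashes, prev = [], GENESIS
    for row in rows:
        prev = chain_hash(prev, row)
        hashes.append(prev)
    return hashes


def verify_chain(rows: list, hashes: list) -> bool:
    """Recompute from genesis; any tampering breaks the comparison."""
    return build_chain(rows) == hashes
```

Store the per-row hash in its own column; a periodic job that re-verifies the chain turns silent tampering into a loud alert.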

Read the full article on tamper-resistant audit trails →

Retention

Set retention deliberately. We recommend:

  • Hot: 90 days online, indexed, sub-second queryable.
  • Warm: 1 year in cheaper storage, queryable in minutes.
  • Cold: 7 years (or whatever your industry requires) in object storage.

OrmAI 0.2 added a retention_policy(hot_days=90, warm_days=365) knob that the bundled retention worker honors. Configure once.
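The tier boundaries above reduce to a simple age check. A sketch of the classification a retention worker might apply (illustrative only, not the bundled worker):

```python
from datetime import datetime, timedelta, timezone


def retention_tier(ts: datetime, now: datetime,
                   hot_days: int = 90, warm_days: int = 365) -> str:
    """Classify an audit row by age: hot (online, indexed), warm (cheaper
    storage), cold (object-storage archive until retention expires)."""
    age = now - ts
    if age <= timedelta(days=hot_days):
        return "hot"
    if age <= timedelta(days=warm_days):
        return "warm"
    return "cold"
```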

What not to log

Counter-intuitively: not the model output. The tool call result is the data; the agent’s natural-language output back to the user belongs in a different log (your LLM-completion log, with whatever scrubbing you apply there). Mixing them creates a path for PII the agent has masked from itself to end up logged in cleartext via the response.

OrmAI logs the structured tool result with redaction applied. It does not log the surrounding LLM exchange. Your LLM-orchestration framework logs that, separately, with its own privacy controls.

Common mistakes

  • Logging only failures. Successful operations are 99% of the audit value.
  • Logging the raw tool input. That’s where forged tenant IDs and prompt-injected weirdness show up. Log the policy-sanitized input and a hash of the raw, separately.
  • Storing audit in the same table as your main data. Hot path contention will bite. Separate table or separate DB.
  • Forgetting to give your security team a view. They will love you for a saved query.

Found a typo or want to suggest a topic? Email [email protected].