Production checklist for agent + database systems

The 51 things to verify before letting your agent talk to a real database. Compiled from incidents, audits, and three years of shipping.

Dipankar Sarkar · Updated April 15, 2026 · Tags: production, checklist, operations, soc2

If you’re about to ship an agent that touches your production database, run through this list. Each item came from a real incident or a real audit finding, in our work or a customer’s.

Identity & isolation

  • Tenant ID comes from the authenticated session, not the agent’s input. Verified by code review.
  • tenant_scope() is set in the policy. Verified by inspecting the policy file.
  • Cross-tenant test exists in CI. It asserts that an agent acting as tenant A cannot see tenant B’s data, even when it forges tenant_id in the where clause.
  • Multi-tenant join targets are denormalized with their own tenant_id, or you have an explicit, written-down reason why they don’t need one.
  • Database row-level security is enabled on the highest-stakes tables (compliance-heavy data).
  • Admin / cross-tenant operations live behind a separate policy and a separate audit channel.
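
A minimal sketch of the cross-tenant CI test described above. It uses an in-memory SQLite table and a hypothetical list_orders tool rather than OrmAI's actual API; the orders table, its columns, and every function name here are assumptions for illustration.

```python
import sqlite3


def list_orders(conn, session_tenant_id, agent_args):
    """Hypothetical tool: the tenant scope comes from the session only."""
    # Any tenant_id supplied by the agent is dropped before the query is built.
    filters = {k: v for k, v in agent_args.items() if k != "tenant_id"}
    sql = "SELECT id, tenant_id, total FROM orders WHERE tenant_id = ?"
    params = [session_tenant_id]
    if "min_total" in filters:
        sql += " AND total >= ?"
        params.append(filters["min_total"])
    return conn.execute(sql, params).fetchall()


def make_db():
    """Build a throwaway database holding rows for tenants A and B."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, tenant_id TEXT, total REAL)"
    )
    conn.executemany(
        "INSERT INTO orders (tenant_id, total) VALUES (?, ?)",
        [("A", 10.0), ("A", 25.0), ("B", 99.0)],
    )
    return conn


def test_forged_tenant_id_is_ignored():
    conn = make_db()
    # The agent acts as tenant A but forges tenant B in its arguments.
    rows = list_orders(conn, session_tenant_id="A", agent_args={"tenant_id": "B"})
    assert rows, "tenant A should still see its own rows"
    assert all(row[1] == "A" for row in rows)
```

The same shape works under pytest: one such test per tool, with the forged value placed wherever that tool accepts filters.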

Reads & redaction

  • Wildcards for secrets (*password*, *secret*, *token*, *api_key*) are in deny_fields.
  • PII columns are masked, not denied — unless you have a specific reason to deny.
  • Free-text fields that may contain PII are denied or wrapped in a domain tool with sanitization.
  • describe_schema() returns only what policy allows. Verified by manual call.
  • No tool returns raw passwords / tokens / secrets in any code path. Verified by grep + test.
  • Joined responses respect redaction. Test with include parameter.
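
To make the deny, mask, and joined-response items concrete, here is a small sketch of redaction applied recursively to a response. The wildcard patterns are the ones listed above; the MASK_FIELDS set, the masking rule, and the function names are assumptions, not OrmAI's implementation.

```python
from fnmatch import fnmatchcase

DENY_PATTERNS = ["*password*", "*secret*", "*token*", "*api_key*"]
MASK_FIELDS = {"email", "phone"}  # hypothetical PII columns


def mask(value):
    """Hide all but the last 4 characters; short values are hidden entirely."""
    text = str(value)
    visible = text[-4:] if len(text) > 4 else ""
    return "*" * (len(text) - len(visible)) + visible


def redact(value):
    """Drop denied fields and mask PII, including inside joined records."""
    if isinstance(value, list):
        return [redact(item) for item in value]
    if not isinstance(value, dict):
        return value
    out = {}
    for key, item in value.items():
        name = key.lower()
        if any(fnmatchcase(name, pattern) for pattern in DENY_PATTERNS):
            continue  # denied: the field is absent from the response
        if name in MASK_FIELDS and item is not None:
            out[key] = mask(item)
        else:
            out[key] = redact(item)  # recurse into included relations
    return out
```

A redaction test would then fetch a record with its relations included and assert that denied keys are absent at every nesting level.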

Writes & approvals

  • require_reason=True on every write-enabled model.
  • writable_fields constrains updates to the actual fields the agent should be able to mutate.
  • Approval gates exist for high-stakes writes (price changes, role assignments, large refunds, etc.).
  • Approval queue has a documented SLA and an auto-deny on timeout.
  • Approval identity is logged with the write.
  • Dry-run is exposed as a separate tool for any write that affects > 10 rows.
  • max_writes_per_minute is set conservatively (≤ 20 unless you have measured otherwise).
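
The write rules above can be sketched as a single guard that runs before any write. This is an illustration only: the WriteGuard class, its error codes, and the sliding-window approach are assumptions, and only the names writable_fields and max_writes_per_minute come from the checklist.

```python
import time
from collections import deque


class WriteDenied(Exception):
    """Raised with a machine-readable code the agent can react to."""

    def __init__(self, code, message):
        super().__init__(message)
        self.code = code


class WriteGuard:
    def __init__(self, writable_fields, max_writes_per_minute=20, clock=time.monotonic):
        self.writable_fields = set(writable_fields)
        self.max_writes_per_minute = max_writes_per_minute
        self.clock = clock  # injectable so tests can control time
        self._recent = deque()  # timestamps of accepted writes

    def check(self, changes, reason):
        """Raise WriteDenied unless this write is allowed right now."""
        if not reason or not reason.strip():
            raise WriteDenied("reason_required", "every write needs a reason")
        extra = set(changes) - self.writable_fields
        if extra:
            raise WriteDenied("field_not_writable", f"not writable: {sorted(extra)}")
        now = self.clock()
        # Forget writes that are older than the 60-second window.
        while self._recent and now - self._recent[0] >= 60:
            self._recent.popleft()
        if len(self._recent) >= self.max_writes_per_minute:
            raise WriteDenied("write_rate_exceeded", "too many writes this minute")
        self._recent.append(now)
```

Note that this in-process counter only holds for a single app instance; with several instances the count would need a shared store.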

Budgets

  • max_scan_rows is set. This is the single most important budget.
  • statement_timeout_ms is set — both at the DB session level and via OrmAI policy.
  • max_rows is set per tool.
  • max_join_depth is set (≤ 3).
  • Per-tenant quotas exist if you serve multiple tenants from one process.
  • Budget store is Redis-backed if you run more than one app instance.
  • Budget exceeded errors return a structured suggestion the agent can recover from.
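
As an illustration of a budget error the agent can recover from, the sketch below returns a structured payload instead of a bare failure. The scan_budget_exceeded code appears elsewhere in this guide; the remaining keys and the function name are assumptions about what such a payload could contain.

```python
def check_scan_budget(estimated_rows, max_scan_rows, filterable_fields):
    """Return None when within budget, otherwise a structured error dict."""
    if estimated_rows <= max_scan_rows:
        return None
    return {
        "error": "scan_budget_exceeded",
        "estimated_rows": estimated_rows,
        "max_scan_rows": max_scan_rows,
        "retryable": True,
        # A concrete next step lets the model narrow the query and retry.
        "suggestion": "Add a filter on one of: " + ", ".join(sorted(filterable_fields)),
    }
```

The point of the suggestion field is that the model reads it on the next turn, so it should name actions the policy actually permits.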

Audit

  • Audit sink is configured (SQL, JSON log, or event stream).
  • Audit DB user can INSERT but not UPDATE; DELETE is granted only for retention cleanup.
  • Audit rows include trace IDs linked to your observability stack.
  • The long-term audit copy is protected by a hash chain or append-only storage.
  • Saved queries are documented for the security team’s most common questions.
  • Retention policy is set deliberately (90 days hot / 1 year warm / 7 years cold by default).
  • Sanitized inputs are logged, and a hash of the raw inputs is stored separately for forgery detection.
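
A hash chain for the audit copy can be sketched in a few lines. This is a generic illustration using SHA-256 from the standard library, not OrmAI's audit format; the record fields and function names are assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def audit_record(tool, sanitized_args, raw_input):
    """Log sanitized args plus a hash of the raw input, never the raw input."""
    return {
        "tool": tool,
        "args": sanitized_args,
        "raw_input_sha256": hashlib.sha256(raw_input.encode("utf-8")).hexdigest(),
    }


def entry_hash(prev_hash, record):
    """Hash the previous entry's hash together with a canonical record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(chain, record):
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"record": record, "prev": prev, "hash": entry_hash(prev, record)})


def verify(chain):
    """Return the index of the first broken entry, or None if intact."""
    prev = GENESIS
    for index, entry in enumerate(chain):
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["record"]):
            return index
        prev = entry["hash"]
    return None
```

Because each hash covers the previous one, editing or deleting any row invalidates every later entry, which is what makes tampering detectable.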

Operational

  • DEFAULT_PROD policy is the base for production policies, not DEFAULT_DEV.
  • Policy lives in version control with PR review.
  • Policy has a regression test suite that asserts which calls succeed and which fail.
  • Health check endpoint exposes OrmAI version, policy hash, audit sink status.
  • OrmAI version is pinned to a known-good release.
  • Rate-limiting is enabled in front of the agent endpoint (per IP, per session).
  • Structured logs flow to your observability stack.
  • Alerts are wired for: audit sink failures, policy denial spikes, write rate above baseline, statement timeouts.
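
One possible way to produce the policy hash for the health check is to hash a canonical serialization of the policy. The sketch assumes the policy can be represented as a JSON-serializable dict and that the caller supplies the version string and sink status; none of these function names come from OrmAI.

```python
import hashlib
import json


def policy_hash(policy):
    """Stable SHA-256 of the policy, independent of key order."""
    canonical = json.dumps(policy, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def health_payload(policy, ormai_version, audit_sink_ok):
    """Body for a health endpoint: version, policy hash, audit sink status."""
    return {
        "ormai_version": ormai_version,
        "policy_hash": policy_hash(policy),
        "audit_sink": "ok" if audit_sink_ok else "failing",
    }
```

Comparing the reported hash against the hash of the policy file in version control shows at a glance whether a deployment is running the reviewed policy.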

LLM-side hygiene

  • System prompt instructs the agent to handle structured policy errors (scan_budget_exceeded, tenant_mismatch, etc.).
  • maxSteps / max_iterations is set on the agent loop (≤ 12).
  • Tool list given to the model is the actual OrmAI-generated list, not a hand-curated subset.
  • The model is not told the tenant ID in the prompt. It comes from context only.
  • The agent’s natural-language output is logged separately from the tool audit, with its own scrubbing.
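
The step limit and structured-error handling above can be sketched as a bare agent loop. Agent frameworks typically expose this as a setting such as maxSteps or max_iterations, so the hand-written loop, the next_action and call_tool callables, and the return shape here are all assumptions made for illustration.

```python
def run_agent(next_action, call_tool, max_steps=12):
    """Drive a tool-calling loop, stopping after max_steps model turns.

    next_action(history) returns ("final", text) or ("tool", name, args).
    call_tool(name, args) returns a result dict, which may be a structured
    policy error such as {"error": "scan_budget_exceeded", ...}.
    """
    history = []
    for _ in range(max_steps):
        action = next_action(history)
        if action[0] == "final":
            return {"status": "done", "output": action[1], "steps": len(history)}
        _, name, args = action
        # Structured errors go back into the history so the model can adjust.
        history.append({"tool": name, "args": args, "result": call_tool(name, args)})
    return {"status": "step_limit_reached", "output": None, "steps": len(history)}
```

Without the limit, a model that keeps retrying a denied call will loop until something else, usually the budget or the bill, stops it.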

Compliance specifics

  • The policy file is the artifact for “what can the agent see/do?” — and it can be handed to an auditor as-is.
  • Audit log queries for the common SOC 2 / ISO controls are documented as views.
  • Data subject access request (GDPR) can be answered using the audit log: “every operation involving subject X.”
  • Data deletion propagates to audit retention rules where required.
  • Vendors with data access are listed in your DPA — including OrmAI (which is open source and runs in-process, so probably no DPA needed, but verify with your legal team).

Pre-launch verification

Run these in a staging environment with realistic data:

  1. Cross-tenant probe. From an agent acting as tenant A, try every tool with tenant_id: B in arguments. Every call should return only tenant A’s data or a structured denial, never tenant B’s data.
  2. PII probe. Query every model with db.get. Verify that masked fields come back masked and denied fields are absent.
  3. Budget probe. Issue an unbounded query. Confirm the error is structured.
  4. Write probe. Try every write tool without reason. Confirm denial.
  5. Approval probe. Trigger an approval-gated write. Confirm it enters pending state.
  6. Audit probe. Confirm every tool call wrote an audit row. Confirm denied calls also wrote one.
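
The write probe and audit probe can be automated together. The sketch below wraps two hypothetical tools in a decorator that records one audit row per call, including denied calls, and then asserts on the rows; the decorator, the tools, and the row fields are assumptions rather than OrmAI's API.

```python
import functools


class PolicyDenied(Exception):
    pass


def audited(audit_log):
    """Record exactly one audit row per tool call, allowed or denied."""
    def decorator(tool):
        @functools.wraps(tool)
        def wrapper(**kwargs):
            # Only argument names are logged here; values would be sanitized.
            row = {"tool": tool.__name__, "arg_names": sorted(kwargs), "status": "ok"}
            try:
                return tool(**kwargs)
            except PolicyDenied as exc:
                row["status"] = "denied"
                row["error"] = str(exc)
                raise
            finally:
                audit_log.append(row)  # runs for both outcomes
        return wrapper
    return decorator


def run_probe():
    """Call a read tool and a reasonless write; return the audit rows."""
    log = []

    @audited(log)
    def read_order(order_id):
        return {"id": order_id}

    @audited(log)
    def update_order(order_id, status, reason=None):
        if not reason:
            raise PolicyDenied("reason_required")
        return {"id": order_id, "status": status}

    read_order(order_id=1)
    try:
        update_order(order_id=1, status="shipped")  # write probe: no reason
    except PolicyDenied:
        pass
    return log
```

In staging, the same assertions would run against the real audit sink: the number of rows written should equal the number of tool calls made, denied ones included.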

Quarterly review

  • Re-read the policy file. Does it still match the agent’s intended capabilities?
  • Look at the audit log for the last 90 days. What’s the most common denial? Is the policy too tight or is the agent misbehaving?
  • Look at the longest-running tool calls. Are they the ones you’d expect?
  • Look at the write log. Does every write have a coherent reason?
  • Run the regression suite.
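
For the "most common denial" question, a short aggregation over exported audit rows is often enough. The row shape used here (tool, status, error keys) is an assumption; adapt it to whatever your audit sink actually stores.

```python
from collections import Counter


def top_denials(audit_rows, n=5):
    """Return the n most frequent (tool, error) pairs among denied calls."""
    counts = Counter(
        (row["tool"], row["error"])
        for row in audit_rows
        if row.get("status") == "denied"
    )
    return counts.most_common(n)
```

A denial that dominates the list is the place to start asking whether the policy is too tight or the agent is misbehaving.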
