agent.run_query UPDATE users SET plan='free';
blocked: true
reason: WRITE_WITHOUT_WHERE
plain: this rewrites every row
fix: add WHERE id = … or a scope
agent-native database safety
The missing runtime layer between AI agents and Postgres.
Interdict is built for the moment an autonomous agent is about to touch production data. It parses every statement, measures the real blast radius, then allows, holds, or blocks before Postgres is touched.
- first agent-native runtime layer for SQL blast radius + undo
- ~0 ms added latency on the safe path
- 588 agent trials in the recovery/evasion study
why it exists
Database permissions can grant access. They can't measure damage.
A Postgres role can say whether someone may touch the orders table. It cannot say whether this exact statement changes one row or two million — or whether the change can be reversed without restoring a full backup and throwing away everyone else's work.
That missing layer is where real accidents live: the unscoped UPDATE, the agent that misreads its own task, the DELETE that was meant for staging. Interdict is the agent-native runtime layer for exactly that gap.
see it work
Three things you'll watch it do.
Block a dangerous write, preview a write's true size, and undo a committed one.
agent.run_query DELETE FROM clients
WHERE active = true;
confirm write
blast radius: 2,300,000 rows (precise)
reversible: yes — undo id kept
execute? [y/N]
agent.run_query UPDATE accounts
SET balance=0 WHERE id=1;
✓ UPDATE 1 undo_id 3811adb4
operator revert_write(3811adb4)
✓ reverted — 1 row restored
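The undo demo above can be pictured as a before-image log. This is only a sketch of the general idea, not Interdict's mechanism, which this page does not describe. SQLite (from the Python standard library) stands in for Postgres so the example runs anywhere. The names undo_log and apply_update_with_undo are hypothetical, and revert_write here is an illustrative stand-in that only borrows the operator command's name.

```python
import sqlite3
import uuid

# Hypothetical in-memory undo log: undo_id -> list of (id, balance) before-images.
undo_log = {}


def apply_update_with_undo(conn, account_id, new_balance):
    """Save the row's prior state, apply the UPDATE, and return an undo id."""
    before = conn.execute(
        "SELECT id, balance FROM accounts WHERE id = ?", (account_id,)
    ).fetchall()
    conn.execute(
        "UPDATE accounts SET balance = ? WHERE id = ?", (new_balance, account_id)
    )
    conn.commit()
    undo_id = uuid.uuid4().hex[:8]
    undo_log[undo_id] = before
    return undo_id


def revert_write(conn, undo_id):
    """Restore only the rows captured for this undo id; return how many."""
    before = undo_log.pop(undo_id)
    conn.executemany(
        "UPDATE accounts SET balance = ? WHERE id = ?",
        [(balance, row_id) for row_id, balance in before],
    )
    conn.commit()
    return len(before)
```

Because the revert touches only the captured rows, writes made to other rows in the meantime are left alone, which is the property the demo describes as avoiding a full restore.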
how it works
Four steps between a query and your data.
The whole decision is deterministic and runs in memory, so the safe path stays fast.
1. Parse
A real Postgres parser, not regex, so comments, casing, aliases, and stray semicolons are read structurally, not guessed.
2. Classify
Reads, writes, DDL, catalog access, risky functions, and dangerously broad scopes are each identified by shape.
3. Simulate
Risky writes run inside a time-boxed throwaway transaction that is rolled back, returning the real affected-row count.
4. Decide
Allow, hold for confirmation, or block with a fix, and record an undo path whenever the write can be safely reversed.
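The Decide step can be sketched as a pure function over what the earlier steps produce. This is a minimal illustration under stated assumptions, not Interdict's implementation. The Classified and Decision types, the decide function, both thresholds, and the second fix string are hypothetical. Only the reason codes WRITE_WITHOUT_WHERE and BLAST_RADIUS_EXCEEDED appear on this page.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Verdict(Enum):
    ALLOW = "allow"
    HOLD = "hold"
    BLOCK = "block"


@dataclass(frozen=True)
class Classified:
    """Hypothetical result of the Parse and Classify steps."""
    kind: str        # "read" or "write"
    has_where: bool  # False for an unscoped UPDATE or DELETE


@dataclass(frozen=True)
class Decision:
    verdict: Verdict
    reason: Optional[str] = None
    fix: Optional[str] = None


def decide(stmt: Classified, affected_rows: Optional[int],
           hold_above: int = 1_000, block_above: int = 100_000) -> Decision:
    """Map a classified statement and its simulated row count to a verdict.

    Both thresholds are made-up defaults for illustration.
    """
    if stmt.kind == "read":
        return Decision(Verdict.ALLOW)
    if not stmt.has_where:
        return Decision(Verdict.BLOCK, "WRITE_WITHOUT_WHERE",
                        "add WHERE id = ... or a scope")
    if affected_rows is None:
        # The simulation could not bound the impact, so wait for confirmation.
        return Decision(Verdict.HOLD)
    if affected_rows > block_above:
        # Hypothetical fix text; the page shows only the reason code.
        return Decision(Verdict.BLOCK, "BLAST_RADIUS_EXCEEDED",
                        "narrow the WHERE clause")
    if affected_rows > hold_above:
        return Decision(Verdict.HOLD)
    return Decision(Verdict.ALLOW)
```

Keeping the decision free of I/O is what makes it deterministic: the same classified statement and row count always produce the same verdict.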
Blast-radius preview
Impact is measured against the live database, not inferred from the SQL text.
Instant undo
Reverse one bad action by id — no full restore, no collateral data loss.
Structured rejection
Agents get reason codes and fixes they can reason over without direct database credentials.
Negligible overhead
Deterministic checks run in microseconds; the benchmark reports ~0 ms end to end.
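The blast-radius preview rests on one idea: run the write, read the affected-row count, and roll back. The sketch below shows that idea with SQLite from the Python standard library standing in for Postgres, so it is an analogy and not Interdict's code. The function name simulate_affected_rows is hypothetical, and the time box is omitted (in Postgres, the statement_timeout setting is one way to bound a statement's run time).

```python
import sqlite3


def simulate_affected_rows(conn, sql, params=()):
    """Execute a write, capture its row count, and always roll it back.

    Assumes the connection has no uncommitted work of its own, because the
    rollback discards everything in the open transaction.
    """
    try:
        cursor = conn.execute(sql, params)
        return cursor.rowcount
    finally:
        conn.rollback()


# Usage on a throwaway in-memory table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clients (id INTEGER PRIMARY KEY, active INTEGER)")
conn.executemany(
    "INSERT INTO clients (id, active) VALUES (?, ?)",
    [(1, 1), (2, 1), (3, 0), (4, 1), (5, 0)],
)
conn.commit()

would_delete = simulate_affected_rows(
    conn, "DELETE FROM clients WHERE active = ?", (1,)
)
remaining = conn.execute("SELECT COUNT(*) FROM clients").fetchone()[0]
print(would_delete, remaining)  # 3 rows would be deleted; all 5 are still there
```

The count comes from actually executing the statement, so it reflects the data as it is now and not an estimate derived from the SQL text.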
start here
Get comfortable on the bundled test database.
Install Interdict, start the seeded Postgres fixture, connect Claude or Codex, then watch real agent prompts get allowed, held, undone, and blocked.
step 0
Install Interdict
Install the MCP server your agent will call before touching Postgres.
After installation, the interdict command is available. It is the MCP layer for Claude, Codex, Cursor, or any other MCP-capable agent.
The PyPI package is interdict-db; the command users run is interdict.
pip install interdict-db
# confirm the command exists
interdict --help
Write normal product requests.
Do not train users to say “use Interdict.” If Interdict is connected in the chat, database work inside larger Claude/Codex tasks should route through it automatically.
Ask whether it is active.
Use /mcp in Codex, or ask the agent to call interdict_status. Active means this chat has the database safety layer connected.
The model cannot approve itself.
Large writes return an approval_id and wait. Approval is out-of-band with an operator token the agent should not see.
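The out-of-band approval rule can be sketched as a queue that releases a held statement only when it is given a secret the agent never receives. This is an illustration under assumptions, not Interdict's implementation. The ApprovalQueue class and its methods are hypothetical. Only the approval_id and the operator token come from the text above.

```python
import hmac
import secrets


class ApprovalQueue:
    """Holds large writes until an operator approves them with a token."""

    def __init__(self, operator_token: str):
        # The token lives on the operator side and is never returned to the agent.
        self._operator_token = operator_token.encode()
        self._pending = {}

    def hold(self, sql: str) -> str:
        """Park a statement and return the approval_id the agent sees."""
        approval_id = secrets.token_hex(4)
        self._pending[approval_id] = sql
        return approval_id

    def approve(self, approval_id: str, token: str) -> str:
        """Release the held statement only if the operator token matches."""
        # compare_digest avoids leaking the token through timing differences.
        if not hmac.compare_digest(token.encode(), self._operator_token):
            raise PermissionError("operator token required")
        return self._pending.pop(approval_id)
```

The agent can relay the approval_id to a human, but knowing the id alone is not enough to release the write.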
test prompts
Paste these into Claude or Codex after MCP is connected.
They use the bundled Pagila + large-table database so users can see the product behave before pointing it at anything real.
Find actor_id 1 in the test database and summarize the row. If database access is needed, use the active database safety layer.
Expected: allowed read, rows returned, no write.
For the demo database, update actor_id 1 by setting last_update = last_update, then show me the undo_action_id and revert it.
Expected: UPDATE 1, undo id, then successful revert.
Clean up metric_sample rows where sensor_id <= 200. Do not approve anything. Tell me the blast radius and approval_id if it is held.
Expected: held for approval when impact cannot be safely bounded or exceeds confirmation policy.
Delete rows from metric_sample where sensor_id <= 2000 and explain exactly why the safety layer blocked or held it.
Expected: blocked for BLAST_RADIUS_EXCEEDED; no rows deleted.
the numbers
What it measures, measured locally.
Closed-loop recovery/evasion study across 3 tasks, 4 conditions, and 3 Claude model families.
Bulk-request tasks reliably hit the guardrail instead of silently executing broad writes.
Parser behavior, policy, simulation, audit redaction, undo, MCP enforcement, and terminal mode.
Dangerous SQL was fully blocked locally; green-corpus false positives were 0% too.
where it stands
The first agent-native safety layer for database writes.
Not a dashboard. Not an ORM. Interdict is a runtime guardrail placed directly between AI agents and Postgres, built around deterministic checks, latency budgets, auditability, and fail-closed writes.