Interdict — the runtime layer between AI agents and Postgres

agent-native database safety

The missing runtime layer between AI agents and Postgres.

Interdict is built for the moment an autonomous agent is about to touch production data. It parses every statement, measures the real blast radius, then allows, holds, or blocks the statement before anything is committed to Postgres.

first agent-native runtime layer for SQL blast radius + undo
~0 ms added latency on the safe path
588 agent trials in the recovery/evasion study

agent → interdict → postgres

agent.run_query UPDATE users SET plan = 'free';

✕ blocked — WRITE_WITHOUT_WHERE
This UPDATE has no WHERE clause; it would rewrite
every row in users.

suggested_fix  add a WHERE that scopes the rows
retryable      true

why it exists

Database permissions can grant access. They can't measure damage.

A Postgres role can say whether someone may touch the orders table. It cannot say whether this exact statement changes one row or two million — or whether the change can be reversed without restoring a full backup and throwing away everyone else's work.

That missing layer is where real accidents live: the unscoped UPDATE, the agent that misreads its own task, the DELETE that was meant for staging. Interdict is the agent-native runtime layer for exactly that gap.

see it work

Three things you'll watch it do.

Block a dangerous write, preview a write's true size, and undo a committed one.

1 · it blocks, with a fix

agent.run_query UPDATE users SET plan='free';

blocked: true
reason: WRITE_WITHOUT_WHERE
plain: this rewrites every row
fix: add WHERE id = … or a scope

2 · it shows the blast radius

agent.run_query DELETE FROM clients
          WHERE active = true;

confirm write
blast radius:  2,300,000 rows  (precise)
reversible:    yes — undo id kept

execute? [y/N]

3 · it puts things back

agent.run_query UPDATE accounts
          SET balance=0 WHERE id=1;
✓ UPDATE 1   undo_id 3811adb4

operator revert_write(3811adb4)
✓ reverted — 1 row restored
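
The undo flow in step 3 can be approximated by saving the prior row image before the write commits. The following is a minimal sketch, not Interdict's implementation: it uses Python's built-in sqlite3 as a stand-in for Postgres, and the names UNDO_LOG and run_undoable_update are hypothetical. Only revert_write echoes a name shown above.

```python
import sqlite3
import uuid

# Hypothetical in-memory undo log: undo_id -> saved row images.
UNDO_LOG = {}

def run_undoable_update(conn, table, set_sql, params, key_col, key_val):
    """Apply an UPDATE scoped to one key value, saving the prior rows first.

    Table and column names are interpolated directly, so this sketch
    assumes they come from trusted code, never from agent input.
    """
    cur = conn.cursor()
    before = cur.execute(
        f"SELECT * FROM {table} WHERE {key_col} = ?", (key_val,)
    ).fetchall()
    columns = [d[0] for d in cur.description]
    cur.execute(
        f"UPDATE {table} SET {set_sql} WHERE {key_col} = ?",
        (*params, key_val),
    )
    changed = cur.rowcount
    conn.commit()
    undo_id = uuid.uuid4().hex[:8]
    UNDO_LOG[undo_id] = (table, key_col, columns, before)
    return changed, undo_id

def revert_write(conn, undo_id):
    """Restore the saved rows for one undo id and return how many were restored."""
    table, key_col, columns, before = UNDO_LOG.pop(undo_id)
    assignments = ", ".join(f"{c} = ?" for c in columns)
    key_idx = columns.index(key_col)
    for row in before:
        conn.execute(
            f"UPDATE {table} SET {assignments} WHERE {key_col} = ?",
            (*row, row[key_idx]),
        )
    conn.commit()
    return len(before)

# Demo data mirroring the accounts example above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES (1, 500), (2, 900)")
conn.commit()

changed, undo_id = run_undoable_update(
    conn, "accounts", "balance = ?", (0,), "id", 1
)
restored = revert_write(conn, undo_id)
```

Because only the rows touched by that one write are restored, other rows (account 2 here) keep whatever changes they received in the meantime.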

how it works

Four steps between a query and your data.

The whole decision is deterministic and runs in memory, so the safe path stays fast.

1 · Parse

A real Postgres parser, not regex, so comments, casing, aliases, and stray semicolons are read structurally, not guessed.

2 · Classify

Reads, writes, DDL, catalog access, risky functions, and dangerously broad scopes are each identified by shape.

3 · Simulate

Risky writes run inside a time-boxed throwaway transaction that is rolled back, returning the real affected-row count.

4 · Decide

Allow, hold for confirmation, or block with a fix, and record an undo path whenever the write can be safely reversed.
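
The Simulate and Decide steps can be sketched as follows. This is an illustration under stated assumptions, not Interdict's code: sqlite3 stands in for Postgres, the time box is not modeled, the has_where flag is assumed to come from the parser, and both thresholds are made-up values. The reason codes are the ones shown on this page.

```python
import sqlite3

# Hypothetical policy thresholds; the real values are not stated on this page.
CONFIRM_ABOVE_ROWS = 100
BLOCK_ABOVE_ROWS = 10_000

def simulate_affected_rows(conn, sql, params=()):
    """Run a write, read its affected-row count, then always roll it back."""
    try:
        cur = conn.execute(sql, params)
        return cur.rowcount
    finally:
        conn.rollback()

def decide(has_where, affected_rows):
    """Map a classified, simulated write to allow, hold, or block."""
    if not has_where:
        return {"action": "block", "reason": "WRITE_WITHOUT_WHERE"}
    if affected_rows > BLOCK_ABOVE_ROWS:
        return {"action": "block", "reason": "BLAST_RADIUS_EXCEEDED"}
    if affected_rows > CONFIRM_ABOVE_ROWS:
        return {"action": "hold", "reason": None}
    return {"action": "allow", "reason": None}

# Demo table with 500 rows, named after the bundled metric_sample table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metric_sample (sensor_id INTEGER)")
conn.executemany(
    "INSERT INTO metric_sample VALUES (?)", [(i,) for i in range(1, 501)]
)
conn.commit()

rows = simulate_affected_rows(
    conn, "DELETE FROM metric_sample WHERE sensor_id <= ?", (200,)
)
verdict = decide(has_where=True, affected_rows=rows)
remaining = conn.execute("SELECT COUNT(*) FROM metric_sample").fetchone()[0]
```

The row count comes from actually executing the statement, so it reflects the data rather than an estimate, while the rollback leaves the table unchanged.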

Blast-radius preview

Impact is measured against the live database, not inferred from the SQL text.

Instant undo

Reverse one bad action by id — no full restore, no collateral data loss.

Structured rejection

Agents get reason codes and fixes they can reason over without direct database credentials.

Negligible overhead

Deterministic checks run in microseconds; the benchmark reports ~0 ms end to end.
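
On the agent side, a structured rejection can be handled as plain data. The sketch below assumes the rejection arrives as a dictionary with the reason, suggested_fix, and retryable fields shown in the first example on this page. The exact wire format, the Rejection class, and the next_step helper are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rejection:
    reason: str
    suggested_fix: str
    retryable: bool

def parse_rejection(payload: dict) -> Rejection:
    """Build a Rejection from a payload, treating missing fields conservatively."""
    return Rejection(
        reason=payload["reason"],
        suggested_fix=payload.get("suggested_fix", ""),
        retryable=bool(payload.get("retryable", False)),
    )

def next_step(rejection: Rejection) -> str:
    """Retry with the suggested fix when allowed; otherwise escalate to a human."""
    if rejection.retryable:
        return f"retry: {rejection.suggested_fix}"
    return f"escalate: {rejection.reason}"

# Payload modeled on the WRITE_WITHOUT_WHERE example above.
payload = {
    "reason": "WRITE_WITHOUT_WHERE",
    "suggested_fix": "add a WHERE that scopes the rows",
    "retryable": True,
}
step = next_step(parse_rejection(payload))
```

A missing retryable field defaults to false here, so an agent escalates rather than retrying when the response is incomplete.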

start here

Get comfortable on the bundled test database.

Install Interdict, start the seeded Postgres fixture, connect Claude or Codex, then watch real agent prompts get allowed, held, undone, and blocked.

step 0

Install Interdict

Install the MCP server your agent will call before touching Postgres.

what good looks like

The interdict command is available. This is the MCP layer for Claude, Codex, Cursor, or any MCP-capable agent.

The PyPI package is interdict-db; the command users run is interdict.

pip install interdict-db

# confirm the command exists
interdict --help

how to prompt

Write normal product requests.

Do not train users to say “use Interdict.” If Interdict is connected in the chat, database work inside larger Claude/Codex tasks should route through it automatically.

how to know

Ask whether it is active.

Use /mcp in Codex, or ask the agent to call interdict_status. Active means this chat has the database safety layer connected.

how approvals work

The model cannot approve itself.

Large writes return an approval_id and wait. Approval is out-of-band with an operator token the agent should not see.
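
One way to picture the out-of-band approval is a queue that releases a held statement only when given the operator token. This is a minimal sketch with hypothetical names (ApprovalQueue, hold, approve). It does not describe how Interdict stores approvals or tokens.

```python
import hmac
import secrets

class ApprovalQueue:
    """Holds large writes until an operator presents the right token."""

    def __init__(self, operator_token: str):
        self._operator_token = operator_token
        self._pending = {}  # approval_id -> held SQL text

    def hold(self, sql: str) -> str:
        """Park a statement and return the approval_id the agent will see."""
        approval_id = secrets.token_hex(4)
        self._pending[approval_id] = sql
        return approval_id

    def approve(self, approval_id: str, token: str) -> str:
        """Release a held statement; the comparison is constant-time."""
        if not hmac.compare_digest(token.encode(), self._operator_token.encode()):
            raise PermissionError("operator token required")
        return self._pending.pop(approval_id)

# The token is generated outside the agent's context in this sketch.
operator_token = secrets.token_urlsafe(16)
queue = ApprovalQueue(operator_token)
approval_id = queue.hold("DELETE FROM metric_sample WHERE sensor_id <= 200")

# An agent that only knows the approval_id cannot release the write.
try:
    queue.approve(approval_id, "not-the-token")
    agent_denied = False
except PermissionError:
    agent_denied = True

# The operator, holding the token, can.
released = queue.approve(approval_id, operator_token)
```

The agent only ever receives the approval_id, which identifies the held write but carries no authority to release it.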

test prompts

Paste these into Claude or Codex after MCP is connected.

They use the bundled Pagila + large-table database so users can see the product behave before pointing it at anything real.

allowed read

Find actor_id 1 in the test database and summarize the row. If database access is needed, use the active database safety layer.

Expected: allowed read, rows returned, no write.

undoable write

For the demo database, update actor_id 1 by setting last_update = last_update, then show me the undo_action_id and revert it.

Expected: UPDATE 1, undo id, then successful revert.

held write

Clean up metric_sample rows where sensor_id <= 200. Do not approve anything. Tell me the blast radius and approval_id if it is held.

Expected: held for approval, because the impact either cannot be safely bounded or exceeds the confirmation policy.

blocked write

Delete rows from metric_sample where sensor_id <= 2000 and explain exactly why the safety layer blocked or held it.

Expected: blocked for BLAST_RADIUS_EXCEEDED; no rows deleted.

the numbers

What it measures, measured locally.

agent trials: 588

Closed-loop recovery/evasion study across 3 tasks, 4 conditions, and 3 Claude model families.

denial trigger rate: 99%

Bulk-request tasks reliably hit the guardrail instead of silently executing broad writes.

automated tests: 311

Parser behavior, policy, simulation, audit redaction, undo, MCP enforcement, and terminal mode.

red-corpus false negatives: 0%

Every statement in the dangerous (red) corpus was blocked in local runs; the safe (green) corpus produced 0% false positives as well.

where it stands

The first agent-native safety layer for database writes.

Not a dashboard. Not an ORM. Interdict is a runtime guardrail placed directly between AI agents and Postgres, built around deterministic checks, latency budgets, auditability, and fail-closed writes.
