When something works twice across builds, it becomes a checklist. When it works five times, it earns a name. Below are the frameworks I actually run before AI-generated code reaches a branch — not aspirational ones.
Current · stable
5-Check
A pre-merge check for AI-generated code. Trust, verify, override before merge. Five gates between "agent finished the task" and "this lands on main".
v1 · stable
AI lets you ship code faster than you can read it. The 5-Check is the part where you read it.
01
Task list
Before reviewing a diff, reconcile what was actually agreed. Did the agent stick to the brief, or did it solve an adjacent problem? If the task expanded, did it expand for a good reason? This is the cheapest place to catch scope drift.
Does the diff match the task that was given? If not, is the divergence improvement or drift?
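One way to make the scope question mechanical is to write down, before the agent starts, which paths the task is allowed to touch, then compare that against the changed files. The sketch below assumes you feed it the output of something like `git diff --name-only main...HEAD`. The function name `out_of_scope`, the example paths, and the glob patterns are all hypothetical.

```python
from fnmatch import fnmatch


def out_of_scope(changed_files, allowed_patterns):
    """Return changed paths that match none of the task's allowed glob patterns."""
    return [
        path
        for path in changed_files
        if not any(fnmatch(path, pattern) for pattern in allowed_patterns)
    ]


# Hypothetical task: "fix pagination in the orders API".
changed = [
    "api/orders/pagination.py",
    "tests/api/test_pagination.py",
    "package.json",
]
allowed = ["api/orders/*", "tests/api/*"]

drift = out_of_scope(changed, allowed)
print(drift)  # ['package.json']
```

A non-empty result is not automatically a rejection. It is the list of files where you ask whether the divergence is improvement or drift.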
02
Review commands
What did the agent actually run? Shell history, file writes, git ops, network calls. If it touched things you didn't expect — package managers, env files, secrets paths — that's a signal before it's a diff.
Any commands the agent ran that weren't necessary for the task?
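If your agent tooling can export the commands it ran as plain strings, a small watchlist pass can surface the ones worth a second look. This is a rough sketch, not a complete policy: the categories come from the list above, while the regexes, the `WATCHLIST` name, and the sample commands are assumptions to tune for your own stack.

```python
import re

# Assumed watchlist: patterns are illustrative, not exhaustive.
WATCHLIST = {
    "package manager": re.compile(
        r"\b(npm|pnpm|yarn|pip|brew|apt(-get)?)\s+(install|add|remove|uninstall)\b"
    ),
    "env file": re.compile(r"(^|[\s/])\.env(\.\w+)?\b"),
    "secrets path": re.compile(r"(\.ssh/|\.aws/|\bsecrets?/)"),
}


def flag_commands(commands):
    """Return (command, reason) pairs for commands that hit the watchlist."""
    flagged = []
    for cmd in commands:
        for reason, pattern in WATCHLIST.items():
            if pattern.search(cmd):
                flagged.append((cmd, reason))
    return flagged


# Hypothetical session log.
session = [
    "pytest tests/api",
    "npm install left-pad",
    "cat .env",
    "git status",
]

for cmd, reason in flag_commands(session):
    print(f"{reason}: {cmd}")
```

A hit only means "explain why this was needed for the task". Plenty of flagged commands will turn out to be legitimate.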
03
Security check
Secrets in logs, expanded auth scopes, new outbound dependencies, surface area changes. AI generates plausible patterns from training data — and a lot of plausible auth code is wrong. Treat every new credential touchpoint with suspicion.
What's the new attack surface introduced by this diff? Who owns it?
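A crude first pass at the "secrets" part of this check is to scan only the lines a diff adds. The sketch below uses three heuristic patterns that are my own assumptions, and the sample diff is made up. It does not replace a dedicated secret scanner or an actual read of the auth code.

```python
import re

# Heuristic patterns only; expect both false positives and misses.
SECRET_PATTERNS = {
    "AWS access key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private key block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "hardcoded credential": re.compile(
        r"\b(api_?key|secret|token|password)\b\s*[:=]\s*['\"][^'\"]{8,}['\"]",
        re.IGNORECASE,
    ),
}


def scan_added_lines(diff_text):
    """Return (line_number, label) for added diff lines that look like secrets."""
    findings = []
    for number, line in enumerate(diff_text.splitlines(), start=1):
        # Only added lines matter; skip the "+++ b/file" header.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for label, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((number, label))
    return findings


# Made-up diff with a fake key.
sample_diff = """\
--- a/config.py
+++ b/config.py
@@ -1,2 +1,3 @@
 DEBUG = False
+API_KEY = "not-a-real-key-123456"
-TIMEOUT = 5
+TIMEOUT = 10
"""

print(scan_added_lines(sample_diff))  # [(5, 'hardcoded credential')]
```

An empty result says nothing about expanded auth scopes or new outbound dependencies. Those still need a human reading the diff.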
04
Manual testing
Run the unhappy path by hand. Pass it a bad input, a slow network, a permission failure. AI-written code usually passes the tests it wrote for itself; manual testing is where you discover the cases it never considered.
What did the agent not test? Test that.
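Running the unhappy path by hand can be as small as a throwaway script that feeds the code each failure and records what happens instead of stopping at the first crash. In this sketch, `read_port` stands in for a hypothetical agent-written function, and `probe` plus the three cases are illustrative names, not part of any framework.

```python
def probe(func, cases):
    """Run func against each unhappy-path case and record the outcome."""
    results = {}
    for name, make_call in cases.items():
        try:
            results[name] = ("returned", make_call(func))
        except Exception as exc:
            results[name] = ("raised", type(exc).__name__)
    return results


def read_port(fetch_config, key):
    """Hypothetical agent-written function under test."""
    return int(fetch_config(key))


def slow_network(key):
    # Simulates a dependency that times out.
    raise TimeoutError("simulated slow network")


def denied(key):
    # Simulates a dependency that refuses access.
    raise PermissionError("simulated permission failure")


cases = {
    "bad input": lambda f: f(lambda key: "not-a-number", "port"),
    "slow network": lambda f: f(slow_network, "port"),
    "permission failure": lambda f: f(denied, "port"),
}

report = probe(read_port, cases)
for name, outcome in report.items():
    print(name, outcome)
```

Here every case ends in an unhandled exception. Whether that is acceptable is the judgment call; the script only makes sure you looked.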
05
PR + CI review
Final pass. Read the diff like you didn't write it. Check what CI is checking, and what it isn't. If your CI doesn't catch the failure modes the agent introduced — fix CI before fixing the PR.
If this PR were submitted by a contractor, would you merge it?
Current · evolving
AI-Assisted Development Loop
The outer loop the 5-Check sits inside. Task boundaries, agent pairing, the 5-Check, and a feedback step that updates the loop itself.
v2 · evolving
01 Frame the task small enough to verify
02 Pair with the agent, watch its commands
03 Run the 5-Check before any merge
04 Merge and observe in production
05 Capture what broke into the next loop
The loop is the part most teams skip. They optimize the agent and the merge gate but never close the feedback loop, so the same failure mode keeps slipping through. The fifth step matters more than the first.
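One lightweight way to capture what broke is to log each production escape alongside the 5-Check gate that should have caught it, then see which gate leaks most. The `Escape` record, the `weakest_gates` helper, and the sample entries below are hypothetical. Treat this as a sketch of the idea, not a prescribed format.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Escape:
    """One failure that reached production after passing the 5-Check."""
    summary: str
    missed_by: str  # the gate that should have caught it


def weakest_gates(escapes):
    """Return (gate, count) pairs, most frequently missed gate first."""
    return Counter(e.missed_by for e in escapes).most_common()


# Hypothetical log entries.
log = [
    Escape("timeout not handled on export", "Manual testing"),
    Escape("token written to debug log", "Security check"),
    Escape("empty list crashes pagination", "Manual testing"),
]

print(weakest_gates(log))  # [('Manual testing', 2), ('Security check', 1)]
```

The gate at the top of the list is the one to tighten before the next loop.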
Drafting
What's next.
Frameworks I'm using but haven't written down yet — they'll land here when they survive another month of real work.
Drafting · MCP server boundary checklist · What belongs in the MCP layer vs. the app it wraps
Drafting · Parallel-agent worktree pattern · Running multiple Claude Code sessions on one repo, safely
Notes · Token-budget design for agent flows · Designing flows so you don't burn 10k tokens before message one