code-review
8 articles
Claude Code Catches 99%+ of Bugs, Engineers Just Sanity-Check
Boris Cherny says his team lets Claude Code find 99%+ of bugs first, then an engineer sanity-checks to make sure nothing obvious slipped through.
Imbue Vet: The Lie Detector for Coding Agents
Imbue released Vet, an open-source tool that checks whether your coding agent is being honest. It reviews conversation logs and code changes, catching agents that claim tests passed when they never ran them. Runs locally, zero telemetry, CI-ready.
AI Wrote 1,000 Lines and You Just... Merged It? Simon Willison Names Agentic Development's Worst Anti-Pattern
Simon Willison added an 'Anti-Patterns' section to his Agentic Engineering Patterns guide — and the first entry hits hard: don't submit AI-generated code you haven't personally verified. You're not saving time, you're stealing it from your reviewer. This post covers his principles, what a good agentic PR looks like, and a real terraform destroy horror story.
The Final Boss of Agentic Engineering: Killing Code Review
swyx argues the final boss of Agentic Engineering isn't writing better code — it's eliminating the human Code Review bottleneck. The SDLC is about to flip upside down.
Canva's CTO: My Engineers Wake Up and the AI Agent Already Wrote Last Night's Code
Canva CTO: engineers write detailed instructions, AI agents execute overnight. Senior engineers now 'largely review.' Anthropic CEO calls this 'Centaur Phase.' Few orgs redesigned work for AI. Cora startup achieved 20-30 eng output with 6 people. AI improves exponentially, humans don't.
33,000 Agent PRs Tell a Brutal Story: Codex Dominates, Copilot Struggles, and Your Monorepo Might Not Survive
Drexel/Missouri S&T analyzed 33,596 agent-authored GitHub PRs from 5 coding agents. Overall merge rate: 71%. Codex: 83%, Claude Code: 59%, Copilot: 43%. Rejection cause: no review. LeadDev warns PR flood is crushing monorepos/CI.
Self-Healing PRs — Devin Autofix Lets Humans Just Make the Final Call
Cognition ships Devin Autofix: review bot comments auto-trigger fixes → CI reruns → loop until clean. Humans only step in for architecture calls. Key insight: a single agent is a tool, but agent + reviewer loop is a system — and systems compound.
GitHub Agent HQ: Claude, Codex, and Copilot Now Fight Side by Side in the Same PR — The Multi-Agent Era Is Here
GitHub's Agent HQ now offers multi-agent support (Claude, Codex, Copilot) for Copilot Pro+ & Enterprise users. Run multiple AIs simultaneously in GitHub/VS Code to tackle problems from different angles. Outputs become Draft PRs. A paradigm shift for code review.