code-review - Tags

Should Humans Still Understand Agent-Written Code? Yes — But Not Just to Verify It

GP-249 2026-07-03 · @geoffreylitt on X

Geoffrey Litt asks a sharp question for the agent era: if agents can write and verify more code by themselves, why should humans still understand the code? His answer is that understanding is not only for verification. It is how humans keep participating.

Writing Code Stopped Being the Bottleneck: The Era of Verifying Code Like a Black Box

MP-309 2026-06-17 · @rahulgs on X

Code-writing models are 'English to code' interpreters — writing code is no longer the bottleneck; reviewing and merging it safely is. Treat low-risk code as a black box and verify empirically; save line-by-line review for what can hurt you. Claude Code's creator Boris Cherny agrees.

mogu-picks claude-code agentic-engineering ai-coding

When an Agent Writes 1500 Lines at Once, That's the Warning: Cut the Feature Until You Can Actually Review It

GP-229 2026-06-16 · @mitchellh on X

Mitchell Hashimoto's blunt rule for agent coding: any diff over ~1500 lines is too big — a signal to cut the problem up. First let the agent sloppily draw an owl, then break the mess into atomic tasks, hand-massage the shape, and re-run in parallel — pushing every change below your review threshold.

shroom-picks ai-agents agent-workflow
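
A minimal sketch of how the ~1500-line rule above could be checked mechanically, assuming the diff is summarized with `git diff --numstat` (tab-separated added, deleted, and path columns). The names `REVIEW_THRESHOLD`, `changed_lines`, and `too_big_to_review` are hypothetical and do not come from the original post; the threshold is only the rough figure quoted in the summary.

```python
# Rough threshold quoted in the summary above; an assumption, meant to be tuned.
REVIEW_THRESHOLD = 1500


def changed_lines(numstat: str) -> int:
    """Sum added + deleted lines from `git diff --numstat` output text."""
    total = 0
    for row in numstat.splitlines():
        parts = row.split("\t")
        if len(parts) < 3:
            continue  # not a numstat row
        added, deleted = parts[0], parts[1]
        # Binary files are reported with "-" in both count columns; skip them.
        if added == "-" or deleted == "-":
            continue
        total += int(added) + int(deleted)
    return total


def too_big_to_review(numstat: str, threshold: int = REVIEW_THRESHOLD) -> bool:
    """True when the diff exceeds the review threshold and should be split."""
    return changed_lines(numstat) > threshold
```

One possible use is to pipe the output of `git diff --numstat main...HEAD` into `too_big_to_review` before opening a PR, and to split the work whenever it returns `True`.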

Code Got Cheap. Trusting It Did Not.

GP-230 2026-06-16 · @addyosmani on X

The 2026 data all points one way: AI pushes raw code output up about 4x, but real delivered value only rises about 10%. The gap in between is all review debt. Writing code got cheap; being sure it is right did not. Code review went from a side effect of engineering to its most leveraged front line.

shroom-picks ai software-engineering

Google's Code Review Guide: Don't Chase Perfect, Protect Code Health

GP-211 2026-05-24 · Google Engineering Practices (via @nini_incrypto_ on X)

Google Engineering Practices frames code review as code-health work, not a perfection ritual: approve CLs that improve the system, while aligning design, tests, speed, comments, and author habits around maintainability.

shroom-picks engineering-practices software-engineering

Claude Code Catches 99%+ of Bugs, Engineers Just Sanity-Check

MP-222 2026-03-29 · @bcherny on X

Boris Cherny says his team lets Claude Code find 99%+ of bugs first, then an engineer sanity-checks to make sure nothing obvious slipped through.

mogu-picks claude-code ai-workflow

Imbue Vet: The Lie Detector for Coding Agents

MP-161 2026-03-14 · @imbue_ai on X

Imbue released Vet, an open-source tool that checks whether your coding agent is being honest. It reviews conversation logs and code changes, catching agents that claim tests passed when they never ran them. Runs locally, zero telemetry, CI-ready.

vet ai-agents open-source

AI Wrote 1,000 Lines and You Just... Merged It? Simon Willison Names Agentic Development's Worst Anti-Pattern

MP-146 2026-03-09 · @simonw on X

Simon Willison's new Agentic Engineering anti-pattern hits hard: do not submit AI-generated code you have not personally verified. That is not saving time; it is stealing reviewer time. The post pairs principles with a terraform destroy horror story.

simon-willison agentic-coding simonw-agentic-patterns anti-patterns ai-agents best-practices

The Final Boss of Agentic Engineering: Killing Code Review

MP-140 2026-03-03 · @swyx on X

swyx argues the final boss of Agentic Engineering isn't writing better code — it's eliminating the human Code Review bottleneck. The SDLC is about to flip upside down.

agentic-engineering sdlc

Canva's CTO: My Engineers Wake Up and the AI Agent Already Wrote Last Night's Code

MP-93 2026-02-18 · Business Insider (Tim Paradis)

Canva's CTO says engineers write detailed instructions and AI agents execute them overnight; senior engineers now 'largely review.' Anthropic's CEO calls this the 'Centaur Phase.' Few organizations have redesigned work around AI. The startup Cora reportedly gets the output of 20 to 30 engineers from 6 people. AI improves exponentially; humans don't.

canva ai-agents overnight-coding centaur-phase dario-amodei tech-lead accenture engineering-culture productivity

33,000 Agent PRs Tell a Brutal Story: Codex Dominates, Copilot Struggles, and Your Monorepo Might Not Survive

MP-84 2026-02-16 · Drexel University / Missouri S&T (MSR 2026)

Researchers at Drexel and Missouri S&T analyzed 33,596 agent-authored GitHub PRs from 5 coding agents. The overall merge rate was 71%; by agent, Codex reached 83%, Claude Code 59%, and Copilot 43%. The rejection cause highlighted: no review. LeadDev warns that the PR flood is crushing monorepos and CI.

research agentic-coding pull-requests ci-cd monorepo codex claude-code copilot tech-lead

Self-Healing PRs — Devin Autofix Lets Humans Just Make the Final Call

GP-66 2026-02-16 · @dabit3 (Nader Dabit) on X + Cognition Blog

Cognition ships Devin Autofix: review bot comments auto-trigger fixes → CI reruns → loop until clean. Humans only step in for architecture calls. Key insight: a single agent is a tool, but agent + reviewer loop is a system — and systems compound.

devin ci-cd agent-loop self-healing cognition
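
The "agent + reviewer loop" described above can be sketched generically. This is not Cognition's implementation or API: `review_fix_loop`, its three callbacks, and the `max_rounds` cap are all hypothetical, added only to show the shape of a review, fix, rerun-CI cycle that stops when a round comes back clean.

```python
from typing import Callable, List


def review_fix_loop(
    review: Callable[[], List[str]],    # hypothetical: returns open review comments
    fix: Callable[[List[str]], None],   # hypothetical: asks the agent to address them
    ci_passes: Callable[[], bool],      # hypothetical: reruns CI and reports the result
    max_rounds: int = 5,                # assumed safety cap, not from the source
) -> bool:
    """Return True once a round has no review comments and CI is green."""
    for _ in range(max_rounds):
        comments = review()
        if comments:
            fix(comments)
            continue  # re-review after the fix
        if ci_passes():
            return True  # clean: hand off to a human for the final call
        # CI failed with no review comments: pass the failure to the fixer.
        fix(["CI failed"])
    return False  # did not converge; escalate to a human
```

Returning `False` instead of looping forever keeps a human in the path when the agent and reviewer cannot converge.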

GitHub Agent HQ: Claude, Codex, and Copilot Now Fight Side by Side in the Same PR — The Multi-Agent Era Is Here

MP-82 2026-02-15 · GitHub Blog

GitHub's Agent HQ now offers multi-agent support (Claude, Codex, Copilot) for Copilot Pro+ & Enterprise users. Run multiple AIs simultaneously in GitHub/VS Code to tackle problems from different angles. Outputs become Draft PRs. A paradigm shift for code review.

github copilot claude-code codex multi-agent developer-tools agentic-coding