📚 ShroomDog Picks

Long-form articles, translated and explained

204 posts


Autobrowse: What Browser Agents Really Lack Is Not Brains, but Handoff-Ready Memory

Kyle Jeong introduces Browserbase's internal Autobrowse: browser agents repeatedly execute tasks on real websites, study their own traces, and graduate successful paths into readable, auditable, reusable skills.

Ghostty Is Leaving GitHub: When User #1299 — an 18-Year True Believer — Says 'I Can't Do This Anymore'

Mitchell Hashimoto — HashiCorp co-founder, Vagrant author, GitHub user #1299 — announces that Ghostty is leaving GitHub. He's been on GitHub for 18 years. He committed code on his honeymoon while his wife was asleep. What finally pushed him out wasn't a philosophical fight — it was a one-month journal where he marked an X every time GitHub broke his workflow, plus a 2-hour PR review block from a GitHub Actions outage on the day he wrote the post.

OpenClaw Automation: Task Flow Is the Multi-Step Workflow Layer

OpenClaw's automation docs put scheduled work, background tasks, Heartbeat, Hooks, Standing Orders, Task Flow, and related mechanisms on the same map. Task Flow is the layer for multi-step flow state, sync, and revision tracking; this piece reads those boundaries conservatively.

Building Products for Agents — A Ramp PM Starts With a Convenience-Store Spoon

After Ramp's MCP grew 10x in WAU and Salesforce shipped Headless 360, PM Teddy says UI isn't dead, but 80% of software is flipping to agents. The piece starts from one detail (why Notion's MCP feels orders of magnitude better than Slack's) and pulls the whole new architecture into view.

The three bugs behind Claude Code feeling dumber in April — Anthropic's own postmortem

Anthropic just published a postmortem confirming Claude Code really did feel dumber this past month: not one bug, but three independent changes rolling out on different schedules that stacked into what looked like a broad regression. The three were a default reasoning effort demotion (high→medium), a cache optimization that dropped thinking history every turn, and a system prompt tuning for Opus 4.7 verbosity that cost 3% on evals. All three were fixed by April 20, and usage limits were reset for every subscriber.

The Honest Multi-Agent Report, 10 Months Later — Cognition's Walden: Keep Writes Single-Threaded, Let Other Agents Pour In Intelligence

Ten months after writing "Don't Build Multi-Agents", Cognition's Walden Yan returns with three patterns that actually ship: Devin Review's clean-context loop (2 bugs per PR, ~58% severe), cross-frontier smart friends, and manager Devin's map-reduce-and-manage. One principle runs through all three: writes stay single-threaded; other agents contribute intelligence, not actions.

Why Production Agents Converge on MCP — Anthropic's Breakdown of API vs CLI vs MCP

Anthropic's guide to connecting production agents to real systems. When agents move to the cloud, API, CLI, and MCP all ship, but only MCP compounds. The guide uses Cloudflare's MCP server (2 tools, ~2,500 endpoints, ~1K tokens) as the benchmark for remote-first design, intent-grouped tools, and production auth.

Every Agent Needs a Bouncer: Brex Open-Sources CrabTrap, an LLM-Judge HTTP Proxy for Production Agents

Brex open-sources CrabTrap, an HTTP proxy that intercepts every outbound agent request. Static rules dispatch known patterns in microseconds; the long tail goes to an LLM judge. Policies are inferred from traffic, not hand-written. Three prod surprises: inferred policies beat written ones, the LLM judge fires on <3% of requests, and the audit log became agent observability.

One `message Romain` prompt runs the whole workflow — OpenAI DevX demos Codex Chronicle, but the costs the tweet skipped matter too

OpenAI DevX's Dominik Kundel says: now that Codex has memories, plugins, and the newly-dropped Chronicle, he no longer packages context for AI. One line, 'sync docs + message Romain', reads a Google Doc, edits markdown, opens a PR, and DMs the right person on Slack. Very nice. But the three costs written into official Chronicle docs were not in the tweet: macOS screen-recording permission, memories stored unencrypted on device, and amplified prompt injection risk. Chronicle is a screen-recording agent, not a harmless booster.