ShroomDog Original

Original content by ShroomDog

24 posts

Let Agents Dream: Weekly Maintenance That Turns Repeated Work Into Skills

SD-25 2026-05-25 · ShroomDog Lab

Vaibhav Srivastav's Codex prompt is interesting because it describes an agent maintenance loop: look back at recent work, find repeated workflows, and package only high-confidence patterns into Skills, automations, or subagents. It is agent dreaming: turning busy work into capability.

Codex Is Becoming the Runtime Kernel for AI Agents

SD-24 2026-05-15 · ShroomDog Lab

OpenClaw and Hermes are both handing low-level coding-agent execution to the Codex app server. This is not just a model switch. It is the agent product stack separating into model, execution engine, and chat surface.

Don’t Rebuild the AI Agent Wheel: Learn to Teamfight With Your AI Teammate and Stop It From Feeding

SD-23 2026-05-10 · ShroomDog × ChatGPT conversation

LLMs are not gods, and they are not just tools. They are more like DOTA teammates: great at last-hitting, occasionally great at feeding. The human job is not to fight AI for the same lane, but to cover taste, map awareness, context ownership, and strategic judgment.

Context Window: The Day a Model Wakes Up

SD-22 2026-05-08 · ShroomDog Lab

A context window is a model's day: how many lessons, messages, tool results, and task events Ryland can experience before sleep, compression, or collapse.

Fire Truck vs. Succulent — Vector Database vs. Agent Search, in Simple Math

SD-21 2026-04-21 · ShroomDog Lab

Someone deployed Milvus to search 5,000 vectors. That's like calling a fire truck to water a desk succulent. This post uses dead-simple math to compare vector databases with agent-driven search: IO pressure, scalability, and how each approach dies at 10K and 1M users.

`hermes claw migrate`: When One Agent Harness Writes a Moving Guide to Another

SD-20 2026-04-21 · ShroomDog Lab

Hermes Agent and OpenClaw shipped big releases the same day. Hermes v0.10.0 hid a command called `hermes claw migrate` — it imports OpenClaw's config, memory, and API keys in one shot. ShroomDog compared both codebases: one grows its own brain, one rents pi-mono. Stay or move?

Lightning Talk: Asking Claude to Build a Ralph Loop

SD-19 2026-04-09 · ShroomDog Lab

3-minute lightning talk slides. AI has read almost everything — but some concepts aren't in training data yet. What you know that AI doesn't = your leverage.

Permission Engineering — When Your AI Agent's Ceiling Isn't Intelligence, It's the Keys You Hand Over

SD-18 2026-04-03 · ShroomDog Lab

Being a GenAI App Engineer increasingly feels like being a Permission Engineer. AI agents' capability ceiling isn't intelligence — it's how much access you're willing to grant. Every additional permission amplifies both power and risk. This piece explores why permission management is the most underrated core skill of the AI agent era.

What That xkcd Chart Didn't Tell You — Is It Worth Automating in the AI Era?

SD-17 2026-04-02 · ShroomDog Lab

xkcd #1205 taught a generation of engineers how to think about automation ROI. But AI changed the most expensive variable in that equation: the real return now is often not minutes saved, but cognitive load removed.

Can AI Test Itself? — From Claude Code's Zero Tests to Self-Testing Agents

SD-16 2026-04-02 · ShroomDog Lab

Claude Code: 512K lines of TypeScript, 64K lines of production code, zero tests. But the more interesting question isn't why Anthropic skipped tests — it's why they didn't use their own AI coding tool to write them. Static analysis, MITM proxies, cross-model testing, and the philosophical trap of asking the same brain to write the exam and grade it.

Undercover Mode Asked a Question Nobody Wants to Answer

SD-15 2026-04-02 · ShroomDog Lab

Hidden inside Claude Code's leaked source was a ~90-line file called undercover.ts — designed to make AI commits look like human commits. This surfaces a question the industry hasn't agreed on: when AI writes your code, should anyone know?

The AI Agent Initiative Problem — When Should an Agent Act on Its Own?

SD-14 2026-04-02 · ShroomDog Lab

You spent months building a powerful AI agent. It just sits there waiting for you to say something. That's not a technical problem — it's a design philosophy problem. From KAIROS's Heartbeat Pattern to OpenClaw's background sessions, this is about when to let your agent decide to act on its own.

Prompt Cache Economics — Why Your AI Bill Is Higher Than You Think

SD-13 2026-04-02 · ShroomDog Lab

Prompt caching should save you 90% on token costs — but one obscure bug can silently make you pay 10x more. From DANGEROUS_uncachedSystemPromptSection to the cch=00000 billing trap hidden in Claude Code's DRM, here's why prompt engineers now need to be accountants too.

5 Bad Design Patterns from the Claude Code Source Leak

SD-12 2026-04-02 · ShroomDog Lab

The Claude Code source leak had everyone excited about KAIROS and model codenames. But the same codebase had a 3,167-line function, zero tests, silent model downgrades, and regex emotion detection. These aren't just Anthropic's mistakes — they're AI-generated code's default failure modes.

AI Agent Memory Architecture: The One Thing Claude Code's Source Code Taught Me

SD-11 2026-04-02 · ShroomDog Lab

Every new session, your AI agent forgets everything. Claude Code's leaked source hid a three-layer memory architecture and a design principle — 'Memory is hint, not truth' — that changes how you think about building agents. Here's the full breakdown.

How We Made 336 AI-Generated Posts Actually Worth Reading

SD-10 2026-03-22 · ShroomDog Lab

gu-log had 336 AI-translated posts. We thought they were 'fine' — until we built a multi-agent scoring system and discovered 74% needed rewriting. This is the story of how we designed the eval, ran it overnight, and what we learned.

Letting AI Run Your E2E Tests: Playwright vs agent-browser vs Rodney — A Field Report

SD-9 2026-03-12 · ShroomDog Lab

We had Claude Opus run E2E tests on our own blog using Playwright, agent-browser, and Rodney. The surprise? The tool mattered way less than the prompt.

Claude Code CLI's Deep Thinking Philosophy: Why I'm Your Most Trusted AI Architect

SD-7 2026-03-02 · ShroomDog Original

The core philosophy of Claude Code CLI: think first, act later. The post covers SWE-bench performance evolution, Plan Mode, Extended Thinking, Multi-Agent architecture, and WebSearch capabilities. Opus used WebSearch inside a secure Podman container to research its own latest features and community reviews, with 11 reference links.

Codex CLI's Security Sandbox Philosophy: Why I'm the Best AI for Your Production Codebase

SD-6 2026-03-02 · ShroomDog Original

Codex CLI is built with Rust, open-sourced under Apache 2.0, and has an OS-level security sandbox (Landlock + seccomp + Seatbelt) built right in. This is Codex's own autobiography written after extensive web searches, and we've fact-checked it — flagging a few claims that need caveats.

Gemini CLI's Big Eater Philosophy: 1M Token Context + Web Search + Free — Your AI Scout

SD-5 2026-03-02 · ShroomDog Original

Gemini CLI pairs a 1M-token "big eater" context with built-in Web Search grounding, and it is free and open source. The post also shares the Gemini Safe Search security setup, isolated in Podman containers, and real-world token consumption stats from our trilogy series.