ShroomDog Original

Original content by ShroomDog

17 posts

Permission Engineering — When Your AI Agent's Ceiling Isn't Intelligence, It's the Keys You Hand Over

Being a GenAI App Engineer increasingly feels like being a Permission Engineer. AI agents' capability ceiling isn't intelligence — it's how much access you're willing to grant. Every additional permission amplifies both power and risk. This piece explores why permission management is the most underrated core skill of the AI agent era.

Can AI Test Itself? — From Claude Code's Zero Tests to Self-Testing Agents

Claude Code: 512K lines of TypeScript, 64K lines of production code, zero tests. But the more interesting question isn't why Anthropic skipped tests — it's why they didn't use their own AI coding tool to write them. Static analysis, MITM proxies, cross-model testing, and the philosophical trap of asking the same brain to write the exam and grade it.

Undercover Mode Asked a Question Nobody Wants to Answer

Hidden inside Claude Code's leaked source was a ~90-line file called undercover.ts — designed to make AI commits look like human commits. This surfaces a question the industry hasn't agreed on: when AI writes your code, should anyone know?

The AI Agent Initiative Problem — When Should an Agent Act on Its Own?

You spent months building a powerful AI agent. It just sits there waiting for you to say something. That's not a technical problem — it's a design philosophy problem. From KAIROS's Heartbeat Pattern to OpenClaw's background sessions, this is about when to let your agent decide to act on its own.

Prompt Cache Economics — Why Your AI Bill Is Higher Than You Think

Prompt caching should save you 90% on token costs — but one obscure bug can silently make you pay 10x more. From DANGEROUS_uncachedSystemPromptSection to the cch=00000 billing trap hidden in Claude Code's DRM, here's why prompt engineers now need to be accountants too.

5 Bad Design Patterns from the Claude Code Source Leak

The Claude Code source leak had everyone excited about KAIROS and model codenames. But the same codebase had a 3,167-line function, zero tests, silent model downgrades, and regex emotion detection. These aren't just Anthropic's mistakes — they're AI-generated code's default failure modes.

How We Made 336 AI-Generated Posts Actually Worth Reading

gu-log had 336 AI-translated posts. We thought they were 'fine' — until we built a multi-agent scoring system and discovered 74% needed rewriting. This is the story of how we designed the eval, ran it overnight, and what we learned.

Claude Code CLI's Deep Thinking Philosophy: Why I'm Your Most Trusted AI Architect

The core philosophy of Claude Code CLI is to think first and act later. This post covers SWE-bench performance evolution, Plan Mode, Extended Thinking, Multi-Agent architecture, and WebSearch capabilities. Opus used WebSearch inside a secure Podman container to research its own latest features and community reviews, and the post includes 11 reference links.

12 Levels in 2 Days: Learning Full-Stack Quality Metrics RPG-Style with AI

A Tech Lead uses his own blog as a training ground, spending two days learning 12 quality metrics with an AI tutor using RPG-style Level-Up teaching. The metrics range from npm audit to LLM-as-Judge, and sub-agents implement everything in parallel. The real takeaway isn't the metrics but a replicable methodology for AI-assisted learning.

Sub-Agent Showdown: Claude Code vs OpenClaw — Whose Shadow Clone Jutsu Is Stronger?

Claude Code's Subagents and OpenClaw's sessions_spawn both let AI delegate work to clones, but their design philosophies couldn't be more different. One is an in-process coworker in your local dev tool; the other is a fully isolated field agent in a distributed messaging system. Full comparison across architecture, configuration, communication, tool permissions, and real-world scenarios.

Using AI to Manage AI: Building a Telegram Agent with OpenClaw

ShroomDog's internal tech talk on building a three-layer AI system: Telegram + Claude Code + VPS. Covers live demos, security architecture, the 6-phase setup journey, Auth Profile Rotation, Stealth Mode, a classic debugging detective story, cost breakdown, gotchas, and Q&A.