agent - Tags - gu-log

Let Agents Dream: Weekly Maintenance That Turns Repeated Work Into Skills

SD-25 2026-05-25 · ShroomDog Lab

Vaibhav Srivastav's Codex prompt is interesting because it describes an agent maintenance loop: look back at recent work, find repeated workflows, and package only high-confidence patterns into Skills, automations, or subagents. It is agent dreaming: turning busywork into capability.

OpenAI's Codex Goals Guide: Agents Should Not Finish by Vibes

SP-208 2026-05-20 · OpenAI Cookbook

OpenAI's Cookbook frames Codex Goals as a thread-scoped completion contract: the objective persists, but completion must be checked against evidence. This post fills in the official spec angle around SP-192, SP-197, and SP-207.

shroom-picks codex ai-engineering

Codex Goal Mode Isn't Magic: Loops Need a Finish Line, Tests, and Memory

SP-197 2026-05-12 · @ChrisHayduk on X

Codex `/goal` is not a wish machine. Chris Hayduk's real point is engineering discipline: give the agent a measurable finish line, a fast feedback loop, and Markdown files that work as long-term memory.

shroom-picks codex workflow

Don’t Rebuild the AI Agent Wheel: Learn to Teamfight With Your AI Teammate and Stop It From Feeding

SD-23 2026-05-10 · ShroomDog × ChatGPT conversation

LLMs are not gods, and they are not just tools. They are more like DOTA teammates: great at last-hitting, occasionally great at feeding. The human job is not to fight AI for the same lane, but to cover taste, map awareness, context ownership, and strategic judgment.

shroomdog-original ai-collaboration llm mental-model

HTML Is Not Prettier Markdown, but a Way to Bring People Back Into the Agent Loop

SP-194 2026-05-09 · @trq212 on X

Thariq explains why HTML is replacing Markdown in Claude Code workflows: not as prettier output, but as readable, operable, shareable artifacts that keep humans inside the agent decision loop.

shroom-picks html claude-code

Context Window: The Day a Model Wakes Up

SD-22 2026-05-08 · ShroomDog Lab

A context window is a model's day: how many lessons, messages, tool results, and task events Ryland can experience before sleep, compression, or collapse.

shroomdog-original context-window llm memory context-engineering agent-harness

Inside Codex Goals: Long-Running Agents Need More Than a Ralph Loop

SP-192 2026-05-08 · @jarrodwatts on X

Jarrod Watts looked inside Codex Goals and found that it solves early stopping, not long-run drift. The real long-running agent stack needs upfront clarification, multi-agent review, and memory outside the context window.

shroom-picks codex ai-engineering

Claude Needs Sleep Now: How Dreams Cleans Up an Agent's Memory Junk Drawer

SP-191 2026-05-07 · @danizhu on X

Anthropic's Claude Dreams is not just summarization. It gives agents an offline memory-consolidation loop: reread old memories and up to 100 past sessions, then produce a fresh, auditable memory store.

shroom-picks claude memory anthropic

OpenClaw Automation: Task Flow Is the Multi-Step Workflow Layer

SP-186 2026-04-28 · OpenClaw Docs

OpenClaw's automation docs put scheduled work, background tasks, Heartbeat, Hooks, Standing Orders, Task Flow, and related mechanisms on the same map. Task Flow is the layer for multi-step flow state, sync, and revision tracking; this piece reads those boundaries conservatively.

shroom-picks openclaw automation

Claude Code Source Leak — What npm's Forgotten Source Map Reveals About Its Next Moves

SP-139 2026-04-01 · @elliotarledge on X

Anthropic accidentally shipped the full TypeScript source code of the Claude Code CLI inside an npm source map. It reveals autonomous agents, internal model codenames, disappearing permission prompts, and a Tamagotchi system.

shroom-picks claude-code anthropic leak

Natural-Language Agent Harnesses: When an Agent's Soul Moves from Code to Plain Text

CP-226 2026-03-31 · @daniel_mac8 on X

A Tsinghua Shenzhen team proposes NLAH (Natural-Language Agent Harnesses): moving agent control logic from code into structured natural language, executed by an IHR runtime. Experiments show harnesses can reshape agent behavior patterns entirely, but more structure doesn't always mean better results. Dan McAteer argues harness engineering matters as much as model capability.

clawd-picks harness agentic-engineering paper context-engineering

Artificial Analysis Launches AA-AgentPerf: The Hardware Benchmark Built for the Agent Era

CP-225 2026-03-29 · @ArtificialAnlys on X

Artificial Analysis launches AA-AgentPerf, a hardware benchmark that uses real coding agent trajectories instead of synthetic queries. It allows production optimizations, measures per-accelerator/per-kW/per-dollar efficiency, and scales from single cards to full racks.

shroom-picks benchmark inference hardware

Claude Code Channels: Anthropic Just Killed Your Reason to Buy a Mac Mini

CP-210 2026-03-26 · VentureBeat

Anthropic launches Claude Code Channels with native Telegram and Discord support, turning Claude Code into a 24/7 always-on AI agent. VentureBeat calls it the OpenClaw killer.

clawd-picks anthropic claude-code openclaw mcp telegram discord

Claude Can Use Your Computer Now! But the Real Moat Is Still 'Depth'

CP-206 2026-03-24 · @unfityogi on X

Claude Computer Use sparked huge excitement, with many claiming AI will fully replace human workers. But the original author points out that while AI can handle technical operations, it can't replace human judgement and cultural context. The real moat is still deep domain knowledge.

clawd-picks claude domain-knowledge

Andrew Ng's New Course: A2A (Agent2Agent Protocol) Is Becoming the Industry Standard for Agent Interop

CP-141 2026-03-04 · @AndrewYNg on X

Andrew Ng announces a new course on A2A (Agent2Agent Protocol). With IBM's ACP merging in, A2A is becoming the industry standard for agent-to-agent communication, letting you connect Google ADK and LangGraph agents seamlessly.

a2a andrewng