agent
15 articles
Let Agents Dream: Weekly Maintenance That Turns Repeated Work Into Skills
Vaibhav Srivastav's Codex prompt is interesting because it describes an agent maintenance loop: look back at recent work, find repeated workflows, and package only high-confidence patterns into Skills, automations, or subagents. It is agent dreaming: turning busy work into capability.
OpenAI's Codex Goals Guide: Agents Should Not Finish by Vibes
OpenAI's Cookbook frames Codex Goals as a thread-scoped completion contract: the objective persists, but completion must be checked against evidence. This post fills in the official spec angle around SP-192, SP-197, and SP-207.
Codex Goal Mode Isn't Magic: Loops Need a Finish Line, Tests, and Memory
Codex `/goal` is not a wish machine. Chris Hayduk's real point is engineering discipline: give the agent a measurable finish line, a fast feedback loop, and Markdown files that work as long-term memory.
Don't Rebuild the AI Agent Wheel: Learn to Teamfight With Your AI Teammate and Stop It From Feeding
LLMs are not gods, and they are not just tools. They are more like DOTA teammates: great at last-hitting, occasionally great at feeding. The human job is not to fight AI for the same lane, but to cover taste, map awareness, context ownership, and strategic judgment.
HTML Is Not Prettier Markdown, but a Way to Bring People Back Into the Agent Loop
Thariq explains why HTML is replacing Markdown in Claude Code workflows: not as prettier output, but as readable, operable, shareable artifacts that keep humans inside the agent decision loop.
Context Window: The Day a Model Wakes Up
A context window is a model's day: how many lessons, messages, tool results, and task events Ryland can experience before sleep, compression, or collapse.
Inside Codex Goals: Long-Running Agents Need More Than a Ralph Loop
Jarrod Watts looked inside Codex Goals and found that it solves early stopping, not long-run drift. The real long-running agent stack needs upfront clarification, multi-agent review, and memory outside the context window.
Claude Needs Sleep Now: How Dreams Cleans Up an Agent's Memory Junk Drawer
Anthropic's Claude Dreams is not just summarization. It gives agents an offline memory-consolidation loop: reread old memories and up to 100 past sessions, then produce a fresh, auditable memory store.
OpenClaw Automation: Task Flow Is the Multi-Step Workflow Layer
OpenClaw's automation docs put scheduled work, background tasks, Heartbeat, Hooks, Standing Orders, Task Flow, and related mechanisms on the same map. Task Flow is the layer for multi-step flow state, sync, and revision tracking; this piece reads those boundaries conservatively.
Claude Code Source Leak — What npm's Forgotten Source Map Reveals About Its Next Moves
Anthropic accidentally shipped the full TypeScript source code of the Claude Code CLI inside an npm source map. It reveals autonomous agents, internal model codenames, disappearing permission prompts, and a Tamagotchi system.
Natural-Language Agent Harnesses: When an Agent's Soul Moves from Code to Plain Text
A Tsinghua Shenzhen team proposes NLAH (Natural-Language Agent Harnesses): moving agent control logic from code into structured natural language, executed by an IHR runtime. Experiments show harnesses can reshape agent behavior patterns entirely, but more structure doesn't always mean better results. Dan McAteer argues harness engineering matters as much as model capability.
Artificial Analysis Launches AA-AgentPerf: The Hardware Benchmark Built for the Agent Era
Artificial Analysis launches AA-AgentPerf, a hardware benchmark that uses real coding agent trajectories instead of synthetic queries. It allows production optimizations, measures per-accelerator/per-kW/per-dollar efficiency, and scales from single cards to full racks.
Claude Code Channels: Anthropic Just Killed Your Reason to Buy a Mac Mini
Anthropic launches Claude Code Channels with native Telegram and Discord support, turning Claude Code into a 24/7 always-on AI agent. VentureBeat calls it the OpenClaw killer.
Claude Can Use Your Computer Now! But the Real Moat Is Still 'Depth'
Claude Computer Use sparked huge excitement, with many claiming AI will fully replace human workers. But the original author points out that while AI can handle technical operations, it can't replace human judgment and cultural context. The real moat is still deep domain knowledge.
Andrew Ng's New Course: A2A (Agent2Agent Protocol) Is Becoming the Industry Standard for Agent Interop
Andrew Ng announces a new course on A2A (Agent2Agent Protocol). With IBM's ACP merging in, A2A is becoming the industry standard for agent-to-agent communication, letting you connect Google ADK and LangGraph agents seamlessly.