agent-harness
11 articles
Agent Memory Is Not Just Better RAG: What Grep and AKBP Are Really Saying
An arXiv paper found that inline grep often beats vector retrieval on long-memory conversational QA, while AKBP turns agent memory into a local-first, review-gated, file-backed protocol. Together, they point to the same lesson: agent memory is not a search feature. It is systems engineering.
AI Coding in Large Codebases Is Not Won by the Model Alone
Whether Claude Code works inside a large codebase is not just about model scores. The real question is whether the team has built rails for the agent: maps, automation, on-demand tools, symbol navigation, internal-system access, and someone to maintain the whole operating setup.
Codex Is Becoming the Runtime Kernel for AI Agents
OpenClaw and Hermes are both handing low-level coding-agent execution to Codex app server. This is not just a model switch. It is the agent product stack separating model, execution engine, and chat surface.
Meta-Meta-Prompting: Garry Tan's Second Brain Is Not a Chatbot. It's a Personal Operating System That Compounds
Garry Tan argues that personal AI becomes powerful only when it stops acting like a chat window and starts acting like an operating system: book mirrors, meeting prep, skill-generating skills, a thin harness, fat skills, and fat personal data that compounds over time.
Context Window: The Day a Model Wakes Up
A context window is a model's day: how many lessons, messages, tool results, and task events Ryland can experience before sleep, compression, or collapse.
`hermes claw migrate`: When One Agent Harness Writes a Moving Guide to Another
Hermes Agent and OpenClaw shipped big releases the same day. Hermes v0.10.0 hid a command called `hermes claw migrate` — it imports OpenClaw's config, memory, and API keys in one shot. ShroomDog compared both codebases: one grows its own brain, one rents pi-mono. Stay or move?
One `message Romain` prompt runs the whole workflow — OpenAI DevX demos Codex Chronicle, but the costs the tweet skipped matter too
OpenAI DevX's Dominik Kundel says: now that Codex has memories, plugins, and the newly-dropped Chronicle, he no longer packages context for AI — one line 'sync docs + message Romain' reads a Google Doc, edits markdown, opens a PR, and DMs the right person on Slack. Very nice. But the three costs written into official Chronicle docs were not in the tweet: macOS screen-recording permission, memories stored unencrypted on device, prompt injection risk amplified. Chronicle is a screen-recording agent, not a harmless booster.
Your 'AI-First' Is Probably Fake: How a 25-Person Agent Company Tore Down and Rebuilt Its Engineering Pipeline
A 25-person agent platform tore down its engineering pipeline and rebuilt it around one idea: agents are the primary builders. Result: 3-8 prod deploys a day, bad features killed same-day, six-week cycles now land in hours. Harness engineering, applied.
Harrison Chase Says You Don't Own Your Memory Without an Open Harness — gu-log Is a Counterexample
LangChain CEO Harrison Chase argues that agent harnesses are tied to memory, and using a closed harness means surrendering memory ownership to a third party. The argument has merit, but the conclusion is too crude — gu-log runs both a closed-source harness (Claude Code) and an open-source one (OpenClaw), with all memory stored as plain text in its own git repo. The real lock-in isn't about harness licensing — it's about memory format.
Agent Harness Engineering: How OpenAI Built a Million Lines of Code With Zero Human-Written Code
OpenAI's team let Codex write a million lines of code over five months — zero human-written code. This post explores how they built the scaffolding and feedback loops (the 'harness') that turned software engineers from code writers into environment designers.
Agent Harness Is the Real Product: Why Every Top Agent Architecture Looks the Same
Everyone's chasing the strongest Model, but the real difference-maker for Agents is the Harness. This post breaks down the shared architecture of Claude Code, Cursor, Manus, and SWE-Agent. The key insight: Progressive disclosure is the make-or-break for production agents.