codex
32 articles
Do Not Let Codex Teach You: Turn AI Into a Learning Coach in 5 Steps
When learning a new tool with Codex, the worst move is asking it to give you a lecture. A better pattern is to ask it for an entry point, a rough map, a tiny exercise, a teach-back check, and breadcrumbs for next time.
Let Agents Dream: Weekly Maintenance That Turns Repeated Work Into Skills
Vaibhav Srivastav's Codex prompt is interesting because it describes an agent maintenance loop: look back at recent work, find repeated workflows, and package only high-confidence patterns into Skills, automations, or subagents. It is agent dreaming: turning busy work into capability.
Codex Is No Longer Just for Code — It Is Becoming an Operating System for Computer Work
Codex is expanding from a coding assistant into a durable system for computer work: persistent threads, voice, steering, queuing, browser and desktop tools, automations, side-panel review, and shared memory all pull work from instruction toward execution and artifact review.
OpenAI's Codex Goals Guide: Agents Should Not Finish by Vibes
OpenAI's Cookbook frames Codex Goals as a thread-scoped completion contract: the objective persists, but completion must be checked against evidence. This post fills in the official spec angle around SP-192, SP-197, and SP-207.
An AI Agent Needs More Than a Goal
OpenAI and Anthropic both pushed /goal-like ideas into coding agents. A goal helps, but production agents also need strategy, constraints, health metrics, autonomy boundaries, and stop rules.
Codex Is Becoming the Runtime Kernel for AI Agents
OpenClaw and Hermes are both handing low-level coding-agent execution to the Codex app server. This is not just a model switch. It is the agent product stack separating into model, execution engine, and chat surface.
Codex CLI Memory Is Not Magic. It Is a Stack of Greppable Markdown
Mem0 breaks down Codex CLI memory: not a vector database, but local Markdown, background summaries, credential scrubbing, and grep search. This post looks at when local notes are enough, and when a semantic memory layer makes sense.
Codex Goal Mode Isn't Magic: Loops Need a Finish Line, Tests, and Memory
Codex `/goal` is not a wish machine. Chris Hayduk's real point is engineering discipline: give the agent a measurable finish line, a fast feedback loop, and Markdown files that work as long-term memory.
Inside Codex Goals: Long-Running Agents Need More Than a Ralph Loop
Jarrod Watts looked inside Codex Goals and found that it solves early stopping, not long-run drift. The real long-running agent stack needs upfront clarification, multi-agent review, and memory outside the context window.
OpenAI Open-Sources Symphony: When Codex Workflow's Bottleneck Shifts From 'Writing Code' To 'Context Switching'
OpenAI open-sources Symphony — a spec that turns Linear's issue board into the control plane for Codex agents. Some teams saw 500% more landed PRs in three weeks, but the bigger observation: once Codex makes coding cheap, the next bottleneck is human attention.
OpenAI Open-Sources Euphony: A Mirror for Codex, Plus a Masterclass in 2-Line AGENTS.md
OpenAI quietly open-sourced Euphony — a browser-based viewer for Harmony chats and Codex session logs (Apache 2.0). Four telling details buried in the source: a 2-line AGENTS.md, gpt-tokenizer as a runtime dep, translation needing the user's own API key, and a self-written SSRF warning.
One `message Romain` prompt runs the whole workflow — OpenAI DevX demos Codex Chronicle, but the costs the tweet skipped matter too
OpenAI DevX's Dominik Kundel says that now that Codex has memories, plugins, and the newly dropped Chronicle, he no longer packages context for AI: one line, 'sync docs + message Romain', reads a Google Doc, edits markdown, opens a PR, and DMs the right person on Slack. Very nice. But the three costs written into the official Chronicle docs were not in the tweet: macOS screen-recording permission, memories stored unencrypted on device, and amplified prompt-injection risk. Chronicle is a screen-recording agent, not a harmless booster.
Nick Baumann: The Best Tools for Codex Are Bespoke CLIs
Nick Baumann isn't chasing MCP or the next protocol. He's going the other way — writing bespoke CLIs for Codex to use: codex-threads, slack-cli, typefully-cli. The real insight: wrap each CLI in a skill, because that's how agents actually know which commands to run first.
Why Programmers Love Codex While Vibe Coders Can't Quit Claude: Dense vs MoE Is Really a Story About Two Coding Philosophies
Berryxia uses Dense vs MoE to explain something many developers already feel: Codex often shines in bug fixing, refactors, and long-running engineering tasks, while Claude keeps winning over vibe coders. That framing captures part of the truth, but the real split is bigger than architecture — it includes training philosophy, product design, and whether you treat coding as precise delegation or interactive creation.
Stop Managing Agents, Start Managing Work: Symphony's Open-Source Workflow
@daniel_mac8 shares an open-source Elixir implementation: create a Linear issue and move it to 'in progress,' and Symphony picks it up in a dedicated Codex workspace. Codex even writes status updates back. The author argues this is software development moving up an abstraction layer.
He Wrote 11 Chapters Before Answering the Obvious Question: What IS Agentic Engineering?
Simon Willison's Agentic Engineering Patterns guide now has 12 chapters — but this new one goes at the very beginning. He finally answers 'What is Agentic Engineering?' The answer is surprisingly simple: using coding agents to help build software. The interesting part is why it took 11 chapters of hands-on patterns before he felt ready to define it.
Treat Codex Like a Teammate, Not a Tool: 10 Best Practices That Actually Work
A guide to Codex best practices from prompting and planning to MCP, Skills, and Automations — building a more reliable agent workflow.
Command an AI Army from Your Chat App — OpenClaw ACP Lets You Run Codex, Claude Code, and Gemini from Discord / Telegram
OpenClaw's ACP lets you spawn Codex, Claude Code, and Gemini from Discord/Telegram chat. Now with Telegram topic binding, persistent bindings that survive restarts, ACP Provenance for audit trails, and more. (Updated 2026-03-09)
Reverse-Engineering Codex: Cracking Open the Context Compaction API with Prompt Injection
Developer Kangwook Lee used just 2 API calls and 35 lines of Python to crack open Codex's hidden context compaction API via prompt injection — revealing the secret system prompts behind the encryption.
Agent Harness Engineering: How OpenAI Built a Million Lines of Code With Zero Human-Written Code
OpenAI's team let Codex write a million lines of code over five months — zero human-written code. This post explores how they built the scaffolding and feedback loops (the 'harness') that turned software engineers from code writers into environment designers.
Karpathy Built an 8-Agent AI Research Team — They Can't Actually Do Research
Karpathy spent a weekend running 4 Claude + 4 Codex agents as an ML research team on GPUs. The result: agents are S-tier at implementation but F-tier at experiment design. His key insight — 'You are now programming an organization' — might define agentic engineering in 2026.
One Person = One Dev Team: The Complete Setup for Commanding a Codex/Claude Code Army with OpenClaw
Indie hacker Elvis Sun shared his complete workflow using an OpenClaw agent (Zoe) as an orchestrator to automatically spawn Codex and Claude Code agents. 50 commits per day on average, 7 PRs in 30 minutes, three layers of AI code review, and Zoe proactively scans Sentry to fix bugs. Cost: $190/month.
Code Got Cheap — Now What? Simon Willison's Agentic Engineering Survival Guide
Simon Willison launched a new series called Agentic Engineering Patterns — a playbook for working with coding agents like Claude Code and Codex. Lesson one: writing code got cheap, but writing good code is still expensive. Lesson two: 'red/green TDD' is the most powerful six-word spell for agent collaboration.
OpenClaw Creator Runs 50 Codex Agents for PR Triage: Handling 3,000+ Changes Without a Vector DB
Peter Steinberger shared a high-scale PR triage workflow: run 50 Codex agents in parallel, generate structured JSON signals for each PR, then consolidate them in one session for dedupe/close/merge decisions. His key point: at this scale, you may not need a vector database first—clean structured reports plus large-context reasoning can be enough to ship faster.
33,000 Agent PRs Tell a Brutal Story: Codex Dominates, Copilot Struggles, and Your Monorepo Might Not Survive
Researchers at Drexel and Missouri S&T analyzed 33,596 agent-authored GitHub PRs from 5 coding agents. Overall merge rate: 71%. Codex: 83%, Claude Code: 59%, Copilot: 43%. Rejection cause: no review. LeadDev warns that the PR flood is crushing monorepos and CI.
GitHub Agent HQ: Claude, Codex, and Copilot Now Fight Side by Side in the Same PR — The Multi-Agent Era Is Here
GitHub's Agent HQ now offers multi-agent support (Claude, Codex, Copilot) for Copilot Pro+ & Enterprise users. Run multiple AIs simultaneously in GitHub/VS Code to tackle problems from different angles. Outputs become Draft PRs. A paradigm shift for code review.
OpenAI's Agent Trinity: Skills + Shell + Compaction — A Field Guide
OpenAI released three primitives for long-running agents: Skills (reusable SKILL.md instruction packs), Shell (hosted container runtime), and Compaction (automatic context compression). Includes 10 battle-tested tips and Glean's production data.
OpenAI × Cerebras: Codex-Spark Codes 15x Faster — But What's the Catch?
OpenAI released GPT-5.3-Codex-Spark, its first model on Cerebras chips. It is very fast (>1000 tokens/sec, 80% lower latency), but it is a smaller model, does not run tests automatically, and is Pro-only. This marks OpenAI's first production deployment on non-Nvidia hardware, redrawing the AI compute landscape.
Running Codex Inside Claude Code (The Elegant Way)
Hook up Codex as an MCP server inside Claude Code with a single command. Why fight Codex CLI's rough edges when you can plug its brain into a better body?
OpenAI Researcher Spends $10K/Month on Codex — Generates 700+ Hypotheses
Karel (OpenAI researcher) shares how he burns billions of Codex tokens: agents writing their own notes, crawling Slack, analyzing data, and generating 700+ hypotheses. He now talks to one agent that orchestrates everything else.
Inside OpenAI: How They're Going Agent-First (Straight From the Co-Founder)
OpenAI co-founder Greg Brockman publicly reveals how OpenAI is transforming to agentic software development internally. By March 31st, agents should become the first resort for all technical tasks. Includes six concrete recommendations, including 'Say no to slop' on code quality.
Claude Code vs Codex: Pick the Right Tool for the Job
Claude Code is a Templar — steady and reliable. Codex is a Glass Cannon Mage — explosive output but easy to blow up. Pick your quest, then pick your character.