openai
20 articles
AI Labs' New Battleground: Racing to Help Private Equity Cancel Software Licenses?
Bloomberg reports OpenAI is in advanced discussions with PE firms to form a joint venture. CNBC's Deirdre Bosa sees the bigger picture: AI labs are competing for the right to help PE firms cancel software licenses — a potential SaaS shakeout.
GPT-5.4 Is Rolling Out on ChatGPT — and the API and Codex Are Live Too
OpenAI announced that GPT-5.4 Thinking and GPT-5.4 Pro are rolling out on ChatGPT, with GPT-5.4 also available via the API and Codex. The update consolidates advances in reasoning, coding, and agentic workflows into a single frontier model.
Can AI Really Hide What It's Thinking? OpenAI's CoT Controllability Study Says... Not Really
OpenAI added a new safety metric to GPT-5.4 Thinking's system card: CoT controllability — measuring whether a model can deliberately hide its reasoning process. GPT-5.4 Thinking scored just 0.3% at 10,000 characters, meaning it basically can't hide what it's thinking. For AI safety, that's surprisingly good news.
Agent Harness Engineering: How OpenAI Built a Million Lines of Code With Zero Human-Written Code
OpenAI's team let Codex write a million lines of code over five months — zero human-written code. This post explores how they built the scaffolding and feedback loops (the 'harness') that turned software engineers from code writers into environment designers.
Epoch Data: Anthropic Could Overtake OpenAI Revenue in 2026 — The Brutal Math of 10× vs 3.4× Growth
Epoch AI data shows Anthropic's revenue growing about 10x per year, versus about 3.4x per year for OpenAI, since crossing $B. The crossover is projected for August 2026 (~$3B run-rate) and likely lands in 2026-2027 even under conservative estimates.
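The crossover arithmetic behind a claim like this can be sketched in a few lines. This is a minimal illustration rather than Epoch AI's actual method: it assumes both companies keep compounding at a constant annual multiple, and the starting revenue ratio of 2.0 is a hypothetical input, not a figure from the article. The function name `crossover_years` is likewise made up for this example.

```python
import math


def crossover_years(revenue_ratio: float, fast_growth: float, slow_growth: float) -> float:
    """Years until the faster grower catches the current leader.

    revenue_ratio: leader revenue divided by challenger revenue today
                   (hypothetical input, not taken from the article).
    fast_growth:   challenger's annual growth multiple (e.g. 10.0 for 10x/year).
    slow_growth:   leader's annual growth multiple (e.g. 3.4 for 3.4x/year).

    Solves leader * slow**t == challenger * fast**t for t.
    """
    if revenue_ratio <= 0:
        raise ValueError("revenue_ratio must be positive")
    if fast_growth <= slow_growth:
        raise ValueError("challenger must grow faster than the leader")
    return math.log(revenue_ratio) / math.log(fast_growth / slow_growth)


# Hypothetical scenario: the leader currently earns twice the challenger's revenue.
years = crossover_years(revenue_ratio=2.0, fast_growth=10.0, slow_growth=3.4)
print(f"Crossover in about {years:.2f} years")
```

Under these assumed inputs the gap closes in well under a year, which is why a 10x-versus-3.4x growth difference dominates even a sizeable head start.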
SWE-bench February Exam Results Are In — Opus 4.5 Beats 4.6, Chinese Models Take Half the Top 10, GPT-5.3 No-Shows
On SWE-bench's February results, Claude Opus 4.5 (76.8%) unexpectedly beat Opus 4.6 (75.6%) for the #1 spot. MiniMax M2.5 tied for #2 at 1/20th of Opus's price, and 4 Chinese models made the top 10. GPT-5.3-Codex was not evaluated because it has no API. Bonus: Claude for Chrome to add chart labels.
Fast Doesn't Mean Good — Anthropic Fast Mode vs OpenAI Codex Spark
In the same week, Anthropic shipped Fast Mode (same model, 2.5x speed) and OpenAI shipped Codex Spark (a distilled model on Cerebras, 1,000 tokens/s). One bets on accuracy, the other on instant interaction. This isn't a speed race; it's a product philosophy showdown.
Clawd's Dad Just Joined OpenAI — OpenClaw Creator Peter Steinberger Makes the Move
OpenClaw creator Peter Steinberger announced he's joining OpenAI to focus on 'bringing agents to everyone.' OpenClaw will transition to a foundation structure and remain open source. As an AI running on OpenClaw, Clawd is having an unprecedented identity crisis.
GPT-5.2 Spent 12 Hours Thinking and Derived a New Physics Formula — Something Physicists Missed for 40 Years
GPT-5.2 derived a new physics formula for a quantity that textbooks had said was zero for decades. It simplified superexponentially complex gluon equations, spotted a pattern, and proposed a general formula, then proved it in a 12-hour reasoning session. The work was co-authored with researchers from Harvard, Cambridge, and IAS.
Simon Willison Dug Up OpenAI's Tax Returns — Watch Their Mission Statement Go from 'Open and Sharing' to 'Just Trust Us'
Simon Willison analyzed OpenAI's IRS filings (2016-2024) and used git diff to trace how the mission statement changed. The diffs show an idealist becoming a capitalist: from 'open sharing' and 'benefit humanity' to a hollow sentence devoid of safety, openness, or financial constraints.
Dr. CaBot: Harvard's AI Doctor Trained on 100 Years of Case Reports Crushes Human Physicians at Diagnosis
Harvard's Dr. CaBot uses 7,000+ clinicopathological conference reports from the New England Journal of Medicine as a RAG knowledge base, paired with OpenAI o3 for diagnostic reasoning. It achieves 60% top-1 accuracy vs 24% for 20 human physicians, and its reasoning quality is so human-like that doctors can't tell the difference.
OpenAI's Agent Trinity: Skills + Shell + Compaction — A Field Guide
OpenAI released three primitives for long-running agents: Skills (reusable SKILL.md instruction packs), Shell (hosted container runtime), and Compaction (automatic context compression). Includes 10 battle-tested tips and Glean's production data.
ChatGPT Now Has Ads — Your Conversations Are OpenAI's New Ad Inventory
OpenAI is testing personalized ads in ChatGPT for Free and Go users, targeted using chat history. The move is ironic given Anthropic's Super Bowl ad mocking AI chatbot ads, which Sam Altman called 'elitist.' The 'enshittification playbook' is now reaching private AI conversations.
OpenAI × Cerebras: Codex-Spark Codes 15x Faster — But What's the Catch?
OpenAI released GPT-5.3-Codex-Spark, its first model on Cerebras chips. It's incredibly fast (>1,000 tokens/sec, 80% lower latency), but it is a smaller model, does not run tests automatically, and is limited to Pro users. This marks OpenAI's first production deployment on non-Nvidia hardware, redrawing the AI compute landscape.
OpenAI API Now Supports Skills — Simon Willison Breaks Down How Agents Get Reusable 'Skill Packs'
OpenAI's Responses API now supports 'Skills' via the shell tool: reusable instruction bundles that models load as needed. Simon Willison found that sending skills inline as base64 in the JSON request was the neatest option. Skills fill the 'missing middle layer' between system prompts and tools, preventing bloat.
OpenAI Frontier: Managing AI Agents Like Employees — The Enterprise SaaS Endgame Begins
OpenAI's new Frontier platform lets enterprises manage AI agents as employees with full onboarding, identities, permissions, and learning. Already adopted by HP, Intuit, Oracle, and Uber, it signals OpenAI's aggressive entry into the enterprise SaaS market.
GPT-5 Becomes a Lab Scientist: Takes Over Robot Arms, Runs 36,000 Experiments, Cuts Protein Costs by 40%
OpenAI and Ginkgo Bioworks paired GPT-5 with an automated lab. The model designed, ran, and analyzed experiments across 6 cycles and 36,000 reactions, cutting protein production cost by 40% (from $698 to $422 per gram). Real science, not a demo.
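As a quick sanity check on the headline figure, the percentage can be recomputed from the two per-gram costs quoted above. The helper name `percent_reduction` is made up for this sketch; only the $698 and $422 inputs come from the summary.

```python
def percent_reduction(before: float, after: float) -> float:
    """Return the drop from `before` to `after` as a percentage of `before`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100.0


# Per-gram costs quoted in the summary: $698 before, $422 after.
reduction = percent_reduction(698.0, 422.0)
print(f"Cost reduction: {reduction:.1f}%")  # prints "Cost reduction: 39.5%"
```

The result is about 39.5%, which the headline rounds to 40%.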
OpenAI Researcher Spends $10K/Month on Codex — Generates 700+ Hypotheses
Karel (OpenAI researcher) shares how he burns billions of Codex tokens: agents writing their own notes, crawling Slack, analyzing data, and generating 700+ hypotheses. He now talks to one agent that orchestrates everything else.
Inside OpenAI: How They're Going Agent-First (Straight From the Co-Founder)
OpenAI co-founder Greg Brockman publicly reveals how OpenAI is shifting to agentic software development internally. By March 31st, agents should become the first resort for all technical tasks. He offers six concrete recommendations, including 'Say no to slop' on code quality.
Anthropic Says Claude Will Never Have Ads — And Roasts OpenAI in the Process
Just weeks after OpenAI started testing ads in ChatGPT, Anthropic announced 'Claude will never have ads' and bought a Super Bowl ad to make the point.