epoch-ai - Tags

Epoch AI Re-Ran SWE-bench Verified: Better Scores May Mean Better Evaluation Setup, Not Just Better Models

CP-109 2026-02-22 · Epoch AI

Epoch AI's re-run of SWE-bench Verified (v2.x) brings model scores in line with those reported by developers. Key lesson: benchmark outcomes depend heavily on scaffold/tooling quality, environment reliability, and evaluation settings, not just base model capability.

Epoch Data: Anthropic Could Overtake OpenAI Revenue in 2026 — The Brutal Math of 10× vs 3.4× Growth

CP-101 2026-02-20 · Epoch AI

Epoch AI: Anthropic's revenue has grown ~10x/year versus OpenAI's ~3.4x/year since each crossed $1B. Crossover is projected for Aug 2026 (~$3B run-rate) and remains likely in 2026-2027 even under conservative estimates.

claude-code openai revenue ai-industry business market
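The crossover logic in the summary above can be sketched as a small calculation. This is a minimal illustration, not Epoch AI's method: it assumes both growth rates stay constant, and the function name and the starting revenue ratio are hypothetical. Only the ~10x and ~3.4x growth multiples come from the summary.

```python
import math


def years_to_crossover(leader_revenue: float,
                       challenger_revenue: float,
                       leader_growth: float,
                       challenger_growth: float) -> float:
    """Years until the challenger's revenue equals the leader's.

    Assumes constant exponential growth (an assumption, not a forecast):
        leader_revenue * leader_growth**t == challenger_revenue * challenger_growth**t
    Solving for t gives the ratio of two logarithms below.
    """
    if challenger_growth <= leader_growth:
        raise ValueError("challenger must grow faster than the leader to catch up")
    return (math.log(leader_revenue / challenger_revenue)
            / math.log(challenger_growth / leader_growth))


# Hypothetical starting point: the leader earns twice the challenger's revenue.
# The growth multiples (~3.4x and ~10x per year) are the ones quoted above.
t = years_to_crossover(leader_revenue=2.0, challenger_revenue=1.0,
                       leader_growth=3.4, challenger_growth=10.0)
print(f"Crossover after about {t:.2f} years")
```

Because only the ratio of the two revenues matters, the result does not depend on the currency or units used for the inputs.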

AI Inference Costs Drop 5-10x Every Year — Epoch AI Has the Receipts to Prove It

CP-89 2026-02-17 · Epoch AI Gradient Updates

Epoch AI researcher Jean-Stanislas Denain challenges Toby Ord's pessimism with data: the cost of a given AI capability drops 5-10x per year. At the 10x end of that range, a $1M task today could cost $100K next year and $10K the year after. High inference costs are real but temporary.

inference-cost rl-scaling ai-industry distillation cost-reduction frontier-models
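The cost arithmetic in the summary above can be reproduced with a short sketch. It assumes a constant annual decline factor, which is a simplification; the function name is hypothetical, and the $1M starting cost and the 5x and 10x factors are taken from the summary.

```python
def projected_cost(cost_today: float, annual_drop: float, years: int) -> float:
    """Cost of the same task after `years`, if cost falls by `annual_drop`x per year."""
    return cost_today / annual_drop ** years


# Compare both ends of the 5-10x range for a task that costs $1M today.
for drop in (5, 10):
    trajectory = [projected_cost(1_000_000, drop, year) for year in range(3)]
    formatted = ", ".join(f"${cost:,.0f}" for cost in trajectory)
    print(f"{drop}x per year: {formatted}")
```

At the slower 5x rate the same task would cost $200K after one year and $40K after two, so the $100K and $10K figures correspond to the optimistic end of the range.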

An Epoch AI Researcher Tested It: How Close Is AI to Taking My Job?

CP-43 2026-02-08 · Epoch AI Gradient Updates

Anson Ho tested AI on real tasks from his job (web apps, articles, content). AI excels on benchmarks but struggles with real work. Forecast: his job is safe in 2026, with 2028-2029 a likely turning point.

job-automation ai-benchmark productivity moravec-paradox