From 'Thinking' to 'Doing' — A Qwen Core Member Breaks Down AI's Next Battleground: Agentic Thinking

Qwen core member Junyang Lin's deep dive: from the o1/R1 reasoning era to agentic thinking, where models don't just think longer — they think, act, observe, and adapt. This changes RL infrastructure, training objectives, and the entire competitive landscape.
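The think-act-observe-adapt loop described above can be sketched in a few lines. This is a minimal illustration, not Qwen's actual implementation; `call_model`, `tools`, and the step dictionary format are all hypothetical placeholders.

```python
# Minimal sketch of an agentic loop: the model alternates between reasoning
# ("think"), invoking a tool ("act"), and feeding the result back into its
# context ("observe"/"adapt"). All names here are illustrative, not a real API.

def run_agent(task, call_model, tools, max_steps=8):
    """Iterate think -> act -> observe until the model emits a final answer."""
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        step = call_model(history)            # "think": model proposes next move
        if step["type"] == "final":
            return step["content"]            # done: return the answer
        tool = tools[step["tool"]]            # "act": invoke the chosen tool
        observation = tool(**step["args"])    # "observe": capture the result
        history.append({"role": "assistant", "content": str(step)})
        history.append({"role": "tool", "content": str(observation)})  # "adapt"
    return None  # step budget exhausted
```

Note how this changes RL training: the reward now depends on a whole trajectory of tool calls and observations, not a single generated answer, which is why the infrastructure has to change with it.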

AI Doesn't Need to Memorize the Times Table Anymore: How Reasoning and Tool Calling Let Small Models Punch Above Their Weight

Apple MLX creator Awni Hannun offers a counterintuitive insight: intelligence-per-watt is skyrocketing partly because models no longer need to memorize answers they can compute. Reasoning and tool calling free up weight space, meaning 5B-15B models might eventually match today's GPT-5.x — though nobody really knows the ceiling yet.
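The "compute, don't memorize" idea reduces to this: instead of recalling 17 × 24 from its weights, a small model only has to emit a structured tool call, and exact arithmetic comes from computation. A toy sketch, with a hypothetical dispatch format:

```python
# Toy illustration of offloading computation via tool calling: the model
# produces a small structured request, and a calculator tool does the work.
# The {"op": ..., "args": ...} format is an assumption for this sketch.

import operator

CALCULATOR = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}

def execute_tool_call(call):
    """Run a structured tool call like {'op': 'mul', 'args': [17, 24]}."""
    return CALCULATOR[call["op"]](*call["args"])

# The model need only generate this request...
model_output = {"op": "mul", "args": [17, 24]}
# ...and the exact answer is computed, not retrieved from memorized weights.
result = execute_tool_call(model_output)  # 408
```

The weight space a large model spends storing such facts is exactly what a 5B-15B model can reallocate to reasoning.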

Typewriter vs Editor: Mercury 2 Reinvents LLMs with Diffusion — 5x Faster Reasoning, 4x Cheaper

Inception Labs launches Mercury 2 — the world's first reasoning Diffusion LLM. Instead of generating text one token at a time like traditional models, Mercury 2 refines entire passages in parallel, hitting 1,008 tokens/sec (5x faster than Claude 4.5 Haiku) at 1/4 the price. Backed by Andrew Ng, Karpathy, and Eric Schmidt.
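The typewriter-vs-editor contrast can be made concrete with a toy decoder. This is purely illustrative: the "model" below just reveals a fixed target string, whereas a real diffusion LLM learns the denoising step; the point is the step count, one position per step versus all positions refined each round.

```python
# Toy contrast: autoregressive decoding commits one token per step (typewriter),
# while diffusion-style decoding refines every position each round (editor).

TARGET = list("hello world")
MASK = "_"

def autoregressive_decode(steps_log):
    """Emit one token per step, left to right: len(TARGET) steps total."""
    out = []
    for token in TARGET:
        out.append(token)
        steps_log.append("".join(out))
    return "".join(out)

def diffusion_decode(steps_log, rounds=3):
    """Start fully masked; each round updates positions across the whole
    sequence in parallel, finishing in `rounds` steps regardless of length."""
    seq = [MASK] * len(TARGET)
    for r in range(rounds):
        for i in range(len(TARGET)):
            if i % rounds == r:      # stand-in for "refine confident positions"
                seq[i] = TARGET[i]
        steps_log.append("".join(seq))
    return "".join(seq)
```

Both produce the same text, but the diffusion-style decoder takes 3 steps where the autoregressive one takes 11 — the source of the throughput and cost gap Mercury 2 is claiming.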

Google Launches Gemini 3.1 Pro: 77.1% on ARC-AGI-2 and a Bigger Push Into Real Reasoning Workflows

Google announced Gemini 3.1 Pro (preview), highlighting stronger core reasoning and a verified 77.1% score on ARC-AGI-2. The model is rolling out across Gemini API, Vertex AI, Gemini app, and NotebookLM. For engineering teams, the key question is not only benchmark performance, but whether the model can reliably handle complex multi-step workflows in production.