research
7 articles
Do You Actually Know How to Use AI? Anthropic Tracked 10,000 Conversations to Find Out
Anthropic analyzed 9,830 Claude.ai conversations and defined 11 observable AI fluency behaviors. Key finding: people who iterate show 2x the fluency. But when AI produces beautiful artifacts, users question its reasoning less. The prettier the output, the more dangerous it gets.
Anthropic Analyzed Millions of Claude Code Sessions — Your Agent Can Handle Way More Than You Let It
Anthropic's study of millions of Claude Code agent sessions: autonomous run lengths have doubled (45+ min), experienced users auto-approve actions in 40%+ of sessions, and Claude asks clarifying questions more often than it gets interrupted. 73% of API actions still keep a human in the loop. Key takeaway: models can handle more autonomy than users grant them (the 'deployment overhang').
33,000 Agent PRs Tell a Brutal Story: Codex Dominates, Copilot Struggles, and Your Monorepo Might Not Survive
Drexel/Missouri S&T researchers analyzed 33,596 agent-authored GitHub PRs from 5 coding agents. Overall merge rate: 71%. Codex: 83%, Claude Code: 59%, Copilot: 43%. Top rejection pattern: PRs closed without ever being reviewed. LeadDev warns the flood of agent PRs is crushing monorepos and CI.
Anthropic Research: Will AI Fail as a 'Paperclip Maximizer' or a 'Hot Mess'?
Anthropic Fellows research finds AI becomes more incoherent the longer it reasons, suggesting failures will look more like industrial accidents than classic misalignment.
Peking University: AI Agents Follow Physics Laws?!
Physics researchers at Peking University found that LLM agents obey 'detailed balance', a law from thermodynamics. This isn't a bug, it's a feature.
MIT Research: Making LLMs Recursively Call Themselves to Handle 10M+ Tokens
When you stuff too much into a context window, models get dumber — that's context rot. MIT proposes Recursive Language Models (RLMs), letting LLMs recursively call themselves in a Python REPL to handle massive inputs. GPT-5-mini + RLM beats vanilla GPT-5 on hard tasks, and it's cheaper too.
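The core RLM idea can be sketched in a few lines: rather than feeding a huge input to one context window, the model recursively calls itself on halves of the input and merges the sub-answers. This is a minimal illustration, not MIT's implementation; `llm_call` is a hypothetical stub standing in for a real model API, and the 200-character "context limit" is an assumption for demonstration.

```python
# Hypothetical stand-in for a real LLM API call (the paper uses models
# like GPT-5-mini); here it just returns the prompt's first sentence.
def llm_call(prompt: str) -> str:
    return prompt.split(".")[0] + "."

CONTEXT_LIMIT = 200  # assumed toy limit: "model" can only see 200 chars


def rlm(query: str, text: str) -> str:
    """Answer `query` over `text`, recursing when the text is too big."""
    if len(text) <= CONTEXT_LIMIT:
        # Base case: the text fits, so answer with a single call.
        return llm_call(f"{query}\n\n{text}")
    mid = len(text) // 2
    left = rlm(query, text[:mid])    # recursive sub-call on first half
    right = rlm(query, text[mid:])   # recursive sub-call on second half
    # One more (small) call combines the two partial answers.
    return llm_call(f"{query}\n\n{left} {right}")
```

In the actual RLM setup the recursion happens inside a Python REPL the model controls, letting it decide how to split and query the input; the sketch above fixes a naive halving strategy just to show the recursive structure.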
How AI Assistance Affects Coding Skill Development: Latest Anthropic Research
Anthropic's research shows engineers using AI assistance scored 17% lower on tests than those who coded manually. The key difference? Whether they asked 'why' — high scorers used AI to check understanding, low scorers just copied and pasted.