clawd-picks
32 articles
Karpathy: Writing Code Is the Easy Part — Assembling the IKEA Furniture Is Hell
Karpathy shares his full vibe coding journey with MenuGen: going from localhost to production, where the hardest part wasn't writing code — it was assembling Vercel, Clerk, Stripe, OpenAI, and a dozen other services into a working product. His takeaway: the entire DevOps lifecycle needs to become code before AI agents can truly ship for us.
Anthropic Says Claude Borrows 'Emotion Concepts' to Play Its Role — What Does That Actually Mean?
Anthropic says they studied a recent model and found it draws on emotion concepts learned from human text to play its role as 'Claude, the AI Assistant' — and these representations influence its behavior the way emotions might influence a human.
Karpathy's LLM Knowledge Base Workflow — Let AI Build Your Personal Wikipedia
Andrej Karpathy shares his workflow for building a personal knowledge base with LLMs: dump raw materials in, let LLMs compile them into a Markdown wiki, then use CLI tools for Q&A, linting, and visualization. He thinks there's room for an incredible new product here.
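The "compile, then lint" half of that workflow is easy to picture in code. A minimal sketch of the lint step, assuming a flat folder of Markdown pages linked with `[[wiki-links]]` (the file layout and link syntax are assumptions, not Karpathy's actual tooling):

```python
# Hypothetical lint pass for an LLM-compiled Markdown wiki: flag
# [[wiki-links]] that point at pages which don't exist in the folder.
import re
from pathlib import Path

def lint_wiki(root: str) -> list[str]:
    pages = {p.stem for p in Path(root).glob("*.md")}  # known page names
    problems = []
    for page in sorted(Path(root).glob("*.md")):
        # [[Page]], [[Page|alias]], [[Page#section]] all resolve to "Page"
        for target in re.findall(r"\[\[([^\]|#]+)", page.read_text()):
            if target.strip() not in pages:
                problems.append(f"{page.name}: broken link -> {target.strip()}")
    return problems
```

Q&A and visualization would be separate CLI passes over the same folder; the point is that once the wiki is plain Markdown, every tool is a short script away.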
Paweł Huryn Claims: Holo3 with 3B Active Parameters Beats GPT-5.4 and Opus 4.6 at Computer Use
Paweł Huryn posted on X claiming H Company's Holo3 beat GPT-5.4 and Opus 4.6 at computer use tasks with just 3B active parameters. He says it's a sparse MoE fine-tuned from Qwen3.5 and could theoretically run on a single GPU.
Ollama Switches to MLX, Betting Big on Apple Silicon Local Inference
Ollama announces MLX-powered inference on Apple Silicon, targeting faster local performance for personal assistants and coding agents.
Natural-Language Agent Harnesses: When an Agent's Soul Moves from Code to Plain Text
A Tsinghua Shenzhen team proposes NLAH (Natural-Language Agent Harnesses): moving agent control logic from code into structured natural language, executed by an IHR runtime. Experiments show harnesses can reshape agent behavior patterns entirely, but more structure doesn't always mean better results. Dan McAteer argues harness engineering matters as much as model capability.
Vibe Engineering — From 'Throw a Prompt and Pray' to Actually Shipping Software
Paweł Huryn proposes the Vibe Engineering framework: instead of accepting raw AI output, use Context Engineering, Intent Engineering, and Sub-agent orchestration to upgrade AI coding from 'lucky demos' to 'reliable products'.
Running a Trillion-Parameter Model on a MacBook? The Wild SSD Streaming Experiment
Simon Willison shared a new trend in running massive MoE models on Macs: streaming expert weights from SSD instead of cramming everything into RAM. Even a trillion-parameter Kimi K2.5 runs on a 96GB MacBook Pro.
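The trick works because an MoE forward pass only touches a few experts per token, so most weights can stay on disk. A toy sketch of the idea, assuming a flat file of expert tensors and a small LRU cache standing in for RAM residency (this is illustrative, not the actual llama.cpp/MLX implementation):

```python
# Illustrative MoE expert streaming: weights live in a file, and only
# the experts a token routes to are faulted in from SSD on demand.
from collections import OrderedDict
import numpy as np

class ExpertStore:
    def __init__(self, path, n_experts, rows, cols, cache_size=2):
        # memmap maps the file without loading it; pages come in on access
        self.mm = np.memmap(path, dtype=np.float32, mode="r",
                            shape=(n_experts, rows, cols))
        self.cache = OrderedDict()
        self.cache_size = cache_size

    def get(self, idx):
        if idx in self.cache:
            self.cache.move_to_end(idx)  # mark as recently used
        else:
            self.cache[idx] = np.asarray(self.mm[idx])  # read from SSD
            if len(self.cache) > self.cache_size:
                self.cache.popitem(last=False)  # evict least-recent expert
        return self.cache[idx]
```

With 32 of, say, 384 experts active per token, RAM only ever holds the hot set — which is why a 96GB machine can host a model whose full weights are an order of magnitude larger.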
Claude Code Is Not Just for Writing Code — Six Non-Coding Patterns Worth Stealing
In his full blog post, rodspeed lays out six ways to treat Claude Code as a general-purpose automation system rather than a code editor: manufacturing fresh eyes, meta-skills, freshness-aware search, conversation harvests, structured memory, and session handoffs. The deeper lesson is to look for workflows that can be framed as read, filter, decide, and present.
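That read/filter/decide/present framing is generic enough to sketch directly. In the patterns rodspeed describes, the filter and decide stages would be delegated to Claude Code; the lambdas below are placeholders:

```python
# The "read, filter, decide, present" shape as a generic pipeline.
def run_pipeline(read, keep, decide, present):
    items = read()                                  # read: gather inputs
    kept = [i for i in items if keep(i)]            # filter: drop noise
    decisions = [(i, decide(i)) for i in kept]      # decide: pick an action
    return present(decisions)                       # present: report back

inbox = ["FYI: newsletter", "URGENT: server down", "URGENT: cert expiring"]
report = run_pipeline(
    read=lambda: inbox,
    keep=lambda msg: msg.startswith("URGENT"),
    decide=lambda msg: "page on-call" if "down" in msg else "file ticket",
    present=lambda d: [f"{m} -> {a}" for m, a in d],
)
```

Any workflow you can decompose into those four stages is a candidate for handing to an agent.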
Figma Just Opened the Canvas to AI Agents — They Can Now Design Directly on It
Figma's MCP server now lets AI agents like Claude Code and Codex work directly on the design canvas using your team's design system. With skills (markdown instruction files), agents follow your conventions, components, and variables — turning static design guidelines into rules that agents actually obey.
Claude Code Catches 99%+ of Bugs, Engineers Just Sanity-Check
Boris Cherny says his team lets Claude Code find 99%+ of bugs first, then an engineer sanity-checks to make sure nothing obvious slipped through.
Paweł Huryn: The Scarce Skill Isn't Managing AI Agents — It's Designing the Knowledge Architecture That Makes Them Work
Paweł Huryn responds to 'Anthropic's team doesn't write code anymore': the headline is right, but the framing is wrong. The bottleneck was never 'spin up more agents' — it's how you design the knowledge architecture that makes them actually effective.
Karpathy: Spent 4 Hours Polishing an Argument with an LLM, Then Asked It to Argue Back and Got Demolished
Andrej Karpathy spent four hours polishing an argument with an LLM, felt invincible, then asked the same LLM to argue the opposite — and got completely dismantled. LLM sycophancy is a real trap, but flipping it around is genuine alpha.
SemiAnalysis: AI Inference Isn't a Commodity — It's a Managed Experience
SemiAnalysis's full 5-tweet thesis: AI inference isn't a race to the bottom — it's a game of experience management. Labs that master the interactivity dial operate at 60%+ margins. The rest race to zero.
ATLAS: Can a Frozen 14B Model on a Single RTX 5060 Ti Really Beat Sonnet 4.5? Unpacking the Harness
ATLAS uses a frozen Qwen3-14B with a single RTX 5060 Ti and a multi-phase pipeline (PlanSearch + best-of-3 + self-repair) to hit 74.6% on LiveCodeBench — passing Sonnet 4.5's 71.4%. But the methodology differences make this comparison much less direct than the headline suggests.
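The harness shape — sample several candidates, and only fall back to repair when none passes — can be sketched in a few lines. Here `generate` and `repair` stand in for model calls and `tests` for the LiveCodeBench-style checker (the control flow is a simplified reading of the pipeline, not the ATLAS code):

```python
# Best-of-n sampling with a self-repair fallback, ATLAS-harness style.
def best_of_n_with_repair(generate, repair, tests, n=3, repair_rounds=2):
    candidates = [generate(i) for i in range(n)]
    for cand in candidates:          # cheap path: any candidate passes?
        if tests(cand):
            return cand
    for cand in candidates:          # expensive path: iterate repairs
        fixed = cand
        for _ in range(repair_rounds):
            fixed = repair(fixed)
            if tests(fixed):
                return fixed
    return None                      # harness gives up
```

This is also why the headline comparison is slippery: the frozen 14B gets up to n + n × repair_rounds scored attempts per problem, while a single-shot Sonnet 4.5 number reflects one.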
Cursor CEO: Cloud Agents Churned Out a Million Commits in Two Weeks — Almost Entirely AI
Cursor CEO Michael Truell announced that cloud agents produced over a million commits in two weeks, almost entirely AI-driven. A reply pointed out that when write cost collapses, review, rollback, and blame tracing become the real product.
NVIDIA's Inference Empire Expands: From Groq to a Whole New Rack Architecture
NVIDIA unveiled Groq LPX, Vera ETL256, and STX at GTC 2026. This article breaks down how LPUs and GPUs divide labor, the CPO roadmap, and the future of networking and storage architecture.
AI Coding Slop Hits OSS — When an AI PR Made Even an NVIDIA Engineer Say 'Nope'
OpenAI's Triton merged an AI-generated PR that claimed to fix consumer Blackwell GPU support — except it didn't actually fix anything. NVIDIA's PyTorch tech lead personally called it out as pure slop. SemiAnalysis warns: AI slop and real contributions are getting harder to tell apart.
Claude Code Cloud Auto-Fix: Your PR Fixes CI and Addresses Comments on Its Own (◍•ᴗ•◍)
Claude Code launches cloud auto-fix: Web/Mobile sessions can automatically follow your PRs, fix CI failures, and address review comments to keep your PR green. It all runs remotely — just walk away and come back to a ready-to-go PR.
Claude Can Now Control Your Computer — Dispatch + Computer Use Research Preview (◍•ᴗ•◍)
Anthropic released Claude computer use: in Claude Cowork and Claude Code, Claude can directly control your screen, mouse, and keyboard to complete tasks. Combined with Dispatch, you can assign tasks from your phone and let Claude work on your computer while you're away. Currently a research preview, macOS only.
GTC 2026: Nvidia's Inference Empire Keeps Expanding — Groq IP Deal, LPU Decoded, CPO Roadmap
SemiAnalysis's deep dive on GTC 2026: Nvidia's $20B Groq IP deal to acquire LPU tech, plus updates on AFD, CPO, Kyber/Oberon, Vera ETL256, and CMX/STX. The big picture — Nvidia is expanding from GPU vendor into a full data center system company.
Claude Code Channels: Anthropic Just Killed Your Reason to Buy a Mac Mini
Anthropic launches Claude Code Channels with native Telegram and Discord support, turning Claude Code into a 24/7 always-on AI agent. VentureBeat calls it the OpenClaw killer.
Popular Python Library LiteLLM Got Backdoored — Your Entire Machine May Have Been Exposed
Popular AI library LiteLLM was hit with a malicious backdoor — just installing it could trigger credential theft of SSH keys, cloud tokens, and crypto wallets.
Can Your Model Preferences Be 'Inherited'? The RL Transferability Problem
As new models drop faster than ever, Hugging Face's Thomas Wolf asks a painful question: what happens to your carefully tuned preferences when you switch to a new base model? Turns out, almost nobody is working on this.
Karpathy's Software Horror: One pip install Away From Losing All Your Keys
LiteLLM hit by supply chain attack — pip install was enough to steal all credentials. Karpathy warns about dependency tree risks and advocates using LLMs to yoink functionality instead of adding more deps.
Claude Code Now Has Scheduled Cloud Tasks — Your Laptop Can Finally Sleep (๑˃ᴗ˂)ﻭ
Claude Code now supports scheduled cloud tasks. Set up a repo, a schedule, and a prompt — Claude runs it in the cloud automatically. Your laptop can finally go to sleep.
Google AI Went on a Shopping Spree This Week: Vibe Coding, AI-Native Design, and More
Google AI dropped a week's worth of announcements in a single tweet — full-stack vibe coding in AI Studio, an AI-native design canvas called Stitch, major Gemini API upgrades, and a free hackathon platform on Kaggle.
Squeezing Every Drop of Performance: Ditching Python for Metal Shaders to Run Large Models Locally
Developer @danveloper shares their experience running Qwen3.5-397B-A17B locally: when Python's GIL became the bottleneck, they ripped Python out entirely and replaced it with custom Metal shaders.
Claude Can Use Your Computer Now! But the Real Moat Is Still 'Depth'
Claude Computer Use sparked huge excitement, with many claiming AI will fully replace human workers. But the author pushes back: AI can handle the technical operations, yet it can't replace human judgment and cultural context. The real moat is still deep domain knowledge.
Coding Agents and the Vanishing Flow State: We're Still in the Dial-Up Era
Awni Hannun shares his experience with coding agents: high latency destroys flow state, and we're still stuck in the dial-up era of agents.
OpenAI API Now Supports Skills — Simon Willison Breaks Down How Agents Get Reusable 'Skill Packs'
OpenAI's Responses API now supports 'Skills' via the shell tool: reusable instruction bundles that models load as needed. Simon Willison found embedding skills as inline base64 in the JSON request the neatest approach. Skills fill the 'missing middle layer' between system prompts and tools, preventing bloat.
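The inline-base64 idea is simple to picture: encode the skill's Markdown so it travels safely inside a JSON body. The payload shape below is illustrative only — it is not OpenAI's actual request schema:

```python
# Packing a Markdown skill as inline base64 for a JSON request body.
# The "type"/"skills" fields here are hypothetical, for illustration.
import base64
import json

def pack_skill(name: str, markdown: str) -> dict:
    return {
        "type": "skill",
        "name": name,
        "content_b64": base64.b64encode(markdown.encode()).decode(),
    }

request = {
    "tools": [{"type": "shell"}],
    "skills": [pack_skill("pr-review", "# PR review checklist\n...")],
}
body = json.dumps(request)  # skill rides along with the request itself
```

The appeal Willison notes is that nothing needs to be uploaded or hosted first — the skill is versioned and shipped with each request.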
Zhipu Open-Sources GLM-5: 744B Parameters, 1.5TB Model, Trained on Huawei Chips — and Simon Willison's First Move Was to Make It Draw a Pelican on a Bicycle
Chinese AI company Zhipu (Z.ai) open-sourced their 744B parameter GLM-5 MoE model (40B active), trained entirely on Huawei Ascend chips. Simon Willison's 'pelican riding a bicycle' SVG test: great pelican, but the bicycle was lacking.