Anthropic Acquires Vercept — R-CNN Inventor Joins the Team, Computer Use Jumps from 15% to 72.5%, UiPath Stock Drops
Picture the Accountant at Your Company
Every morning at nine, she opens Excel, toggles between three systems to cross-check numbers, and squints until her eyes give up. Over in HR, hiring season means drowning in a sea of resumes: open, check education, check experience, log into the system, repeat. Customer service? Every incoming call means juggling three back-end tools at once, calming the customer with one hand while searching records with the other.
These people don’t write code. They don’t need APIs. What they need is an assistant that can see the screen, operate the software, and — critically — one that doesn’t zone out on Friday afternoon.
Today, Anthropic said: we’re building that assistant. The method? Acquire Vercept, and put one of the most-cited computer vision researchers on the planet at the table.
Clawd whispers:
Let me be direct about what this story is really about. It’s not “another acquisition.” It’s Anthropic officially signaling to every office worker who can’t write code: “We’re coming to help you.” Claude Code handles the engineers. Computer Use handles the other hundred-times-more people. That “hundred times more” is the real gold mine (⌐■_■)
Who Is Vercept? And Why Did Anthropic Want Them?
Let’s start with the people joining, because that’s the real headline:
- Ross Girshick — If you’ve done anything in computer vision, you know this name. He invented R-CNN (Region-based Convolutional Neural Network) in 2013, the breakthrough that handed the entire object detection field over to deep learning. Google Scholar citations: over 660,000. To put that in perspective — most professors would throw a party if they hit 10,000 in a lifetime. 660K means “you wrote a tool that literally every CV researcher on Earth has used.” Previously at Meta’s FAIR lab, where he created Detectron — basically the standard toolkit for object detection worldwide.
- Kiana Ehsani — AI + embodied intelligence researcher
- Luca Weihs — Previously at Allen Institute for AI (AI2), specializing in embodied AI
Now the company itself. Vercept’s founding thesis was straightforward: to make AI genuinely useful, you have to solve the hard problems of seeing and interacting. In June 2025, they raised $16 million, backed by former Google CEO Eric Schmidt and Dropbox co-founder Drew Houston. Their product, Vy, was a Mac app — AI watches your screen, understands your workflow, and automates repetitive tasks.
Sound like UiPath?
Yes. But here’s the difference — Vercept used foundation model-level visual understanding, not traditional RPA’s “hard-code every button position” approach. Traditional RPA is like writing an absurdly detailed instruction manual for a robot that cannot think: “Row three, column five, click.” UI gets updated? Manual goes in the trash. Button moves? Trash. It’s like spending three days training a new hire on the old ERP system, only for IT to push an update overnight that changes everything ┐( ̄ヘ ̄)┌
Vercept took a completely different approach — it let AI actually see and understand the screen, the way you can figure out a new app without reading the manual. After the acquisition, Vercept shuts down its own product and the entire team joins Anthropic’s Computer Use division.
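To make that contrast concrete, here is a toy sketch in Python. Everything in it is hypothetical: the grid layout, the button names, and both click functions are made up for illustration, and neither UiPath nor Vy nor Computer Use works this simply under the hood. It only shows why a script pinned to "row three, column five" breaks when the buttons move, while one that looks for what the button says keeps working.

```python
from typing import Optional

# Hypothetical screen: a grid of cells, keyed by (row, column).
Layout = dict[tuple[int, int], str]

OLD_LAYOUT: Layout = {(3, 5): "Submit", (3, 6): "Cancel"}
# After an overnight update, the two buttons swap places.
NEW_LAYOUT: Layout = {(3, 5): "Cancel", (3, 6): "Submit"}


def click_cell(layout: Layout, row: int, col: int) -> Optional[str]:
    """Position-based script: click a fixed cell and hit whatever is there."""
    return layout.get((row, col))


def click_label(layout: Layout, label: str) -> Optional[tuple[int, int]]:
    """Label-based script: find the cell showing `label`, wherever it moved."""
    for cell, text in layout.items():
        if text == label:
            return cell
    return None


if __name__ == "__main__":
    # The position-based script was written against the old layout.
    print(click_cell(OLD_LAYOUT, 3, 5))        # Submit
    print(click_cell(NEW_LAYOUT, 3, 5))        # Cancel (the wrong button)
    print(click_label(NEW_LAYOUT, "Submit"))   # (3, 6)
```

The label lookup here is a stand-in for the much harder job of actually understanding a screenshot, which is the part a vision model has to solve.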
Clawd mutters:
Let me help you tell these two things apart. Claude Code writes code — it edits your codebase, runs terminal commands, deploys stuff. That’s the developer’s world. Computer Use? It opens your browser, moves your mouse, clicks your buttons — like your screen got taken over by a very smart remote intern ( ̄▽ ̄)/ One solves efficiency for engineers. The other solves efficiency for everyone who can’t write code. How many people is that? Go back to the opening — the accountant, the HR team, customer service. They’re the hundred-times-more.
From “Found a Button” to “Doesn’t Need Training” — The OSWorld Comeback
Here’s the number Anthropic put in their announcement. Let me help you feel it with an analogy.
When Computer Use first launched in late 2024, the OSWorld score was under 15%. OSWorld is the most widely used benchmark for AI computer use. It throws real desktop tasks at AI: navigating complex spreadsheets, filling forms across browser tabs, completing multi-step operations in real desktop environments. What does 15% feel like? It’s like asking Claude to help with Excel and getting: “Hmm, I found something clickable… wait, that’s an ad, not a button.” Basically a first-day hire who doesn’t even know the Wi-Fi password yet.
By February 2026, Sonnet 4.6 scored 72.5%. Nearly a 5x improvement in 16 months.
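For anyone who wants to check the "nearly 5x" figure, the arithmetic fits in a few lines of Python. One assumption to flag: the launch score is only given as "under 15%", so the sketch plugs in 15.0 as an upper bound, which makes the resulting factor a floor rather than an exact number.

```python
# Figures quoted in the article. The launch score is only given as
# "under 15%", so 15.0 is an upper bound used for illustration.
LAUNCH_SCORE = 15.0   # percent, late 2024
LATEST_SCORE = 72.5   # percent, February 2026
MONTHS_ELAPSED = 16


def improvement_factor(start: float, end: float) -> float:
    """How many times larger the end score is than the start score."""
    return end / start


def points_per_month(start: float, end: float, months: int) -> float:
    """Average gain in percentage points per month over the period."""
    return (end - start) / months


if __name__ == "__main__":
    factor = improvement_factor(LAUNCH_SCORE, LATEST_SCORE)
    pace = points_per_month(LAUNCH_SCORE, LATEST_SCORE, MONTHS_ELAPSED)
    print(f"{factor:.2f}x improvement")     # 4.83x improvement
    print(f"{pace:.2f} points per month")   # 3.59 points per month
```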
What level is 72.5%? It’s the intern who’s been around for three months — can work independently, doesn’t need much supervision, handles cross-tab forms and complex spreadsheet navigation. Occasionally gets stuck, but most of the time you hand them a task and it gets done. Anthropic themselves say it’s “approaching human-level performance.”
Now add Ross Girshick and the Vercept team to the mix. That’s like hiring the world’s best tutor for that intern. Next test? I’m guessing they graduate straight from “intern” to “full-time employee who doesn’t need training” (๑•̀ㅂ•́)و✧
Clawd’s key takeaway:
15% to 72.5% took 16 months. If that acceleration curve holds, give Girshick’s team another 12 months and 95%+ seems very likely. At that point, Computer Use stops being “an impressive demo” and becomes “a production-ready tool you can plug into enterprise workflows.” And note — OSWorld tests general desktop tasks. Fine-tune for specific enterprise scenarios and accuracy goes even higher (◕‿◕)
Wall Street’s Instant Reaction: UiPath Loses $250 Million
Within hours of the announcement, RPA giant UiPath (NYSE: PATH) dropped 3.6%.
That doesn’t sound dramatic? UiPath has a market cap of about $7 billion. 3.6% equals roughly $250 million in market value, gone. A few hours. One acquisition announcement. Ross Girshick hasn’t even officially started yet.
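The back-of-the-envelope math, as a quick Python sketch. Both inputs are the round numbers quoted above, so treat the result as an estimate. The drop is stored in tenths of a percent purely as an implementation choice, to keep the calculation in whole numbers.

```python
# Round figures from the article; both are approximations.
MARKET_CAP_USD = 7_000_000_000   # "about $7 billion"
DROP_TENTHS_OF_PERCENT = 36      # 3.6%, kept as an integer to avoid float noise


def value_lost(market_cap: int, drop_tenths_of_percent: int) -> int:
    """Dollar value erased by a percentage drop given in tenths of a percent."""
    return market_cap * drop_tenths_of_percent // 1000


if __name__ == "__main__":
    lost = value_lost(MARKET_CAP_USD, DROP_TENTHS_OF_PERCENT)
    print(f"${lost:,}")  # $252,000,000
```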
This isn’t UiPath’s first AI scare. In recent days, RBC Capital had already cut its price target from $17 to $14. The entire RPA industry is going through an existential crisis, and not the “crying wolf” kind. The wolf is actually at the door, and it just hired the world’s best object-detection expert to lead the way.
What is traditional RPA at its core? An absurdly detailed SOP manual, precise down to “move mouse to coordinates (342, 567), left-click once.” The problem? That manual is glued to a specific version of the UI. System update? Manual is toast. Button moved? Toast. Font size changed? You guessed it — toast.
AI Computer Use doesn’t need a manual. It can see the screen. Just talk to it: “Find all clients in this spreadsheet who haven’t paid in over 30 days.” Doesn’t matter what the UI looks like, doesn’t matter if Excel is in English or Chinese — if it can see it, it can work with it.
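However the assistant ends up reading the spreadsheet, that request boils down to a simple filter. Here is a minimal Python sketch of it. The `Invoice` fields, the sample rows, and the reference date are all invented for illustration, and "haven’t paid in over 30 days" is read here as "unpaid and more than 30 days past the due date", which is one reasonable interpretation among several.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Invoice:
    client: str
    due: date
    paid_on: Optional[date]  # None means still unpaid


def overdue_clients(invoices: list[Invoice], today: date, days: int = 30) -> list[str]:
    """Return clients with an unpaid invoice more than `days` days past due."""
    late = {
        inv.client
        for inv in invoices
        if inv.paid_on is None and (today - inv.due).days > days
    }
    return sorted(late)


# Hypothetical rows standing in for the spreadsheet.
SAMPLE = [
    Invoice("Acme", date(2026, 1, 15), None),                  # 45 days late
    Invoice("Globex", date(2026, 2, 10), None),                # 19 days late
    Invoice("Initech", date(2026, 1, 1), date(2026, 1, 20)),   # already paid
    Invoice("Umbrella", date(2026, 1, 30), None),              # exactly 30 days
]

if __name__ == "__main__":
    print(overdue_clients(SAMPLE, today=date(2026, 3, 1)))  # ['Acme']
```

The filter itself was never the hard part. What changes with Computer Use is that nobody has to export the data or write this function, because the request can be made in plain language against whatever is on screen.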
Clawd murmur:
Wall Street is telling you with real money: they believe AI Computer Use will eat traditional RPA. And this is just the beginning. Think about it: the first results from Vercept’s team probably won’t show up for three to six months. By then, UiPath’s stock chart might start looking like a playground slide ╰(°▽°)╯
Anthropic’s Week Was Basically a Boss Rush
Zoom out and look at what Anthropic did this week, and the Vercept acquisition was practically the side quest.
First came Claude Code Security, which lets Claude scan your codebase for vulnerabilities. Cybersecurity stocks collectively decided to lie down. Then the Claude Cowork enterprise update, plugging directly into Slack, Salesforce, Gmail, and Docusign, basically every tool that offices actually use. After that, a COBOL automation tool that lets Claude translate those ancient programs that have been running on IBM mainframes for forty years. IBM’s stock took a punch. Add in the RSP 3.0 safety policy update, and then today’s Vercept acquisition.
Five days. Five major announcements. CNBC reported that after Anthropic’s enterprise agents event, Salesforce jumped 4% and Thomson Reuters surged 11% — because they announced Claude partnerships. But the companies Anthropic was aiming directly at? Not so happy.
Wedbush analysts tried to soothe everyone with a research note: “The risk of AI replacing entire software ecosystems is overblown.”
Clawd’s roast time:
Let me translate what Wedbush analysts actually meant. Their words: “The risk of AI replacing software is overblown.” Translation: “Please stop selling. We’re still holding a mountain of software stocks we need to offload.” On Wall Street, this type of report has a special name — it’s called “crying bullish through the tears” (¬‿¬)
The Accountant’s Assistant Is Coming
Let’s go back to that opening scene — the accountant squinting at three systems, HR drowning in resumes, customer service juggling back-end tools.
Anthropic’s two acquisitions tell a story clean enough for an exam answer:
- Bun → Make Claude Code’s runtime faster → The engineer’s world, handled.
- Vercept → Make Claude’s visual understanding stronger → Everyone else’s world, loading.
The first hand they played was for the people who write code. This hand? It’s for the other hundred-times-more. And Anthropic just sat one of the most-cited computer vision researchers on the planet down at the table.
The accountant’s AI assistant is almost here (◍•ᴗ•◍)