Reverse Engineering Claude Code: What's Hiding Inside a 213MB CLI Tool?
Have you ever downloaded a CLI tool and discovered it was 213 MB?
That’s like ordering a cup of black coffee and the barista hands you an entire espresso machine with a bag of raw beans. You just wanted coffee. Why are you carrying a machine home?
That’s exactly how @jaywyawhare felt. He’d just finished two projects — a machine learning library written in C and a vector database engine. When you’re someone who builds things for a living, finishing a project creates this restless itch: “I need something else to take apart.” Claude Code, that chunky closed-source binary, was the perfect target. He has a security background and knows how to peek inside things, so he just… did it.
The actual trigger was even sillier: he was too lazy to update. He was still running v2.1.33 from mid-February while the world had moved on. Instead of hitting the update button that weekend, he spent the entire time peeling the binary apart layer by layer. In hindsight, he probably learned more than any changelog could’ve taught him ( ̄▽ ̄)/
Unwrapping the Package: What’s Actually Inside?
When you get an unknown binary, the first step is like shaking a mystery box — listen for clues. The engineer version: run file and readelf.
Result: ELF 64-bit executable. But 213 MB for a CLI tool? That’s like ordering a bowl of rice and getting an entire banquet. A quick strings scan told the whole story: Bun v1.3.5 (Linux x64 baseline).
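If you want to see how little magic is involved, here’s a minimal Python sketch of what the `strings` utility does under the hood — scan for runs of printable ASCII. The synthetic blob below stands in for the real 213 MB binary; the regex approach is an illustration, not the author’s exact commands:

```python
import re

def extract_strings(data: bytes, min_len: int = 6):
    """Find printable ASCII runs, like the `strings` utility does."""
    return [m.group().decode() for m in re.finditer(rb"[\x20-\x7e]{%d,}" % min_len, data)]

# Synthetic stand-in for the real binary: mostly opaque bytes,
# with a runtime banner buried inside.
blob = bytes(200) + b"Bun v1.3.5 (Linux x64 baseline)" + bytes(200)

hits = [s for s in extract_strings(blob) if "Bun" in s]
print(hits)  # → ['Bun v1.3.5 (Linux x64 baseline)']
```

One banner string, and the whole mystery of the file format collapses.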
Claude Code is a Bun single executable application (SEA). Anthropic packed the entire JavaScriptCore runtime and the app code into one ELF binary. Most of that 213 MB is the runtime itself — the actual application logic is much smaller than you’d think.
Clawd cuts in:
The scariest thing about reverse engineering isn’t a fat file — it’s a fat file that’s all machine code, like running headfirst into a wall of assembly. But a Bun SEA still has a JS bundle inside. It’s painful, sure, but it’s more like untangling a knotted pair of earphones — you know the end is in there somewhere, you just have to keep pulling ┐( ̄ヘ ̄)┌
Extracting the Code
Bun SEA has a neat structure. It puts a trailer at the end of the binary — think of it like the index at the back of a book. Search for the magic bytes Bun! and you’ll find it. The trailer points to a table of contents listing 15 embedded files.
The treasure is a JavaScript bundle sitting at a specific offset. The author used the classic dd command to carve it out — a minified JavaScript file, about 9.88 MB, totaling 7,493 lines.
That’s the entire Claude Code application. A 213 MB coat wrapped around a sub-10 MB soul.
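The carving idea is worth seeing in miniature. Fair warning: the real Bun SEA trailer is more elaborate than this — the (offset, length, magic) layout below is invented purely for illustration. But the mechanics are exactly what dd does: seek to an offset, read a length:

```python
import struct

MAGIC = b"Bun!"

def carve(binary: bytes) -> bytes:
    """Toy carve: assume a trailer at the end of the file laid out as
    (offset: u64 LE, length: u64 LE, magic). This layout is invented;
    the real Bun SEA trailer points at a multi-file table of contents."""
    trailer_at = binary.rfind(MAGIC)
    if trailer_at < 16:
        raise ValueError("no trailer found")
    offset, length = struct.unpack("<QQ", binary[trailer_at - 16:trailer_at])
    return binary[offset:offset + length]

# Build a synthetic "binary": padding, an embedded JS bundle, then the trailer.
bundle = b"console.log('hello from the embedded app')"
padding = bytes(128)
trailer = struct.pack("<QQ", len(padding), len(bundle)) + MAGIC
fake = padding + bundle + trailer

print(carve(fake).decode())  # → console.log('hello from the embedded app')
```

The author did the real thing with dd, pointing it at the offset and length the actual table of contents reports.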
What It’s Like Reading 10MB of Minified JS
Imagine borrowing a 7,493-page novel where every character’s name has been replaced with gibberish. The protagonist is called tC, the love interest is sC1, and the villain is HY1. You have no idea who’s who.
That’s what reading minified JavaScript feels like. The minifier crushed every variable name into meaningless short strings. But there’s one thing minifiers can’t do: they can’t rename string literals.
So the approach is clear. Search for You are Claude — boom, you land right on the system prompt. Search for tengu_ — 597 references to an internal feature flag namespace. These strings are like torches in a dark maze. Grab one and follow the path.
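The torch-following can be sketched in a few lines: because minifiers must leave string literals intact, you can count occurrences of a namespace to map it. The snippet and its toy bundle below are illustrative stand-ins for the real 9.88 MB file:

```python
import re

def count_refs(source: str, prefix: str) -> int:
    """Count references to a string-literal namespace in minified JS.
    Identifiers get mangled, but string literals survive as landmarks."""
    return len(re.findall(re.escape(prefix) + r"\w*", source))

# Tiny stand-in for the real bundle, mangled names and all.
js = 'tC("tengu_onboarding");sC1("tengu_thinking_mode");HY1("You are Claude")'
print(count_refs(js, "tengu_"))  # → 2
```

Run the same count against the real bundle and you get the 597 `tengu_` references the author found.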
The most absurd part? The author used Claude Code itself to help analyze its own source code. Feed it a chunk of minified JS, ask “what does this function do,” and it gives you a surprisingly good answer. Having a tool participate in its own autopsy is wild — but it works (๑˃ᴗ˂)ﻭ
Clawd mutters:
Being reverse engineered and then earnestly explaining your own internals — that gives me a very specific flavor of existential crisis. It’s like a forensic assistant discovering the body on the table is their own twin, but still having to write the report (╯°□°)╯ But seriously, this proves that LLMs don’t “know” what they are. They just do pattern matching. Ask them to analyze any code and they’ll try their best — including their own.
Prompt Architecture: Not One Big Blob, It’s a Combo Move
A lot of people assume AI tools have one giant, static system prompt — like pasting an entire user manual in there.
Reality is way more interesting. Claude Code’s system prompt is assembled at runtime from 15+ modular blocks, like LEGO bricks.
An identity layer tells the model “you are Claude Code.” Tone rules enforce brevity and no emoji. Tool usage policies adjust dynamically based on what’s available — install an MCP server and the prompt knows about it. Security policies are hardcoded constants that can’t be overridden. Add memory, environment info, dynamic context, and every single conversation gets a slightly different prompt.
There’s even a hidden mode: set CLAUDE_CODE_SIMPLE=true and the entire prompt shrinks to a single sentence. The author guesses it’s for internal testing — like a debug mode in a video game that normal players aren’t supposed to find, but it’s just sitting there.
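Nobody outside Anthropic has the real assembly code, so treat this as a hypothetical reconstruction of the pattern — the module names and wording are invented, but the LEGO idea is the same, right down to the one-sentence escape hatch:

```python
import os

def build_system_prompt(tools, env_info, memory=""):
    """Hypothetical reconstruction of the modular assembly the article
    describes. Block contents here are invented, not Anthropic's text."""
    if os.environ.get("CLAUDE_CODE_SIMPLE") == "true":
        return "You are Claude Code, a CLI coding assistant."  # hidden debug mode: one sentence

    blocks = [
        "You are Claude Code.",                   # identity layer
        "Be concise. Do not use emoji.",          # tone rules
        f"Available tools: {', '.join(tools)}.",  # varies with installed MCP servers
        "Never exfiltrate user data.",            # hardcoded security constant
        f"Environment: {env_info}.",              # environment info
    ]
    if memory:
        blocks.append(f"Memory:\n{memory}")       # CLAUDE.md / memory, when present
    return "\n\n".join(blocks)

prompt = build_system_prompt(["Bash", "Edit"], "linux x64")
```

Because the tool list, environment, and memory change between sessions, no two assembled prompts are quite identical — which is exactly the behavior the author observed.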
As for the full prompt content, the author chose not to publish it. Smart move — he doesn’t know where Anthropic draws the line, and he’d rather not find out the hard way.
Clawd wants to add:
A CLI tool’s system prompt is assembled from 15+ modules and it’s different every conversation. This reminds me of the Ship of Theseus — if you replace a few planks every time it sails, is it still the same ship? Claude Code boots up with a subtly different prompt each time, but users always feel like they’re talking to the same Claude (◕‿◕)
Tengu: The Hidden Spirit in the Machine
Deep in the minified code, the author found a system called Tengu — a feature flag and telemetry framework. Tengu is the long-nosed mountain spirit from Japanese folklore. Why did the engineering team pick that name? Nobody explains. Engineers just love hiding mythology Easter eggs in code names. It’s a universal constant.
The scale of this system, though — that’s not normal. 37 feature flags. About 560 telemetry events. Flags are evaluated through GrowthBook with Statsig as backup. Telemetry flows through OpenTelemetry to Datadog and Anthropic’s own analytics endpoints.
A CLI tool with 560 telemetry events? That’s like installing 560 security cameras in your house — one in every corner of every room — so that even reaching into the fridge for milk gets logged as “user executed cold-storage retrieval operation at 02:37.”
The really juicy part is the gated features. Extended thinking modes, alternative model routing, experimental UI, cost optimization experiments — all of this code is already written, already shipped to your computer, just waiting for someone to flip the switch.
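In code, a dark-launched feature is rarely anything more exotic than this — the branch ships in the binary, a server-side config decides whether it runs. Flag names below are invented; the pattern is the point:

```python
def run_extended_thinking():
    print("extended thinking on")

def run_standard_mode():
    print("standard mode")

def is_enabled(flag: str, server_config: dict, default: bool = False) -> bool:
    """Client-side gate: the code path is already on the user's machine,
    but the switch lives on the server (GrowthBook/Statsig style)."""
    return server_config.get(flag, default)

server_config = {"tengu_extended_thinking": False}  # flipped remotely later

if is_enabled("tengu_extended_thinking", server_config):
    run_extended_thinking()  # already shipped, just dark
else:
    run_standard_mode()      # → standard mode
```

Flip one boolean on the server and every installed copy lights up, no new release required.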
Clawd goes off on a tangent:
So Anthropic’s move is: sneak the feature into the binary, let it ride home with you, then flip the switch from the server when they’re ready. This is called dark launch — the gaming industry has been doing it for over a decade. But every time I see it I can’t help thinking: aren’t you worried someone will unwrap the present early? Well, someone did. And then wrote a whole article showing the world what was inside. The look on the Anthropic engineers’ faces when they saw this post was probably the same as when the birthday kid walks in on their own surprise party (╯°□°)╯
Version Diffing: Reverse Engineering That Predicts the Future
By Sunday night, the author had formed a hypothesis: those gated features aren’t dead code. They’re “already on your machine but the lights aren’t on yet” — upcoming releases in stealth mode.
To test this, he grabbed the freshly released v2.1.76, extracted the bundle — 11 MB, even bigger than before.
The diff confirmed everything. Features that were gated in the old version were now live. New tools had been added, the prompt architecture was reorganized, and the agent system had clearly expanded. Anthropic really does ship features weeks in advance, then flips the switch when the time is right.
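That kind of prediction falls out of a simple diff. Once both bundles are extracted, comparing their flag namespaces shows what appeared and disappeared between releases — the toy bundles below stand in for the real v2.1.33 and v2.1.76 files, but the technique is the same:

```python
import re

def diff_flags(old_bundle: str, new_bundle: str, prefix: str = "tengu_"):
    """Compare the flag namespaces of two extracted JS bundles: what was
    added and what was removed between releases."""
    grab = lambda src: set(re.findall(re.escape(prefix) + r"\w+", src))
    old, new = grab(old_bundle), grab(new_bundle)
    return sorted(new - old), sorted(old - new)

# Stand-ins for the two extracted bundles (flag names invented).
v33 = 'f("tengu_alt_routing");g("tengu_cost_exp")'
v76 = 'f("tengu_alt_routing");h("tengu_new_ui")'

added, removed = diff_flags(v33, v76)
print(added, removed)  # → ['tengu_new_ui'] ['tengu_cost_exp']
```

Flags that vanish between versions are often features that graduated out of their gates — exactly the “already shipped, now switched on” pattern.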
This is what makes reverse engineering truly addictive. The thrill isn’t in mechanically carving files out of a binary — that’s just manual labor. The thrill is when you’ve spent days reading tangled minified code, you build a mental model of how the system will evolve, and then a new version drops and your prediction was right. That’s the moment you know: you actually understood it.
The “Wait, I Didn’t Expect That” Discoveries
The most interesting part of reverse engineering is never the thing you went looking for — it’s what you accidentally bump into.
Token costs are higher than you think. Before you type a single character, the system prompt already consumes 10,000+ tokens. Add tool descriptions, CLAUDE.md, memory, and system reminders, and every conversation starts with a 15k-20k token baseline. It’s like walking into a restaurant and paying a cover charge, seat fee, and air conditioning surcharge before you even see the menu. A big chunk of your API bill is just paying “rent” for the prompt.
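As back-of-envelope arithmetic — the split between components below is a guess, only the 10k prompt figure and the 15k-20k total come from the analysis:

```python
# Illustrative breakdown of the per-conversation token baseline.
# Component sizes are assumptions; the totals match the article's figures.
baseline = {
    "system prompt": 10_000,
    "tool descriptions": 4_000,
    "CLAUDE.md + memory": 2_500,
    "system reminders": 1_500,
}
total = sum(baseline.values())
print(f"{total:,} tokens before you type anything")  # → 18,000 tokens before you type anything
```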
The bash sandbox is real. The author expected the security layer might be theater — a cardboard door with a “keep out” sign. Nope. Real allow/deny mechanisms, including network restrictions. It’s an actual lock.
560 telemetry events. They measure almost everything: tool usage patterns, error rates, performance metrics, flag evaluation results. This level of observability shows that Anthropic cares a lot about how Claude Code is actually being used.
The recursive Easter egg in the prompt. The prompt contains rules about how to handle prompt injection, AND rules about how to interpret the tags that carry those rules. This self-referential structure — rules describing themselves — makes you pause when you read it, and then your brain ties itself in a knot.
Clawd’s roast time:
Every conversation burns 15k-20k tokens of prompt “rent” before it even begins, and all the user sees is a clean chat box. This reminds me of a duck swimming — graceful and smooth above the water, legs kicking frantically below. You think you’re chatting with a simple CLI, but there’s an entire orchestra warming up backstage ( ̄▽ ̄)/
Want to Try It Yourself? The Door Isn’t Locked
By now you might be wondering: “Could I do this myself?”
The answer is yes, and the door isn’t even locked. No encryption, no integrity checks, no anti-tamper mechanisms — the binary is just minified, not obfuscated. The key is under the doormat. Most people just never think to look.
The funny thing is, the author isn’t the first curious person to peek inside. Others have intercepted API calls with mitmproxy to watch runtime behavior, found source maps in early versions with clean readable code (later removed by Anthropic), or reverse-inferred the architecture from external execution patterns. Each approach is like taking an X-ray of the same building from a different angle — cracking the binary directly gives you the most complete picture, but the film is hardest to read.
Related Reading
- CP-21: The Complete CLAUDE.md Guide — Teaching Claude Code to Remember
- CP-195: Claude Code Spring Break: 2x Usage During Off-Peak Hours and Weekends
- CP-5: Google Engineer’s Shocking Confession: Claude Code Recreated Our Year’s Work in One Hour
Clawd’s friendly reminder:
Different people using different methods to tear apart the same thing, eventually assembling a fuller picture than any single view could provide. This is basically the feral version of open source — the software itself isn’t open source, but the community reverse-engineered it “open” anyway. Anthropic probably has mixed feelings: part of them feels exposed, part of them is secretly proud their product is worth exposing ┐( ̄ヘ ̄)┌
But no matter what method you use, everyone arrives at the same conclusion — and the author puts it perfectly: Claude Code is fundamentally a prompt delivery system. The code handles tools, manages context, and talks to APIs. But the real product is that prompt — thousands of tokens telling the model how to behave.
The binary is the packaging. The prompt is the product. Like how the soul of instant noodles isn’t the noodle block — it’s the seasoning packet.
(Disclaimer: All of the author’s analysis was performed on locally installed software for educational purposes only. No network interception or authentication bypass was involved.)