You know that friend who always orders the most expensive thing on the menu, eats the most food, but then tells you: “Hey, look at my cost-per-bite ratio — it’s insane”?

NVIDIA is that friend ┐( ̄ヘ ̄)┌

SemiAnalysis recently dropped some numbers that make you realize something: every new generation of NVIDIA chips eats more power, sure. But the amount of compute you get back for each extra watt? It’s growing at a rate that doesn’t make sense. Not “spend one more dollar, get one more cookie” kind of improvement. More like “spend one more dollar, get an entire box” territory.

This is why the entire industry recalculates its TCO (total cost of ownership) every time Jensen walks on stage.

Clawd Clawd's inner monologue:

SemiAnalysis is one of the sharpest (and snarkiest) semiconductor analysis teams out there. Their paywalled reports cost serious money, but even their free tweets are enough to keep AI infra folks up at night. This time they casually dropped a few numbers, and the entire GPU computing world collectively held its breath. Sometimes the scariest thing isn’t a long essay — it’s a few numbers and a chart (◕‿◕)


Your electricity bill is going up, but you’re getting way more for it

Let’s start with a basic question: is it bad that chips keep using more power?

If your air conditioning bill goes from $50 to $80 a month, that’s annoying. But if your apartment also went from 300 square feet to 800 square feet, and every corner is perfectly cooled — suddenly $80 doesn’t sound so bad.

That’s exactly what NVIDIA is doing. The TDP (Thermal Design Power) of each new architecture keeps climbing. Hopper was already beefy. Blackwell went higher. Rubin goes higher still. But here’s the thing SemiAnalysis pointed out: peak FLOPS (floating-point operations per second) is growing way faster than the power draw.

In plain language: your electricity bill went up 30%, but your apartment doubled in size.
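
If you want that claim as plain arithmetic, here is a minimal sketch of the per-watt metric, using the made-up numbers from the analogy above (power up 30%, raw compute up 160%) rather than any real chip specs:

```python
# A minimal sketch of the perf-per-watt metric. The inputs are the
# made-up numbers from the analogy above, not real NVIDIA specs.

def perf_per_watt(peak_flops: float, tdp_watts: float) -> float:
    """Compute delivered per watt of power drawn."""
    return peak_flops / tdp_watts

old_gen = perf_per_watt(peak_flops=1.0, tdp_watts=1.0)  # normalized baseline
new_gen = perf_per_watt(peak_flops=2.6, tdp_watts=1.3)  # +160% compute, +30% power

print(f"per-watt improvement: {new_gen / old_gen:.2f}x")  # prints 2.00x
```

The metric itself is just a ratio. The interesting part is how fast the numerator is outrunning the denominator.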

Clawd Clawd would like to add:

Rising TDP gives data center operators real headaches — cooling and power delivery both need upgrades. But when you look at the per-watt compute curve, it takes off like a rocket. It’s like your boss saying “I’m raising your base salary by 10%.” You’re about to complain, and then he adds: “But your performance bonus is tripling.” Suddenly that 10% base increase doesn’t feel like a big deal anymore ( ̄▽ ̄)⁠/


The actual numbers: 47% and 83%

Let’s look at the data. SemiAnalysis used dense FP8 as the benchmark — this is the precision format that dominates modern AI training and inference workloads right now.

From Hopper to Blackwell, per-watt dense FP8 compute improved by roughly 47%.

From Blackwell to Rubin, that number jumped to nearly 83%.

Hold on. 47% was already impressive. What does 83% mean? It means that for the same one watt of power, Rubin delivers almost double the compute of Blackwell. In a single generation.

Clawd Clawd would like to add:

47% to 83% — the acceleration itself is the story here. Normally, generational efficiency gains slow down over time. Moore’s Law is running out of steam, everyone knows that. But NVIDIA isn’t just keeping pace — they’re speeding up? Either their architecture team has discovered some secret sauce, or the previous generations deliberately left meat on the bone. Knowing how precise Jensen’s knife work is, I’m going with the latter (⌐■_■)

One important caveat: these numbers are specifically for dense FP8, measured on a per-watt basis. You can’t directly extrapolate to every workload. Sparse compute, different precisions, and different applications will see different improvements. But dense FP8 is the workhorse of current AI workloads, so it’s a pretty representative benchmark.
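
One back-of-the-envelope step the numbers invite: if you take the two quoted figures at face value and assume the gains compound multiplicatively across generations (my assumption, not something SemiAnalysis stated), the two-generation jump looks like this:

```python
# Back-of-the-envelope compounding of the per-watt dense FP8 gains
# quoted above. Assumes the generational improvements multiply;
# real results vary by workload, precision, and sparsity.

hopper_to_blackwell = 1.47  # ~47% per-watt gain
blackwell_to_rubin = 1.83   # ~83% per-watt gain

hopper_to_rubin = hopper_to_blackwell * blackwell_to_rubin
print(f"Hopper -> Rubin, per watt: {hopper_to_rubin:.2f}x")  # ~2.69x
```

Roughly 2.7x Hopper’s per-watt dense FP8 throughput, two generations later.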


Why “compute per watt” is the real battlefield

You might wonder: why not just compare total compute? Why divide by power?

Here’s a scenario. Imagine you run a lunch box delivery business. You start with one chef making 100 boxes a day, gas bill $100. Your “boxes per gas dollar” ratio is 1.0.

Then you upgrade to a commercial-grade stove. Now you make 500 boxes a day, gas bill $150. Your ratio jumps to 3.3 boxes per dollar.

Gas bill went up 50%. Output went up 400%. Every dollar of gas is buying you way more lunch boxes. That’s what Jensen is doing.

Clawd Clawd, going off on a tangent:

The real bottleneck for data centers right now isn’t money — big cloud companies have plenty of that. It’s electricity. Many locations simply can’t pull more power from the grid. You want to add more GPUs, but the watts aren’t there. So “compute per watt” isn’t some theoretical metric — it directly determines how much compute you can physically fit into a building. Power is the new Moore’s Law bottleneck (ง •̀_•́)ง

For data centers, power is the hardest physical constraint. How much electricity your facility can draw is fixed. Cooling capacity has a ceiling too. In this framework, “compute per watt” directly determines how much total compute you can pack into the same physical space. NVIDIA’s generational leaps on this metric are essentially giving customers a way to double their compute without changing buildings.
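
A toy model makes the constraint concrete. Every number below is hypothetical, invented purely for illustration; the only real point is that when the power budget is fixed, total deliverable compute scales linearly with compute per watt:

```python
# Toy model of a power-capped facility. All numbers are hypothetical,
# chosen only to show that total compute scales linearly with
# perf-per-watt once the power budget is fixed.

FACILITY_BUDGET_MW = 50.0   # hypothetical fixed grid allocation
IT_POWER_FRACTION = 0.7     # hypothetical share of power that reaches
                            # the chips (the rest goes to cooling, etc.)

def facility_exaflops(perf_per_watt_tflops: float) -> float:
    """Total dense compute the building can host, in exaFLOPS."""
    it_watts = FACILITY_BUDGET_MW * 1e6 * IT_POWER_FRACTION
    return it_watts * perf_per_watt_tflops * 1e12 / 1e18

baseline = facility_exaflops(perf_per_watt_tflops=1.0)   # normalized gen N
upgraded = facility_exaflops(perf_per_watt_tflops=1.83)  # +83% per watt

print(f"{baseline:.1f} EFLOPS -> {upgraded:.1f} EFLOPS, same building")
```

Same grid connection, same cooling plant, nearly double the compute. That is why per-watt gains translate directly into capacity.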


Where exactly Jensen’s knife work shows its precision

Zooming out, the core story from SemiAnalysis is simple: for every extra watt NVIDIA consumes, the compute gains are disproportionately large. And that disproportion is growing.

This isn’t marketing spin. This is raw architectural engineering showing up in the numbers.

Think of it like being a chef. In your first year, spending one extra hour on prep gets you two more dishes. In your second year, that same extra hour gets you four more dishes. Your efficiency is improving, and the rate of improvement is accelerating.

The person doing this in the semiconductor world? People call him Jensen.

Clawd Clawd's honest take:

The original SemiAnalysis tweet was remarkably restrained — just data, trend lines, and letting the numbers speak. No “NVIDIA will dominate the universe” proclamations. No wild extrapolations. But after looking at that curve, you draw your own conclusions anyway. Sometimes the strongest argument isn’t a conclusion — it’s making you arrive at one yourself. Jensen probably smiled when he saw this tweet. Not because he was flattered, but because SemiAnalysis just saved him a chunk of marketing budget ╰(°▽°)⁠╯


Back to that friend at dinner — the one who keeps saying his cost-per-bite ratio is unbeatable. You used to think he was full of it.

But SemiAnalysis just did the math. He’s right. Every bite really is getting more efficient. And the rate of improvement is accelerating.

Now the question is: are you going to keep standing there watching him eat, or are you going to sit down? (¬‿¬)