by pyentropy 499 days ago
If the H800 is a memory-constrained model that NVIDIA built to avoid the Chinese export ban on the H100, with equivalent fp8 performance, it makes zero sense to believe Elon Musk, Dario Amodei and Alexandr Wang's claims that DeepSeek smuggled H100s.

The only reason a team would spend time on memory optimizations and writing NVPTX code rather than focusing on post-training is that they severely struggled with memory during training.

I mean, take a look at the numbers:

https://www.fibermall.com/blog/nvidia-ai-chip.htm#A100_vs_A8...

This is a massive trick pulled by Jensen: take the H100 design, whose sales are regulated by the government, make it look 40x weaker and call it the H800, while conveniently leaving 8-bit computation as fast as on the H100. Then bring it to China and let companies stockpile it, with no export controls and without disclosing production or sales numbers.

Eventually, after 7 months, the US govt starts noticing the H800 sales and introduces new export controls, but it's too late. By this point, DeepSeek has started research using fp8. They slowly build bigger and bigger models and work on bandwidth and memory consumption, until they make R1, their reasoning model.

2 comments

What's surprising is that anyone would repeat Elon Musk-related things.

Tech or politics related, he's off the deep end.

Especially since he seems intent on everyone talking about him all the time. I find it questionable when a person wants to be the centre of attention no matter what. Perhaps attention is not all we need.
Yet another casualty of laypersons browsing arXiv. That paper was like flypaper to his narcissism.
The problem is he's only wrong some of the time, and then people arguing about which one it is this time generate attention, a valuable commodity.
Maybe “some” applied in the past but his recent history might best be described as “almost always”.
Drugs. Don't do that many drugs for so long.
He's like a broken smart network switch, smart as in managed. Packets with the switch's MAC on them are all broken, but erroneously forwarded ones often have valuable data. We, looking from L3, can't tell which is which.
I'm wrong some of the time.

He's a lucky mensch, no more, no less.

Interesting how people keep calling it “the Chinese export ban”. Isn’t it an American export ban?