by pyentropy 547 days ago

If the H800 is a memory-constrained model that NVIDIA built to get around the Chinese export ban on the H100 while keeping equivalent fp8 performance, it makes zero sense to believe Elon Musk, Dario Amodei, and Alexandr Wang's claims that DeepSeek smuggled H100s.

The only reason a team would spend time on memory optimizations and writing NVPTX code rather than focusing on post-training is if they severely struggled with memory during training.

I mean, take a look at the numbers:

https://www.fibermall.com/blog/nvidia-ai-chip.htm#A100_vs_A8...

This is a massive trick pulled by Jensen: take the H100 design, whose sales are regulated by the government, make it look 40x weaker, and call it H800, while conveniently leaving 8-bit computation as fast as on the H100. Then bring it to China and let companies stockpile it without disclosing production or sales numbers, and with no export controls.

Eventually, after 7 months, the US govt starts noticing the H800 sales and introduces new export controls, but it's too late. By this point, DeepSeek has started research using fp8. They slowly build bigger and bigger models, working on bandwidth and memory consumption, until they make R1, their reasoning model.

2 comments

cyanydeez 547 days ago

What's surprising is that anyone would repeat Elon Musk-related things.

Tech or politics related, he's off the deep end.

mnky9800n 547 days ago

Especially since he seems intent on everyone talking about him all the time. I find it questionable when a person wants to be the centre of attention no matter what. Perhaps attention is not all we need.

K0balt 547 days ago

Yet another casualty of laypersons browsing arXiv. That paper was like flypaper to his narcissism.

AnthonyMouse 547 days ago

The problem is he's only wrong some of the time, and then people arguing about which it is this time generates attention, a valuable commodity.

m-s-y 547 days ago

Maybe “some” applied in the past but his recent history might best be described as “almost always”.

Muromec 547 days ago

Drugs. Don't do that many drugs for so long.

numpad0 547 days ago

He's like a broken smart network switch, smart as in managed. Packets with the switch's MAC on them are all broken, but erroneously forwarded ones often have valuable data. We, looking in through L3, can't tell which one is which.

cyanydeez 547 days ago

I'm wrong some of the time.

He's a lucky mensch, no more, no less.

schubart 547 days ago

Interesting how people keep calling it “the Chinese export ban”. Isn’t it an American export ban?
