by elorant 537 days ago
How is China training models without access to cutting-edge GPUs?
6 comments

They have access to cutting-edge GPUs via rentals:

https://www.msn.com/en-us/money/markets/bytedance-plans-to-s...

- using non-cutting-edge GPUs (just more of them)

- creating more efficient models, such as the MoE-based DeepSeek

- getting their hands on cutting-edge GPUs all the same

I think it was Dylan Patel (from SemiAnalysis) on Dwarkesh who mentioned that one scam is for a Chinese source to arrange for a SOTA Nvidia cluster to be bought and installed in some non-embargoed country, then dismantled and shipped to China.

By the looks of things, they're being more efficient about it rather than brute-forcing it...

https://x.com/karpathy/status/1872362712958906460

The same way drug users access illegal drugs.
Pretty easily, it turns out?
very carefully