by ricardobeat 531 days ago
Models like Llama 3 are trained on sixteen thousand GPUs, and OpenAI probably uses 25k-100k GPUs. This is the kind of scale the sanctions make a lot harder to achieve.

> This is the kind of scale the sanctions make a lot harder to achieve

Harder and more expensive, but far from impossible. I doubt they pay more than double Nvidia's sticker price, all told. My comment was inspired by recent real-life events: Nvidia got into legal trouble in the last couple of months for turning a blind eye to questionable transactions. If you're curious about the mechanics of GPU sanctions-busting, read up on the government's accusations against Nvidia, and that was just the low-hanging fruit.

The article has one analyst speculating they used tens of thousands of H100s (50,000 IIRC) instead of the 10,000 A100s the Deepseek CEO owns up to. They can afford to pay exorbitant markups for the logistical nightmare of importation through 3rd party countries at scale.

edit: AFAIK, the sanctions don't prevent Chinese AI labs from renting GPUs from any cloud provider. To simplify logistics, a shell company could avoid shipping the cards to the mainland by simply setting up a data center in not-China and giving the parent company full access. I suppose the US government has to balance sanctions against Nvidia's share price, so they can't be too aggressive; there are just too many loopholes for a demand shock not to have been a consideration.

That really sounds plausible. They could open a training farm in Vietnam or Mongolia (or even Taiwan, or really anywhere the internet goes) and just use it; the GPUs don’t need to be located in the mainland. The only way to lock down the GPUs would be to completely restrict who could use them, and then prevent those users from contracting with sanctioned entities.

When I was working in Beijing, we definitely had resources we couldn’t access locally but could easily access remotely so it didn’t really matter.

Singapore, Taiwan and South Korea are close by and large and stable enough to easily host such data centers. Small batches for a team's experimentation can be easily smuggled. China is too large and powerful to be completely sanctioned.

The Information claimed today that ByteDance is renting GPUs in the cloud, although ByteDance denies it (well, they call it "inaccurate", which is not exactly a strong rebuttal).

https://www.tomshardware.com/tech-industry/artificial-intell...

You are underestimating the power of the “black market”. North Korea and Cuba are shithole countries because their governments are shit, not solely because of sanctions.

If you follow the news, several people (bankers) will trade with Iran despite the repercussions (jail) of doing so. There is a premium but at the right price someone will execute.