HN Mirror



	by jstx1 1015 days ago
	> Train your LLM at scale on our infrastructure

	Is it really their infrastructure, or are they using a cloud provider and this wraps it up and provides convenience for a price?

3 comments

brucethemoose2 1014 days ago

Azure and the like get such massive cost benefits from scale that HF's own GPUs would probably be more expensive anyway, even if they go AMD/Intel.

It does seem like they should run their own storage nodes, with the sheer quantity of models they host...


fxtentacle 1014 days ago

Everyone claims that, yet I have never seen it happen.

Typically, small companies get rebates on NVIDIA GPUs, but big established ones do not. So I would expect a startup with 100 GPUs to pay less per GPU than Azure.


jsemrau 1015 days ago

I'd think "infrastructure" includes the nice front end and Python API that they have already proven capable of pulling off.


perfmode 1015 days ago

What’s the difference?


melx 1015 days ago

You end up paying more in the latter instance.


marcinzm 1015 days ago

Not counting the cost of learning how to cluster together 500 GPUs, the cost of learning how to train models efficiently on 500 GPUs, the cost of convincing a cloud provider to let you get 500 GPUs, the cost of trying to find a cloud provider that actually has 500 GPUs you can book, etc, etc.
