| HN Mirror



	by mrazomor 732 days ago
	This assumes the common resources (CPU, RAM, etc.), not the ones required for LLM training (GPU, TPU, etc.). It's a different economy. TL;DR: it's not ~free.

1 comments

akutlay 732 days ago

Why does GPU matter? Do you think GCP keeps GPU utilization at 100% at all times?


mrazomor 732 days ago

What the OP is referring to requires overprovisioning for the high-priority traffic and a sine-like utilization pattern (without it, the benefits of the "batch" tier are close to zero: preemption is too high for any meaningful work when you are close to the top of the utilization hill).

You get that organically when you are serving lots of users. And there aren't many GPUs etc. used for that. Training LLMs gives you a different utilization pattern. The "best effort" resources aren't as useful in that setup.
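
To make the "utilization hill" concrete, here is a rough sketch of spare capacity under a sine-like daily serving load. Everything in it is an assumption for illustration: the function name `spare_capacity`, the 24-hour period, and the capacity, mean load, and amplitude values are made up and do not describe any real cluster.

```python
import math

def spare_capacity(hour, capacity=100.0, mean_load=60.0, amplitude=30.0):
    """Idle units at a given hour under a toy 24-hour sinusoidal serving load.

    All parameters are hypothetical; capacity is the overprovisioned total.
    """
    load = mean_load + amplitude * math.sin(2 * math.pi * hour / 24)
    return max(0.0, capacity - load)

# Spare capacity for each hour of the toy day.
spare_by_hour = [spare_capacity(h) for h in range(24)]

# In this toy model the load peaks at hour 6 and bottoms out at hour 18.
print("spare at peak:  ", round(spare_capacity(6), 1))
print("spare at trough:", round(spare_capacity(18), 1))
```

With these made-up numbers, batch work has plenty of room at the trough and very little near the peak, which is where preemption would hit hardest. A training job that wants a flat, constant allocation does not fit this shape.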


bbminner 732 days ago

Because accelerators (TPUs, GPUs), unlike RAM/CPU, are notoriously hard to timeshare and virtualize. So if you get evicted in an environment like that, you have to reload your entire experiment state from a model checkpoint. With giant models like that, it might take dozens of minutes. As a result, I doubt that these experiments are done using "spare" resources. In that case, constant interruptions and reloading would result in these experiments finishing sometime around the heat death of the universe :)
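
The eviction argument can be put into a back-of-the-envelope calculation. This is a minimal sketch under a deliberately crude model, not a measurement: it assumes evictions arrive at a fixed interval, each one costs a full checkpoint reload, and on average half a checkpoint interval of work is lost. The function names and all numbers are hypothetical.

```python
def useful_fraction(minutes_between_evictions, reload_minutes,
                    checkpoint_interval_minutes):
    """Fraction of wall-clock time that produces new training progress.

    Toy model: each eviction costs one reload plus, on average, half a
    checkpoint interval of work that has to be redone.
    """
    lost_work = checkpoint_interval_minutes / 2
    overhead = reload_minutes + lost_work
    remaining = minutes_between_evictions - overhead
    return max(0.0, remaining / minutes_between_evictions)

def wall_clock_hours(compute_hours, **eviction_params):
    """Wall-clock hours needed to finish a job of the given compute hours."""
    fraction = useful_fraction(**eviction_params)
    # Zero useful fraction means the job never finishes in this model.
    return float("inf") if fraction == 0 else compute_hours / fraction

# Hypothetical example: evicted hourly, 30-minute reload, checkpoint every 20 min.
print(wall_clock_hours(100, minutes_between_evictions=60,
                       reload_minutes=30, checkpoint_interval_minutes=20))
```

In this model, once the reload time plus the lost work approaches the time between evictions, the useful fraction drops to zero and the job never completes, which is the "heat death of the universe" case.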
