| HN Mirror



	by Traster 199 days ago
	Training is taking one enormous problem, breaking it into lots of pieces, and managing the data dependencies between those pieces. It's solving one really hard problem. Inference is the opposite: lots of small independent problems. All of this "we have X many widgets connected to Y many high bandwidth optical telescopes" is a training problem that they need to solve. Inference is "I have 20 tokens and I want to throw them at these 5,000,000 matrix multiplies, oh and I don't care about latency".
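	A toy sketch of the distinction drawn above, not anything from a real training stack: the names `matvec`, `infer`, and `train` are made up for illustration, and the "model" is just a 2x2 weight matrix. Inference requests only read the weights, so they can run in any order or in parallel. Each training step reads the weights the previous step wrote, so the steps form a chain.

```python
from concurrent.futures import ThreadPoolExecutor


def matvec(weights, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def infer(weights, x):
    # One request: reads the weights, writes nothing shared.
    return matvec(weights, x)


def train(weights, batches, lr=0.5):
    # Plain SGD on squared error (an assumption chosen for brevity).
    # Every step depends on the weights produced by the step before it.
    for x, target in batches:
        pred = matvec(weights, x)
        err = [p - t for p, t in zip(pred, target)]
        weights = [
            [w - lr * e * xi for w, xi in zip(row, x)]
            for row, e in zip(weights, err)
        ]
    return weights


weights = [[1.0, 0.0], [0.0, 1.0]]
requests = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

# Independent requests: fanning out across workers gives the same answers.
with ThreadPoolExecutor() as pool:
    parallel = list(pool.map(lambda x: infer(weights, x), requests))
serial = [infer(weights, x) for x in requests]

# Dependent steps: reordering the batches changes the final weights.
a = ([1.0, 0.0], [0.0, 0.0])
b = ([1.0, 1.0], [0.0, 0.0])
forward = train(weights, [a, b])
backward = train(weights, [b, a])
```

	Here `parallel == serial` holds because no request touches another's state, while `forward` and `backward` differ because each update is computed from the previous one. That ordering constraint is the data dependency that the interconnect has to serve once training is spread across many devices.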

1 comments

I can't think of any case where inference doesn't care about latency.

I can't think of any reason training isn't going to become real time with a significant CPU budget.