| HN Mirror



	by blackeyeblitzar 791 days ago
	> It looks like a mid-level implementation of training and inference.

	I’m not familiar with how any of this works, but what does state-of-the-art training look like? Almost no models release their training source code, data sets, preprocessing, or evaluation code. So is it even known what the high-level implementation is?

1 comment

This is probably a good baseline to start thinking about LLM training at scale.