| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by zackangelo 361 days ago
	Yes, this is true. A lot of times labs will hold back necessary infrastructure pieces that allow them to train huge models reliably and on a practical time scale. For example, many have custom alternatives to Nvidia’s NCCL library to do fast distributed matrix math. Deepseek published a lot of their work in this area earlier this year and as a result the barrier isn’t as high as it used to be.