| HN Mirror

Hacker News


	by sp332 877 days ago
	That’s crazy, I’ve never seen one that dropped whole layers from a pre-trained model. I guess that avoids the sparse matrix math.
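
	To make the contrast concrete, here is a rough toy sketch, not the method from any particular model or paper. It assumes a "model" that is just a list of square weight matrices, and the helper names (`make_layers`, `drop_layers`, `prune_weights`, `count_zeros`) are made up for illustration. Dropping whole layers leaves fewer matrices that are all still dense, while unstructured magnitude pruning keeps every matrix the same shape but fills it with zeros, which only pays off with sparse kernels.

```python
import random


def make_layers(n_layers, dim, seed=0):
    """Build a toy 'model': a list of dim x dim dense weight matrices."""
    rng = random.Random(seed)
    return [
        [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(dim)]
        for _ in range(n_layers)
    ]


def drop_layers(layers, keep):
    """Layer dropping: keep only the listed layers; survivors stay fully dense."""
    return [layers[i] for i in keep]


def prune_weights(layers, threshold):
    """Unstructured magnitude pruning: zero out small weights, shapes unchanged."""
    return [
        [[w if abs(w) >= threshold else 0.0 for w in row] for row in layer]
        for layer in layers
    ]


def count_zeros(layers):
    """Count weights that are exactly zero across all layers."""
    return sum(w == 0.0 for layer in layers for row in layer for w in row)


layers = make_layers(n_layers=6, dim=4)

# Hypothetical choice: keep every other layer.
dropped = drop_layers(layers, keep=[0, 2, 4])
# Hypothetical threshold: zero every weight with magnitude below 0.5.
pruned = prune_weights(layers, threshold=0.5)

print("layer dropping:", len(dropped), "layers,", count_zeros(dropped), "zero weights")
print("weight pruning:", len(pruned), "layers,", count_zeros(pruned), "zero weights")
```

	The dropped model is simply a shorter stack of ordinary dense matrices, so plain dense matmul still applies. The pruned model has the same number of layers and the same shapes, so any speedup depends on exploiting the zeros.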