| HN Mirror



	by evrydayhustling 1162 days ago
	That essay works in the context of specific datasets and tasks, which are referenced in the surrounding sentences and paragraphs. They are saying that for a particular "emergent" capability you might reach with a giant LLM, you might get there more efficiently with distillation / LoRA. My comment is about generality, which is the remaining advantage of giant models.