| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by mattalex 143 days ago
	There were plenty of models the size of gpt3 in industry. The core insight necessary for chatgpt was not scaling (that was already widely accepted): the insight was that instead of finetuning for each individual task, you can finetune once for the meta-task of instruction following, which brings a problem specification directly into the data stream.