|
|
|
|
|
by mattalex
97 days ago
|
|
There were plenty of models the size of gpt3 in industry. The core insight necessary for chatgpt was not scaling (that was already widely accepted): the insight was that instead of finetuning for each individual task, you can finetune once for the meta-task of instruction following, which brings a problem specification directly into the data stream. |
|