Hacker News
by
Der_Einzige
1217 days ago
In general, most giant LLMs are extremely undertrained at this time. Consider that most of the gains of RoBERTa over BERT came from simply continuing to train.
2 comments
stevenhuang
1217 days ago
Undertraining can be observed whenever the output repeats gibberish or gets stuck in loops. This happened a lot in the GPT-2 AI Dungeon days.
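A rough way to flag that kind of looping is to measure how many word n-grams in a generated text are repeats of earlier ones. This is only a minimal sketch, assuming whitespace tokenization; the function name `repeated_ngram_fraction` and the default `n=3` are hypothetical choices for illustration, not part of any library. A high score only indicates repetition, which can also come from decoding settings and not from undertraining alone.

```python
from collections import Counter


def repeated_ngram_fraction(text: str, n: int = 3) -> float:
    """Return the fraction of word n-grams that duplicate an earlier n-gram.

    Hypothetical helper for illustration: 0.0 means no n-gram repeats,
    values near 1.0 mean the text is mostly a loop.
    """
    words = text.split()  # assumption: plain whitespace tokenization
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0.0  # text shorter than n words: nothing to compare
    counts = Counter(ngrams)
    # Every occurrence beyond the first of each n-gram counts as a repeat.
    repeats = sum(c - 1 for c in counts.values())
    return repeats / len(ngrams)


looping = "the door opens " * 10
normal = "You walk into the cave and find a rusty sword on the floor"

print(repeated_ngram_fraction(looping))
print(repeated_ngram_fraction(normal))
```

The looping sample scores far higher than the normal sentence; any threshold for calling an output "stuck" would have to be tuned by hand.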
leobg
1216 days ago
So can we continue training RoBERTa to get it to, say, GPT-3 Ada level?