by killerstorm
848 days ago
There's a principle more powerful than the bitter lesson: GIGO (garbage in, garbage out). Training a model to predict an internet dump can only get you so far. There's a paper called something like "learning from textbooks" that shows a small model trained on a high-quality, no-nonsense dataset can beat a much bigger model at a task like Python coding.