by chii
687 days ago
There's some speculation that there are longer horizons to training; the term for it, amusingly, is "grokking". This video explains it: https://www.youtube.com/watch?v=Nvb_4Jj5kBo

There's some indication that we are actually undertraining by 10x.