by mellosouls
51 days ago
The page describes its relationship to nanoGPT: "...nanoGPT targets reproducing GPT-2 (124M params) and covers a lot of ground. This project strips it down to the essentials and scales it to a ~10M param model that trains on a laptop in under an hour..."