by hskalin
915 days ago
2.5 years of effort, but the author certainly didn't spend even 2.5 minutes writing code for more descriptive outputs. Meanwhile, I'm training it myself to see how well it works, but I only have a small GPU. I would appreciate it if someone could provide a trained model.
Output after 3 hours of training (loss = 0.4160):
From the looks of it, it's similar to MinGPT/nanoGPT (I found a Reddit thread [0] where the author compares it to MinGPT).

[0] https://www.reddit.com/r/MachineLearning/comments/o2u2cm/pwy...