by hskalin 915 days ago
2.5 years of effort, but the author certainly didn't spend even 2.5 minutes writing code for more descriptive output. Meanwhile, I'm training it myself to see how well it works, but I only have a small GPU. I would appreciate it if someone could share a trained model.
1 comment

So it's a character-based model (makes sense, because there's no tokenizer to be found). The model file is 3.3 MB.
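
For anyone wondering what "character-based" means in practice, here's a minimal sketch of the usual approach: every distinct character in the training text becomes its own integer ID, so no separate tokenizer is needed. This is my own illustration, not code from the project; the names build_vocab, encode, and decode are hypothetical, and I'm assuming the vocabulary is built from whatever characters appear in the corpus.

```python
# Hypothetical sketch of character-level encoding; not taken from the project.

def build_vocab(text):
    """Map each distinct character to an integer ID and back."""
    chars = sorted(set(text))
    stoi = {ch: i for i, ch in enumerate(chars)}  # character -> ID
    itos = {i: ch for ch, i in stoi.items()}      # ID -> character
    return stoi, itos


def encode(text, stoi):
    """Turn a string into a list of character IDs."""
    return [stoi[ch] for ch in text]


def decode(ids, itos):
    """Turn a list of character IDs back into a string."""
    return "".join(itos[i] for i in ids)


# Assumed toy corpus, only for demonstration.
corpus = "hello, the sounds of the rain"
stoi, itos = build_vocab(corpus)
ids = encode("hello", stoi)
roundtrip = decode(ids, itos)
```

With this scheme the vocabulary is just the set of characters seen in training, which keeps the embedding table small. That fits with a small model file, though the sketch says nothing about how the project actually stores its weights.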

Output after 3 hours of training (loss = 0.4160):

  hello, the sounds of the rain assistants. In provations
  which had been banitz had taken step in, the spirit of the crowd, and
  the right winot to scrap on , Valuex, one of marks othhhanled me in a
  most powerful power. The other Augun had been in
  commands among that guests, the regiments were placed by imists, and
  science of irritation, but or thoughts that he would never cease to point
  before meant in the losses were discussing, and at once she had lived
  in the church.
From the looks of it, it's similar to minGPT/nanoGPT (I found a Reddit thread [0] where the author compares it to minGPT).

[0] https://www.reddit.com/r/MachineLearning/comments/o2u2cm/pwy...