Hacker News
by MonkeyClub 483 days ago
I get the same vibe, especially after reading the update where the book author contacts him to clarify stuff.

The whole post reads like hasty, clumsy grey marketing.

1 comments

I wrote the blog post, and I did it of my own free will, and am receiving no compensation. The main reason I wrote it was to help cement my own learnings from the book. I've heard that the best way to learn something is to teach it, so I wanted to see how much I could regurgitate on my own. Turns out, not a whole lot. It was hastily written, and more of a "brain dump" than anything else. I'm entering a new-to-me field, and wanted a place to document the things I'm learning. If anyone finds it interesting, great. If not, no big deal.

As for the specifics of the model I trained, I would be hard-pressed to recall them off the top of my head. I believe I trained a small model locally, but after completing that as a PoC, I downloaded the GPT-2 model weights, then trained / fine-tuned those locally. That is what the book directed. All the steps are in my GitHub repo, which (unsurprisingly) looks like the author's repo. His repo actually has more explanation. Mine is more or less just code.