by anthonix1 697 days ago
I ported Karpathy's llm.c repo to AMD devices [1], and have trained GPT2 from scratch on 10B tokens of fineweb-edu on a 4x 7900XTX machine in just a few hours (about $2 worth of electricity) [2].

I've also trained the larger GPT2-XL model from scratch on bigger CDNA machines.

Works fine.

[1] https://github.com/anthonix/llm.c

[2] https://x.com/zealandic1