| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by anthonix1 745 days ago

I ported Karparthy's llm.c repo to AMD devices [1], and have trained GPT2 from scratch with 10B tokens of fineweb-edu on a 4x 7900XTX machine in just a few hours (about $2 worth of electricity) [2].

I've also trained the larger GPT2-XL model from scratch on bigger CDNA machines.

Works fine.

[1] https://github.com/anthonix/llm.c [2] https://x.com/zealandic1