| HN Mirror


	by anthonix1 792 days ago
	Seems to be an issue on their side. E.g., for a step of GPT2 training on a 7900 XTX [1]: tinygrad is ~440ms, PyTorch 2.4.0.dev20240513 is ~97ms, Karpathy's llm.c with ROCm is ~79ms, and llm.c with custom kernels is ~58ms.

	[1] https://github.com/anthonix/llm.c

	[2] https://github.com/tinygrad/tinygrad/issues/4301
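
For scale, the step times quoted above can be turned into ratios. This is only a quick sketch using the approximate numbers from the comment; the dictionary keys and the helper name `slowdown_of` are made up for illustration and are not from any of the linked repos.

```python
# Approximate step times in milliseconds, copied from the comment above
# (one GPT2 training step on a 7900 XTX). Labels are illustrative only.
step_ms = {
    "tinygrad": 440.0,
    "PyTorch 2.4.0.dev20240513": 97.0,
    "llm.c (ROCm)": 79.0,
    "llm.c (custom kernels)": 58.0,
}


def slowdown_of(name, times):
    """Return how many times longer `name` takes than each other entry."""
    base = times[name]
    return {other: base / ms for other, ms in times.items() if other != name}


ratios = slowdown_of("tinygrad", step_ms)
for other, factor in ratios.items():
    print(f"tinygrad takes {factor:.1f}x as long as {other}")
```

Since the inputs are rounded ("~"), the resulting factors are rough as well.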

1 comments

xiphias2 792 days ago

That issue seems a month old, while the 58ms number looks 1 day old.

I've seen a lot of work go into improving performance over the last month (it's in the release announcement as well), but of course I still don't think it can compete with that number. Still, a new comparison would be cool.

anthonix1 792 days ago

Ran tinygrad again about a week ago, no change.

And still no comment on the issue; I'll re-run if there is one.

xiphias2 792 days ago

Thanks for the answer
