by WinLychee 1030 days ago
Yeah, for comparison, `tinygrad` takes a little over a second per iteration on my machine. https://github.com/tinygrad/tinygrad/blob/master/examples/st...

Is that on GPU or CPU? 1 it/s would be very respectable on CPU.

The fastest implementation on my 2060 laptop is AITemplate, which is about 2x faster than pure, optimized HF diffusers.

That was on GPU. There are also various CPU implementations on GitHub (e.g. based on Tencent/ncnn) with similar runtimes (1-3 s per iteration).