by WinLychee 1036 days ago
The above project somewhat supports GPUs if you pass the correct GGML compile flags to it; `GGML_CUBLAS`, for example, is supported when compiling. You get a decent speedup relative to pure C/C++.

Interesting. It still doesn't seem to be very quick: https://github.com/leejet/stable-diffusion.cpp/issues/6

But don't get me wrong, I look forward to playing with ggml SD and its development.

Yeah, for comparison, `tinygrad` takes a little over a second per iteration on my machine: https://github.com/tinygrad/tinygrad/blob/master/examples/st...
Is that on GPU or CPU? 1 it/s would be very respectable on CPU.

The fastest implementation on my 2060 laptop is AITemplate, which is about 2x faster than plain, optimized HF diffusers.

That was on GPU, and there are various CPU implementations (e.g. based on Tencent/ncnn) on GitHub that have similar runtimes (1-3 s/iteration).
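
Since the thread mixes "seconds per iteration" and "it/s", here is a quick way to put the figures on the same footing. This is only a back-of-the-envelope sketch: the function names are made up for illustration, and the 20-step count is an assumption, not a number from any of the projects mentioned.

```python
def its_per_second(seconds_per_iteration: float) -> float:
    """Convert a seconds-per-iteration figure into iterations per second."""
    if seconds_per_iteration <= 0:
        raise ValueError("seconds_per_iteration must be positive")
    return 1.0 / seconds_per_iteration


def total_seconds(seconds_per_iteration: float, steps: int) -> float:
    """Estimate sampling time for a run, ignoring model load and decode overhead."""
    return seconds_per_iteration * steps


# Hypothetical step count, chosen only for illustration.
STEPS = 20

# The 1-3 s/iteration range quoted above for the CPU implementations.
for s_per_it in (1.0, 3.0):
    print(
        f"{s_per_it:.1f} s/it = {its_per_second(s_per_it):.2f} it/s, "
        f"about {total_seconds(s_per_it, STEPS):.0f} s for {STEPS} steps"
    )
```

So 1 s/iteration is the same as 1 it/s, and the 3 s/iteration end of the range works out to roughly 0.33 it/s.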