Hacker News
by WinLychee 1031 days ago
That was on GPU; there are also various CPU implementations on GitHub (e.g. based on Tencent/ncnn) with similar runtimes (1-3 s per iteration).