by kevinlu1248 145 days ago
Unfortunately, the main optimization (a 3x speedup) comes from n-gram speculative decoding, which doesn't run on CPUs. I believe it works on Metal at least, though.
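
For anyone unfamiliar with the term, here is a rough sketch of the general idea behind n-gram speculative decoding. This is not the project's code: the function names `propose_ngram_draft` and `accept_draft`, the parameters `n` and `k`, and the `next_token` callable standing in for the target model are all made up for illustration. The draft comes from copying whatever followed the most recent earlier occurrence of the last `n` tokens, and the target model then only has to check it.

```python
from typing import Callable, List, Sequence


def propose_ngram_draft(tokens: Sequence[int], n: int = 2, k: int = 4) -> List[int]:
    """Propose up to k draft tokens by finding the most recent earlier
    occurrence of the last n tokens and copying what followed it."""
    if len(tokens) <= n:
        return []
    suffix = list(tokens[-n:])
    # Scan backwards over earlier positions, skipping the suffix itself.
    for start in range(len(tokens) - n - 1, -1, -1):
        if list(tokens[start:start + n]) == suffix:
            return list(tokens[start + n:start + n + k])
    return []


def accept_draft(
    tokens: Sequence[int],
    draft: Sequence[int],
    next_token: Callable[[Sequence[int]], int],
) -> List[int]:
    """Greedy verification: keep draft tokens while they match the target
    model's own choice, and emit the model's token at the first mismatch.

    A real engine would score all draft positions in one batched forward
    pass; the sequential loop here is only for readability.
    """
    accepted: List[int] = []
    context = list(tokens)
    for d in draft:
        t = next_token(context)
        if t != d:
            accepted.append(t)  # the model's correction replaces the bad draft token
            return accepted
        accepted.append(d)
        context.append(d)
    # Every draft token matched, so the model's next token comes for free.
    accepted.append(next_token(context))
    return accepted


if __name__ == "__main__":
    # Toy stand-in for a target model: cycles through 1, 2, 3, 4.
    toy_model = lambda ctx: (ctx[-1] % 4) + 1
    history = [1, 2, 3, 4, 1, 2]
    draft = propose_ngram_draft(history, n=2, k=4)
    print(draft, accept_draft(history, draft, toy_model))
```

The output is the same as plain greedy decoding would give, since every token is still checked against the target model. The gain only shows up when the draft is verified in a single batched pass and the text is repetitive enough for the lookup to hit.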