Hacker News
by zozbot234 76 days ago
If it's just about skipping some buffer sync, that's something llama.cpp's own Metal backend could adopt too, at least on Apple Silicon platforms.