HN Mirror

	by zozbot234 79 days ago
	ANE-powered inference (at least for prefill, which is a key bottleneck on pre-M5 platforms) is also in the works, per https://github.com/ggml-org/llama.cpp/issues/10453#issuecomm...