| HN Mirror



	by asselinpaul 1164 days ago
	This is great! Is it CPU-only for now, or can it leverage a GPU for faster inference?

1 comment

The focus is CPU for now, but it'd be very useful to have GPU support. As a baseline, this can support anything that llama.cpp supports. Relevant link: https://github.com/ggerganov/llama.cpp#blas-build