Hacker News
by
asselinpaul
1117 days ago
This is great! Is it CPU-only for now, or can it leverage the GPU for faster inference?
ybu
1117 days ago
The focus is CPU for now, but GPU support would be very useful. As a baseline, this can support anything that llama.cpp supports. Relevant link:
https://github.com/ggerganov/llama.cpp#blas-build