Hacker News
by talldayo, 623 days ago
According to this PR, you should be able to use multi-GPU machines with stock llama.cpp:
https://github.com/ggerganov/llama.cpp/pull/1703