Hacker News
by deoxykev 506 days ago
Yeah, there is a clear bottleneck somewhere in llama.cpp. Even high-end hardware is struggling to get good numbers. In theory performance should be higher, but it isn't there yet.
Benchmarks:
https://github.com/ggerganov/llama.cpp/issues/11474#issuecom...