Hacker News
by novaomnidev 925 days ago
Why is this faster than running llama.cpp main directly? I’m getting 7 tokens/sec with this, but only 2 with llama.cpp by itself.