Hacker News
by Der_Einzige, 559 days ago
Not even close. Llama.cpp isn't anywhere near a production-ready LLM inference engine, and it runs overwhelmingly faster when using CUDA.