by veber-alex 241 days ago
The llama.cpp issues are strange.

There are official benchmarks of the Spark running multiple models just fine on llama.cpp:

https://github.com/ggml-org/llama.cpp/discussions/16578

3 comments

There weren't any instructions on how the author got ollama/llama.cpp. Could it possibly be something NVIDIA shipped with the DGX Spark, and an old version at that?
Llama.cpp main branch doesn't run on Orins so it's actually weird that it does run on the Spark.
Cool, I'll have a look. All the reflections I made were first-pass stuff.