Hacker News
by GrinningFool 23 days ago
That's a huge gap for llama.cpp server - any idea why?

Best guess is it's native mode. The function calling template is just broken for Nemo.

I did go with an extreme (but true) example in the post. Other deltas are smaller but still statistically significant: a 30 pt swing between llamaserver prompt and ollama, and a 4-5 pt swing between llamafile and llamaserver prompt.