by cosmez 728 days ago
There are many pull requests trying to implement this feature, and the maintainers don't even bother to reply. This is the only reason I'm still using the llama.cpp server instead of this.
1 comment

Wouldn't it be more practical to open a PR against llama.cpp that replicates what Ollama does well instead?