| HN Mirror


	by cosmez 775 days ago
	There are many pull requests trying to implement this feature, and the maintainers don't even bother to reply. This is the only reason I'm still using the llama.cpp server instead of this.

1 comment

Wouldn't it be more practical to make a PR for llama.cpp that replicates what Ollama does well instead?