Hacker News
Show HN: LLMhop – A tiny, stateless router for LLMs with a NixOS module (github.com)
2 points by mlenz 10 days ago
LLMhop is a tiny stateless proxy for LLM inference servers. It tackles an issue I ran into when trying to serve more than one local LLM at once, which vLLM does not natively support. The LLMhop binary inspects the model field of each request and routes the request to the matching backend service, optionally handling authentication. The project also includes a NixOS module that runs llama.cpp, vLLM, and sglang via Quadlet/Podman and auto-registers them with the proxy.
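
To make the routing idea concrete, here is a rough sketch of how a stateless router could pick a backend from the model field. This is not LLMhop's actual code: the `BACKENDS` table, the `API_KEY` value, the `pick_backend` name, and the status codes are all made up for illustration, and it assumes an OpenAI-style JSON body with a top-level `model` key and a Bearer token in the Authorization header.

```python
import json

# Hypothetical routing table: model name -> backend base URL (assumed values).
BACKENDS = {
    "llama-3-8b": "http://127.0.0.1:8001",
    "qwen2-7b": "http://127.0.0.1:8002",
}

# Hypothetical shared API key; set to None to disable the auth check.
API_KEY = "secret"


def pick_backend(body: bytes, headers: dict[str, str]) -> tuple[int, str]:
    """Return (status, backend URL or error message) for a single request.

    The decision depends only on the request itself, so nothing is
    remembered between calls.
    """
    # Optional authentication: compare the Bearer token to the shared key.
    if API_KEY is not None and headers.get("Authorization") != f"Bearer {API_KEY}":
        return 401, "invalid or missing API key"

    # Parse the body and read the top-level "model" field.
    try:
        payload = json.loads(body)
    except ValueError:
        return 400, "request body is not valid JSON"
    if not isinstance(payload, dict) or not isinstance(payload.get("model"), str):
        return 400, "request body needs a string 'model' field"

    # Look up the backend registered for that model.
    backend = BACKENDS.get(payload["model"])
    if backend is None:
        return 404, f"unknown model: {payload['model']}"
    return 200, backend
```

A real proxy would then forward the untouched request body to the returned URL and stream the response back. Because the function only looks at the incoming request and a static table, any number of copies can run side by side without coordination.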