by rfurmani
482 days ago
I'm serving AI models on Lambda Labs, and after some trial and error I found that a single vllm server along with caddy, behind Cloudflare DNS, works really well and is really easy to set up:

    vllm serve ${MODEL_REPO} --dtype auto --api-key $HF_TOKEN --guided-decoding-backend outlines --disable-fastapi-docs &
    sudo caddy reverse-proxy --from ${SUBDOMAIN}.sugaku.net --to localhost:8000 &
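For a quick check that the proxy and the API key are wired up, a small smoke test along these lines might help. This is only a sketch: the function names `endpoint_url` and `check_models` are made up for illustration, and it assumes the server exposes vllm's OpenAI-compatible `/v1/models` route and expects the `--api-key` value as a Bearer token.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; only the domain and the --api-key usage
# come from the commands in the comment.

# Build the public URL that caddy proxies to vllm on localhost:8000.
endpoint_url() {
  local subdomain="$1" path="${2:-/v1/models}"
  printf 'https://%s.sugaku.net%s\n' "$subdomain" "$path"
}

# List the served models, sending the API key as a Bearer token.
# curl exits non-zero on HTTP errors because of --fail.
check_models() {
  local subdomain="$1" api_key="$2"
  curl --fail --silent --show-error \
    -H "Authorization: Bearer ${api_key}" \
    "$(endpoint_url "$subdomain")"
}
```

After both processes are up, `check_models "$SUBDOMAIN" "$HF_TOKEN"` should print a JSON list of models if everything is reachable, and fail with an HTTP error if the key is rejected.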
https://stackoverflow.com/questions/413807/