Hacker News new | ask | show | jobs
by ruihangl 740 days ago
A unified efficient open-source LLM deployment engine for both cloud server and local use cases.

It comes with full OpenAI-compatible API that runs directly with Python, iOS, Android, browsers. Supporting deploying latest large language models such as Qwen2, Phi3, and more.