I'm excited to share with you a new open-source project called langport, which aims to provide a lightning-fast large language model serving platform.
Inspired by lmsys/fastchat, we've built a distributed LLM serving system. Our focus on performance means we use batch inference to get higher throughput on the serving platform.
Langport offers a range of core features, including streaming API interface support, batch inference for higher throughput, and OpenAI-Compatible RESTful APIs.
Inspired by lmsys/fastchat, we've built a distributed LLM serving system. Our focus on performance means we use batch inference to get higher throughput on the serving platform. Langport offers a range of core features, including streaming API interface support, batch inference for higher throughput, and OpenAI-Compatible RESTful APIs.