Show HN: Smile-Serve – Inference Server for ML, ONNX, and LLM
(github.com)
5 points
by haifeng
42 days ago
SMILE Serve is a production-ready inference server built on [Quarkus](https://quarkus.io/)
that brings together three complementary inference capabilities on the JVM:

- **Classic ML**: `/api/v1/models` for serialized SMILE models (`.sml`)
- **ONNX Runtime**: `/api/v1/onnx` for any model in the ONNX open format (`.onnx`)
- **LLM Chat**: `/api/v1/chat` for Llama 3 chat completions

A React-based web UI is bundled and served from the same process.
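
For readers who want to poke at the three route prefixes from a script, here is a minimal sketch of a client-side helper. Only the paths come from the post. The host and port (`localhost:8080`, Quarkus's default HTTP port), the helper names `endpoint_url` and `build_get`, and the idea that a plain `GET` with a JSON `Accept` header is accepted on each prefix are all assumptions; the actual request methods and payload schemas are not described here, so check the project's docs before sending anything.

```python
from urllib.parse import urljoin
from urllib.request import Request

# Assumption: the server runs locally on Quarkus's default HTTP port.
BASE_URL = "http://localhost:8080"

# Route prefixes as listed in the post.
ENDPOINTS = {
    "ml": "/api/v1/models",   # serialized SMILE models (.sml)
    "onnx": "/api/v1/onnx",   # ONNX models (.onnx)
    "chat": "/api/v1/chat",   # Llama 3 chat completions
}


def endpoint_url(kind, base_url=BASE_URL):
    """Return the absolute URL for one of the three route prefixes."""
    return urljoin(base_url, ENDPOINTS[kind])


def build_get(kind, base_url=BASE_URL):
    """Build (but do not send) a GET request for a route prefix.

    Assumption: GET with a JSON Accept header is a reasonable probe;
    the real API may expect other methods or sub-paths.
    """
    return Request(
        endpoint_url(kind, base_url),
        method="GET",
        headers={"Accept": "application/json"},
    )


if __name__ == "__main__":
    for kind in ENDPOINTS:
        print(kind, endpoint_url(kind))
```

Passing the result of `build_get(...)` to `urllib.request.urlopen` would actually send the request; that step is left out because the response format is not specified in the post.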