by brucethemoose2
987 days ago
llama.cpp (and derivative projects) is quickly becoming SOTA for many use cases, and it has basically zero dependencies. Kobold.cpp, for example, provides an entire web UI and API with just Python and 3 Python packages (numpy, sentencepiece, and gguf, which is the llama.cpp library). The LLM itself is a single file you can get with curl or whatever. It takes less than a minute to compile against the native CPU/accelerator architecture, with nothing but the GPU libs themselves, which nets better performance than a generic binary distribution. ...It's not "one line" I guess, but I can hardly imagine a simpler setup. It doesn't really need Docker or a fancy container.
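For what it's worth, the whole setup looks roughly like the sketch below. Treat it as an illustration, not the official install steps: the repository URL, the plain `make` build, and the positional model argument are my assumptions, `MODEL_URL` is a placeholder rather than a real model link, and the `run` and `setup_steps` helpers are hypothetical names. It only prints the commands unless you set `DRY_RUN=0`.

```shell
#!/bin/sh
# Sketch of the setup described above. Assumptions: the repository URL, the
# plain `make` build, and MODEL_URL (a placeholder, not a real model link).
set -eu

DRY_RUN="${DRY_RUN:-1}"   # default: only print the commands
MODEL_URL="${MODEL_URL:-https://example.com/model.gguf}"   # hypothetical URL

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

setup_steps() {
    run git clone https://github.com/LostRuddy/koboldcpp
    run cd koboldcpp
    run make                                   # CPU build; GPU backends need extra build flags
    run pip install numpy sentencepiece gguf   # the three Python packages mentioned above
    run curl -L -o model.gguf "$MODEL_URL"     # the model is a single file
    run python koboldcpp.py model.gguf         # should start the web UI and API
}

setup_steps
```

Six commands rather than one line, but nothing in it needs a container.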
lep photon run -n sdxl -m hf:stabilityai/stable-diffusion-xl-base-1.0 --local
It's really about how to productize a wide range of models as easily as possible.