by activatedgeek
939 days ago
I looked at Ollama before, but couldn't quite figure something out from the docs [1]. It looks like a lot of the tooling is heavily engineered for a set of popular, modern LLM-style models. It also looks like llama.cpp supports LoRA models, so I'd assume there is a way to engineer a pipeline from LoRA to llama.cpp deployments, which probably covers quite a broad set of possibilities.

Beyond llama.cpp, can someone point me to what the broader community uses for general PyTorch model deployments? I have never self-hosted a model, and am really keen to try. Ideally, I am looking for something that stays close to the PyTorch core and therefore gives me the flexibility to take any nn.Module to production.

[1]: https://github.com/jmorganca/ollama/blob/main/docs/import.md