by jmorgan
1061 days ago
The llama.cpp project is absolutely amazing. Our goal was to build with and extend the project (vs. trying to be an alternative). Ollama was originally inspired by the "server" example: https://github.com/ggerganov/llama.cpp/tree/master/examples/...

This project builds on llama.cpp in a few ways:

1. Easy install! Precompiled for Mac (Windows and Linux coming soon).

2. Run 2+ models: loading and unloading models as users need them, including via a REST API. Lots to do here, but even small models are memory hogs and take quite a while to load, so the hope is to provide basic "scheduling".

3. Packaging: content-addressable packaging that bundles GGML-based weights with prompts, parameters, licenses and other metadata. Later the goal is to also bundle embeddings and other larger files that custom models (for specific use cases, a la PrivateGPT) would need to run.

edit: formatting
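
To make the "content-addressable packaging" idea in point 3 a bit more concrete, here is a rough Python sketch. It is not Ollama's actual on-disk format: `BlobStore`, `build_manifest`, and the layer names are hypothetical, chosen only for illustration. The general technique is that each piece (weights, prompt, parameters, license) is stored under the SHA-256 digest of its bytes, and a small manifest lists those digests.

```python
import hashlib
import json


def digest(data: bytes) -> str:
    """Return a content address for the given bytes."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


class BlobStore:
    """Hypothetical in-memory store keyed by content digest."""

    def __init__(self) -> None:
        self.blobs: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        key = digest(data)
        # Identical content always maps to the same key, so it is stored once.
        self.blobs[key] = data
        return key

    def get(self, key: str) -> bytes:
        data = self.blobs[key]
        # Re-hash on read to detect corruption.
        if digest(data) != key:
            raise ValueError(f"blob {key} failed integrity check")
        return data


def build_manifest(store: BlobStore, weights: bytes, prompt: str,
                   params: dict, license_text: str) -> str:
    """Store each layer, then store a manifest that references them by digest."""
    layers = {
        "weights": store.put(weights),
        "prompt": store.put(prompt.encode("utf-8")),
        "params": store.put(json.dumps(params, sort_keys=True).encode("utf-8")),
        "license": store.put(license_text.encode("utf-8")),
    }
    manifest = json.dumps(layers, sort_keys=True).encode("utf-8")
    return store.put(manifest)


if __name__ == "__main__":
    store = BlobStore()
    weights = b"stand-in bytes for a GGML weights file"
    a = build_manifest(store, weights, "You are a helpful assistant.",
                       {"temperature": 0.7}, "MIT")
    b = build_manifest(store, weights, "You are a pirate.",
                       {"temperature": 0.7}, "MIT")
    print(a != b, len(store.blobs))
```

The upside of this layout is that two models differing only in their prompt share the same weights, parameters, and license blobs, so the large weights file is stored only once.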