|
|
|
|
|
by garyfirestorm
148 days ago
|
|
Everyone seems to be missing important piece here. Ollama is/was a one click solution for non technical person to launch a local model. It doesn’t need a lot of configuration, detects Nvidia GPU and starts model inferencing with single command.
Core principle being your grandmother should be able to launch local AI model without needing to install 100 dependencies. |
|
I can be in a non-technical team, and put the LLM code inside docker.
The local dev instruction is to install ollama and use it to pull the models and set some env vars.
The same code can point at bedrock when deployed there.
Using straight llamacpp at the time I wrote that it wasn't as straightforward.