by mckirk 509 days ago
This looks great!

While we're at it, is there already some kind of standardized local storage location/scheme for LLMs? If not, this project could be a great place to set an example that others can follow, if they want. I've been playing with different runtimes (Ollama, vLLM) over the last few days, and I really would have appreciated better interoperability in terms of shared model storage, instead of everybody defaulting to downloading everything all over again.


The llama.cpp tools and examples download models by default to an OS-specific cache folder [0]. We try to follow the HF standard (as discussed in the linked thread), though the layout of the llama.cpp cache is not the same atm. Not sure about the plans for RamaLama, but it might be worth considering.

[0] https://github.com/ggerganov/llama.cpp/issues/7252

I think it would be the most important thing to consider, because the biggest thing that the predecessor to RamaLama provided was a way to download a model (and run it).

If there was a contract about how models were laid out on disk, then downloading, managing and tracking model weights could be handled by a different tool or subsystem.
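To make the idea of an on-disk contract concrete, here is a minimal sketch of a content-addressed blob store that any downloader could write to and any runtime could read from. The `blobs/` directory, the `sha256-<digest>` file naming, and the `add_blob` helper are all assumptions for illustration, not something any of the tools mentioned here has agreed on.

```python
import hashlib
import shutil
from pathlib import Path


def add_blob(store: Path, src: Path) -> Path:
    """Copy src into store/blobs/ under a name derived from its SHA-256.

    Hypothetical layout: <store>/blobs/sha256-<hex digest>.
    Identical content always maps to the same path, so a second
    tool adding the same weights does not create a second copy.
    """
    h = hashlib.sha256()
    with open(src, "rb") as f:
        # Hash in 1 MiB chunks so multi-gigabyte files are not loaded into memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)

    blobs = store / "blobs"
    blobs.mkdir(parents=True, exist_ok=True)
    dest = blobs / f"sha256-{h.hexdigest()}"
    if not dest.exists():
        shutil.copyfile(src, dest)
    return dest
```

With a layout like this, the download and tracking logic only needs to agree on the directory and the naming rule; per-tool metadata (manifests, tags) would still have to be standardized separately.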

In RamaLama, an OCI container-like store is used for all models (at least from the UX perspective it feels like that). It's protocol-agnostic and supports OCI artifacts, Hugging Face, Ollama, etc.

I just started to play with ollama and ramalama on Linux. The models are quite a few gigabytes each, so it's not pretty to keep N copies.

ollama stores things under ~/.ollama/models/blobs/ named sha256-whatevershaisit

ramalama stores things under ~/.local/share/ramalama/repos/ollama/blobs/ named sha256:whatevershaisit

Note the ":" in the ramalama names instead of the "-"; that may not fly on Windows.
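The difference between the two naming schemes is mechanical, so a small sketch can show the mapping. The function names below are hypothetical, and the only thing taken from the observations above is the `sha256-<digest>` versus `sha256:<digest>` file naming; it assumes the digest is a 64-character lowercase hex string.

```python
import re

# Assumption: a SHA-256 digest rendered as 64 lowercase hex characters.
_HEX64 = re.compile(r"[0-9a-f]{64}")


def _split(name: str, sep: str) -> str:
    """Return the digest part of 'sha256<sep><digest>' or raise ValueError."""
    algo, found, digest = name.partition(sep)
    if algo != "sha256" or not found or not _HEX64.fullmatch(digest):
        raise ValueError(f"not a sha256 blob name: {name!r}")
    return digest


def ollama_to_ramalama(name: str) -> str:
    """'sha256-<digest>' -> 'sha256:<digest>'"""
    return "sha256:" + _split(name, "-")


def ramalama_to_ollama(name: str) -> str:
    """'sha256:<digest>' -> 'sha256-<digest>'"""
    return "sha256-" + _split(name, ":")
```

This only translates file names; it does not create whatever metadata each tool keeps about the blobs it has pulled.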

If one cross-links ramalama blobs over to ollama with that slight rename, ollama will remove them, since they were not pulled via ollama itself and it has no metadata on them.

I guess vllm and everybody else has yet another schema and/or metadata.

By the way, on Arch Linux there is currently llm-manager (pointing to https://github.com/xyproto/llm-manager), but it is made dependent on some of the ollama packages and can't be installed by itself (without forcing it).