by refulgentis
997 days ago
This reminds me of a comment elsewhere I also replied to today: it's sort of hard to even pretend I have global usage stats, so I won't. There's a certain type of myopia that leads to overindexing on llama.cpp that makes it easy to classify.
To wit:

> not aware of a competing format for quantized models

ONNX. That's how it's done in prod, and on other models besides (and including) LLaMa. Quantization is a general technique. 100 small variants of llama2 GGML weights feels like spam from that perspective (sort of Civitai vs. Hugging Face; Hugging Face smartly stopped that with AI art). See llm.mlc.ai for a more academic / less ad-hoc approach.

> [stars on github]

It's great for a very narrow & simple case that matches a large demographic on GitHub, and the demographics of people talking LLMs casually on HN: MacBook, wanna run locally, and dream of a future free of having to ship your data to servers to get personalization. 5% of overall usage can be #2 in usage, if that makes sense.
|
Most human people doing LLM at home aren't interested in cargo culting the for-profit corporate and institutional stuff, since those organizations' resources and incentives are so different from human beings' incentives. As there are more humans than corporations or institutions and they tend to talk more, what they use tends to be more known than the stuff optimized for making a profit and serving business needs with business culture.