by refulgentis 997 days ago
This reminds me of a comment elsewhere I also replied to today: it's sort of hard to even pretend I have global usage stats, so I won't.

There's a certain type of myopia that leads to overindexing on llama.cpp, and it's easy to spot. To wit:

> not aware of a competing format for quantized models

ONNX. That's how it's done in prod, for LLaMA and for other models besides. Quantization is a general technique. From that perspective, 100 small variants of llama2 GGML weights feel like spam (sort of civitai vs. huggingface; huggingface smartly stopped that with AI art).
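
To make "general technique" concrete, here is a minimal sketch of symmetric int8 quantization in plain Python. It is not the scheme any particular format (GGML, ONNX, or otherwise) uses; the function names and the single per-tensor scale are assumptions chosen for illustration only.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale (illustrative)."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor so we never divide by zero.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate floats from the integers and the scale."""
    return [v * scale for v in q]


weights = [0.5, -1.27, 0.0, 0.9]  # made-up example values
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
```

Nothing in this is tied to a model family or file format: the stored artifact is just small integers plus a scale, and the rounding error per weight is at most half the scale.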

llm.mlc.ai for a more academic / less ad-hoc approach.

> [stars on github]

It's great for a very narrow & simple case that matches a large demographic on GitHub, and the demographics of people talking about LLMs casually on HN: they have a MacBook, wanna run locally, and dream of a future free of having to ship your data to servers to get personalization. Something with 5% of overall usage can still be #2 in usage, if that makes sense.

2 comments

> done in prod ... huggingface smartly stopped that with AI art ... more academic

Most human people doing LLM stuff at home aren't interested in cargo-culting the for-profit corporate and institutional stuff, since those organizations' resources and incentives are so different from a human being's. As there are more humans than corporations or institutions, and they tend to talk more, what they use tends to be better known than the stuff optimized for making a profit and serving business needs with business culture.

> This reminds me of a comment elsewhere I also replied to today

Right, looks like you made fun of / were condescendingly dismissive of my comment in another thread. I wouldn't have replied here if I'd realized you were the same person.

LOL I was thinking of an entirely different comment on another site. Give me credit here, I never cast aspersions on you, or even addressed you directly here.

I apologize for making you feel condescended to, but I would also like to point out that the _mean_ comment is at +7, much less this one: there's a pretty significant gap in your knowledge, and reality is going to keep intruding. Engaging in public is a wonderful way to learn, but you're coming across as glib, assertive, and uninformed. You thought llama.cpp invented quantization and there's no other real format? :X