Hacker News
by candiddevmike 146 days ago
I think the rapid iteration and lack of consistency from the model providers is really killing the hype here. You see HN stories all the time about how things are getting worse, and it seems folks' success with the major models is starting to heavily diffuse.

The model providers should really start having LTS (at least 2 years) offerings that deliver consistent results regardless of load, IMO. Folks are tired of the treadmill and just want some stability here, and if the providers aren't going to offer it, llama.cpp will...

1 comments

There is a difference between a quantized SOTA model and an old model. People want non-quantized SOTA models rather than old models.
Putting all that aside, why can’t they demo a model at max load to show what it’s capable of…?

Yeah, exactly.