| HN Mirror


	by mmaunder 606 days ago
	[flagged]

7 comments

accrual 606 days ago

Two days ago there was a pretty big discussion on this topic:

    Computer use, a new Claude 3.5 Sonnet, and Claude 3.5 Haiku
    https://news.ycombinator.com/item?id=41914989
    1421 points, 717 comments

refulgentis 606 days ago

I wouldn't be so haughty and presumptive that your understanding of things is how they are: this doesn't have practical applications.

No one serious is going to build on some horror of a Python interpreter running inside your app to run an LLM when llama.cpp is right there, with more quants available. In practice, on mobile, you run out of RAM headroom way more quickly than CPU headroom. You've been able to run llama.cpp 3B models for almost a year now on iOS, whereas here, they're just starting to be able to. (allocating 6 GB is a quick way to get autokill'd on iOS... 2.5 GB? Doable)

It looks like SpinQuant is effectively Q8. In widespread blind testing over months, we found empirically that Q5 is assuredly indistinguishable from the base model.
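For anyone who wants to sanity-check the RAM numbers, here is a rough back-of-the-envelope sketch. It assumes weight memory is just parameter count times nominal bits per weight, and it ignores per-block scale overhead, the KV cache, and activations, so real allocations will be somewhat higher. The function name `weight_gb` and the flat 3B parameter count are illustrative, not taken from any library.

```python
def weight_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight memory in GB (1 GB = 1e9 bytes).

    Assumption: nominal bits per weight only; real quant formats add
    per-block scales, and runtime memory also includes KV cache etc.
    """
    return n_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    n_params = 3e9  # hypothetical "3B" model
    for label, bits in [("fp16", 16), ("Q8", 8), ("Q5", 5), ("Q4", 4)]:
        print(f"{label:>4}: ~{weight_gb(n_params, bits):.2f} GB of weights")
```

Under these assumptions a 3B model is about 6 GB at 16 bits, 3 GB at 8 bits, and just under 2 GB at 5 bits, which is roughly why the lower quants are the ones that fit comfortably on a phone.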

(edit: just saw your comment. oy. best of luck! generally, I don't bother with these sorts of 'lived experience' details, because no one wants to hear they don't get it, and most LLM comments on HN are from ppl who don't have the luck to work on it full-time. so you're either stuck aggressively asserting you're right in practice and they don't know what you're talking about, or you're stuck being talked down to about things you've seen, even if they don't match a first pass based on theory) https://news.ycombinator.com/item?id=41939841

pryelluw 606 days ago

I don’t get the comment. For one, I’m excited for developments in the field. I'm not afraid it will “replace me”, as technology has replaced me multiple times over. I’m looking forward to working with these models more and more.

mmaunder 606 days ago

No, I meant that a lot of us are working very fast on a pre-launch product, implementing some cutting edge ideas using e.g. the incredible speedup in a small fast inference model like quantized 3B in combination with other tools, and I think there's quite a bit of paranoia out there that someone else will beat you to market. And so not a lot of sharing going on in the comments. At least not as much as previously, and not as much technical discussion vs other non-AI threads on HN.

pryelluw 606 days ago

Ok, thank you for pointing that out.

I’m focused on making models play nice with each other rather than building a feature that relies on them. That’s where I see the more relevant work being, and that's why such news is exciting!

mattgreenrocks 606 days ago

This thread attracts a smaller audience than, say, a new version of ChatGPT.

keyle 606 days ago

Aren't we all just tired of arguing the same points?

lxgr 606 days ago

What kind of fundamental discussion are you hoping to see under an article about an iterative improvement to a known model?

"AI will destroy the world"? "AI is great and will save humanity"? If you're seriously missing that, there are really enough platforms (and articles for more fundamental announcements/propositions on this one) where you can have these.

flawn 606 days ago

A sign of the ongoing commoditization?

yieldcrv 606 days ago

I mean, this outcome of LLMs is expected, and the frequency of LLM drops is too high, definitely too high to wait for Meta to do an annual conference with a ton of hype. Furthermore, these things are just prerequisites for a massive lemming rush of altering these models for the real fun, which occurs in other communities.
