Hacker News new | ask | show | jobs
by MPSimmons 348 days ago
Agreed. It's basically lossy compression for everything it's ever read. And the quantization impacts the lossiness, but since a lot of text is super fluffy, we tend not to notice as much as we would when we, say, listen to music that has been compressed in a lossy way.
2 comments

It's a bit like if you trained a virtual band to play any song ever, then told it to do its own version of the songs. Then prompted it to play whatever specific thing you wanted. It won't be the same because it kinda remembers the right thing sorta, but it's also winging it.
I've been referring to LLMs as JPEG for all the world's data, and people have really started to come around to it. Initially most folks tended to outright reject this comparison.
Ted Chiang wrote a great piece about that: https://www.newyorker.com/tech/annals-of-technology/chatgpt-...

I think it's a solid description for a raw model, but it's less applicable once you start combining an LLM with better context and tools.

What's interesting to me isn't the stuff the LLM "knows" - it's how well an LLM system can serve me when combined with RAG and tools like web search and access to a compiler.

The most interesting developments right now are models like Gemma 3n which are designed to have as much capability as possible without needing a huge amount of "facts" baked into them.