


	by jsemrau 769 days ago
	> The size of the cached internal state of the network processing the book is much larger than the size of the book

	It's funny that people sometimes consider LLMs as compression engines, even though a lot of information gets lost in each direction (through the neural net).


shwaj 768 days ago

Why is that funny? Sometimes compression is lossy, like JPEG and H.265.


pornel 768 days ago

And the internal state of a JPEG decoder can be an order of magnitude larger than the JPEG file (especially progressive JPEG that can't stream its output).


okdood64 768 days ago

I don't lose anything with gzip or rar.


fwip 768 days ago

You can make any lossy compression scheme into a lossless scheme by appending the diff between the original and the decompressed output of the lossy scheme. In many cases, this still results in a size saving over the original.

You can think of this as a more detailed form of "I before E, except after C, except for species and science and..." Or, if you prefer, as continued terms of a Taylor-series expansion. The more terms you add, the more closely you approximate the original.
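
A minimal sketch of the idea above, under loud assumptions: the "lossy codec" here is a toy byte quantizer, and the function names (`lossy_encode`, `lossless_encode`, and so on) are made up for illustration, not taken from any real codec. The lossy part keeps a coarse approximation, the residual stores what the approximation missed, and the two together reproduce the input exactly.

```python
import zlib


def lossy_encode(data: bytes, step: int = 16) -> bytes:
    """Toy lossy codec (illustrative only): keep each byte's quantization bucket."""
    return bytes(b // step for b in data)


def lossy_decode(coded: bytes, step: int = 16) -> bytes:
    """Reconstruct an approximation: every byte snaps to the bottom of its bucket."""
    return bytes(q * step for q in coded)


def lossless_encode(data: bytes, step: int = 16) -> tuple[bytes, bytes]:
    """Lossy payload plus the diff between the original and the lossy reconstruction."""
    coded = lossy_encode(data, step)
    approx = lossy_decode(coded, step)
    # The residual is the "list of exceptions": what the approximation got wrong.
    residual = bytes((o - a) % 256 for o, a in zip(data, approx))
    return zlib.compress(coded), zlib.compress(residual)


def lossless_decode(coded_z: bytes, residual_z: bytes, step: int = 16) -> bytes:
    """Approximation + residual gives back the original, byte for byte."""
    approx = lossy_decode(zlib.decompress(coded_z), step)
    residual = zlib.decompress(residual_z)
    return bytes((a + r) % 256 for a, r in zip(approx, residual))


if __name__ == "__main__":
    original = bytes(range(256)) * 4
    coded_z, residual_z = lossless_encode(original)
    assert lossless_decode(coded_z, residual_z) == original
    print(len(original), len(coded_z) + len(residual_z))
```

Whether the combined size actually beats the original depends on how well the lossy model predicts the data; the sketch only demonstrates the exact round trip.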


giancarlostoro 768 days ago

And just as fast? The issue here is how to do these things accurately while maintaining reasonable speed.
