by miki123211 2 days ago
LLM compression doesn't necessarily have to be lossy.

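You can use the fact that LLMs predict P(next token | existing tokens) to losslessly and efficiently compress arbitrary token sequences: feed each predicted distribution to an arithmetic coder, and a token costs roughly -log2 P(token | existing tokens) bits. The better the model predicts the text, the shorter the output.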
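
To make that concrete, here is a minimal sketch of model-driven arithmetic coding. It assumes byte "tokens" and swaps the LLM for a hypothetical toy predictor (`predict`, Laplace-smoothed counts of the context so far); `encode` and `decode` are illustrative names, not any library's API. It uses exact fractions for clarity, so it is only practical for short inputs.

```python
from fractions import Fraction
from math import ceil

VOCAB = 256  # assumption: byte "tokens"; a real LLM would use its own tokenizer


def predict(context):
    """Hypothetical stand-in for an LLM: P(next token | context) as exact fractions.

    Toy model: Laplace-smoothed counts of the tokens seen so far.
    """
    counts = [1] * VOCAB
    for t in context:
        counts[t] += 1
    total = sum(counts)
    return [Fraction(c, total) for c in counts]


def encode(tokens):
    """Narrow [low, low + width) once per token, then name a point inside it."""
    low, width = Fraction(0), Fraction(1)
    for i, tok in enumerate(tokens):
        probs = predict(tokens[:i])
        low += width * sum(probs[:tok], Fraction(0))
        width *= probs[tok]
    # Shortest bit string whose value k / 2**nbits lies in the final interval.
    nbits = 0
    while True:
        k = ceil(low * 2**nbits)
        if Fraction(k, 2**nbits) < low + width:
            return k, nbits
        nbits += 1


def decode(k, nbits, length):
    """Replay the same predictions and follow the point back to the tokens."""
    x = Fraction(k, 2**nbits)
    low, width = Fraction(0), Fraction(1)
    out = []
    for _ in range(length):
        probs = predict(out)
        cum = Fraction(0)
        for tok, p in enumerate(probs):
            # First token whose sub-interval's upper bound exceeds x contains x.
            if low + width * (cum + p) > x:
                break
            cum += p
        out.append(tok)
        low += width * cum
        width *= p
    return out


data = list(b"abracadabra")
k, nbits = encode(data)
assert decode(k, nbits, len(data)) == data  # lossless round trip
```

The sequence length is passed to `decode` separately here; an end-of-sequence token would be another option. Nothing in the scheme needs the model to memorize the input: the decoder only has to compute the same probabilities the encoder did.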

2 comments

True, but it's not relevant because that isn't how we actually train LLMs for use as quasi-intelligent tools. We specifically do not want the model to be able to just memorize its input, which is what your process requires.

Many things about the process are similar, so there's some analogy, but it just isn't the same.

When decompressing, you need to reproduce the LLM's predicted probabilities exactly as they were during compression; otherwise the decompressor outputs gibberish. Can you count on the LLM being that consistent?
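
A contrived sketch of why that matters, not a claim about any particular LLM runtime: floating-point addition is not associative, so summing the same numbers in a different order (as a parallel reduction might) can shift an interval boundary by one bit, and the decoder then picks a different token. `pick_token` is a hypothetical two-token decoder step invented for this illustration.

```python
def pick_token(x, boundary):
    """Toy two-token decoder step: token 0 if x is below the boundary, else token 1."""
    return 0 if x < boundary else 1


# Same three numbers, two summation orders.
boundary_enc = (0.1 + 0.2) + 0.3  # what the encoder computed
boundary_dec = 0.1 + (0.2 + 0.3)  # what the decoder computed
assert boundary_enc != boundary_dec

# A code point that sits right at the disputed boundary.
x = boundary_dec
assert pick_token(x, boundary_enc) == 0  # encoder's view: token 0
assert pick_token(x, boundary_dec) == 1  # decoder's view: token 1
```

Once one token differs, the decoder's context differs too, so every later prediction diverges. Integer or otherwise bit-exact probability computation on both sides avoids this.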