Hacker News
by kianN 530 days ago
For those wondering how it works:

> The language model predicts the probabilities of the next token. An arithmetic coder then encodes the next token according to the probabilities. [1]

The page also mentions that the model is configured to be deterministic, which I would guess is what lets decompression recompute the same token probabilities and map the encoded data back to the original tokens?

[1] https://bellard.org/ts_zip/
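
To make the guess above concrete, here is a minimal sketch of the idea, not ts_zip's actual implementation. It assumes a toy three-symbol vocabulary and a hypothetical `predict` function (adaptive counts) standing in for the language model, and it uses exact fractions instead of a real bit-level arithmetic coder. The point it illustrates: because `predict` is deterministic, the decoder can recompute exactly the same probabilities from the tokens it has already decoded, so the encoded number alone identifies each next token.

```python
from fractions import Fraction

ALPHABET = "ab_"  # hypothetical toy vocabulary, not ts_zip's tokenizer


def predict(history):
    """Toy stand-in for the language model (an assumption for illustration).

    Deterministic: the same history always yields the same probabilities.
    """
    counts = {s: 1 for s in ALPHABET}
    for s in history:
        counts[s] += 1
    total = sum(counts.values())
    return [(s, Fraction(counts[s], total)) for s in ALPHABET]


def encode(tokens):
    """Narrow [low, low + width) once per token; return a number inside it."""
    low, width = Fraction(0), Fraction(1)
    for i, tok in enumerate(tokens):
        if tok not in ALPHABET:
            raise ValueError(f"unknown token: {tok!r}")
        cum = Fraction(0)
        for s, p in predict(tokens[:i]):
            if s == tok:
                low += width * cum
                width *= p
                break
            cum += p
    return low + width / 2


def decode(code, n):
    """Replay the same model and pick the sub-interval that contains `code`."""
    out = []
    low, width = Fraction(0), Fraction(1)
    for _ in range(n):
        cum = Fraction(0)
        for s, p in predict(out):
            lo = low + width * cum
            if lo <= code < lo + width * p:
                out.append(s)
                low, width = lo, width * p
                break
            cum += p
    return "".join(out)


message = "ab_ab_aab"
code = encode(message)
restored = decode(code, len(message))
```

Here `restored` equals `message`. The sketch passes the token count to `decode` explicitly; a practical coder would need some other way to know where to stop, and would emit bits incrementally rather than one large fraction. If `predict` gave even slightly different probabilities on the decoding side, the sub-intervals would shift and decoding could diverge, which is why determinism matters.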

2 comments

> ts_zip

Discussed (once more) in a neighbor thread: https://news.ycombinator.com/item?id=42549083

Isn’t an LLM itself basically a compression of the texts from the internet? You can download the model and decompress the (larger) content with compute power (lossily).
Yeah, that’s exactly how I think of LLMs in my head: lossy compression that interpolates to fill in gaps. Hallucination is simply interpolation error, which is guaranteed in lossy compression.