by kianN 577 days ago

For those wondering how it works:

> The language model predicts the probabilities of the next token. An arithmetic coder then encodes the next token according to the probabilities. [1]

It’s also mentioned that the model is configured to be deterministic. I would guess that is what makes decompression possible: the decoder reruns the same model, gets the same token probabilities, and uses them to map the arithmetic code back to the original token?

[1] https://bellard.org/ts_zip/
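
To make the quoted description concrete, here is a minimal toy sketch of the idea, not ts_zip's actual implementation. It makes several assumptions: `predict` is a hypothetical stand-in for the language model (simple adaptive character counts, not a neural network), `ALPHABET` is a made-up vocabulary, and exact fractions are used in place of the finite-precision arithmetic coder a real tool would use. The point it illustrates is the determinism guess: the decoder calls the same `predict` on the text decoded so far, so it sees the same probabilities the encoder saw.

```python
import math
from fractions import Fraction

ALPHABET = "abcdefghijklmnopqrstuvwxyz ."  # hypothetical toy vocabulary


def predict(context):
    """Stand-in for the language model (an assumption, not an LLM).

    Returns (symbol, probability) pairs from smoothed counts of the context.
    It is deterministic: the same context always gives the same distribution.
    """
    counts = {s: 1 for s in ALPHABET}
    for s in context:
        counts[s] += 1
    total = sum(counts.values())
    return [(s, Fraction(c, total)) for s, c in counts.items()]


def compress(text):
    """Arithmetic-code `text` (characters must come from ALPHABET)."""
    low, width = Fraction(0), Fraction(1)
    for i, sym in enumerate(text):
        cum = Fraction(0)
        for s, p in predict(text[:i]):
            if s == sym:
                # Narrow the interval to this symbol's slice of it.
                low, width = low + width * cum, width * p
                break
            cum += p
    # Pick the shortest k-bit binary fraction n / 2^k inside [low, low + width).
    k = 0
    while Fraction(1, 2 ** k) > width:
        k += 1
    n = math.ceil(low * 2 ** k)
    return bin(n)[2:].zfill(k), len(text)


def decompress(bits, length):
    """Invert `compress` by replaying the same model on the decoded prefix."""
    value = Fraction(int(bits, 2), 2 ** len(bits))
    low, width = Fraction(0), Fraction(1)
    out = []
    for _ in range(length):
        cum = Fraction(0)
        for s, p in predict(out):
            lo = low + width * cum
            if lo <= value < lo + width * p:
                # The code value falls in this symbol's slice, so emit it.
                out.append(s)
                low, width = lo, width * p
                break
            cum += p
    return "".join(out)


bits, n = compress("hello world.")
print(len(bits), "bits for", n, "characters")
print(decompress(bits, n))
```

The better the model predicts the next symbol, the wider each chosen slice is and the fewer bits the final interval needs. If `predict` gave even slightly different probabilities at decode time, the slices would shift and the decoder could pick the wrong symbol, which would explain why a deterministic model matters.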

2 comments

kvemkon 577 days ago

> ts_zip

Discussed (once more) in a neighbor thread: https://news.ycombinator.com/item?id=42549083


cyptus 577 days ago

Isn’t an LLM itself basically a compression of the text on the internet? You can download the model and decompress the (larger) content with compute power, though lossily.


kianN 577 days ago

Yeah, that’s exactly how I think of LLMs in my head: lossy compression that interpolates in order to fill in gaps. Hallucination is simply interpolation error, which is guaranteed in lossy compression.
