by kianN
530 days ago
For those wondering how it works:

> The language model predicts the probabilities of the next token. An arithmetic coder then encodes the next token according to the probabilities. [1]

It's also mentioned that the model is configured to be deterministic, which I would guess is how decompression is able to map a set of token likelihoods back to the original token?

[1] https://bellard.org/ts_zip/
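To make the determinism point concrete, here is a minimal sketch of the idea, not ts_zip's actual code. The `predict` function is a hypothetical toy stand-in for the language model (adaptive counts over a four-symbol alphabet), and exact fractions replace the fixed-precision arithmetic a real coder would use. The decoder recovers each token by re-running the same `predict` on the tokens it has decoded so far and checking which sub-interval the encoded number falls into, so it only works if both sides compute identical probabilities.

```python
import math
from fractions import Fraction

# Hypothetical toy alphabet; a real system would use the model's tokenizer.
ALPHABET = ["a", "b", "c", "<eos>"]

def predict(history):
    """Toy stand-in for the language model (an assumption for illustration).
    Deterministic: the same history always yields the same distribution."""
    counts = {s: 1 + history.count(s) for s in ALPHABET}
    total = sum(counts.values())
    return [(s, Fraction(counts[s], total)) for s in ALPHABET]

def encode(tokens):
    """Narrow [low, low + width) once per token, then once for <eos>."""
    low, width = Fraction(0), Fraction(1)
    history = []
    for tok in tokens + ["<eos>"]:
        cum = Fraction(0)
        for s, p in predict(history):
            if s == tok:
                low += width * cum   # move to the start of tok's slice
                width *= p           # shrink to the size of tok's slice
                break
            cum += p
        history.append(tok)
    return low, width

def to_bits(low, width):
    """Shortest bit string b such that 0.b (binary) lies in the interval."""
    k = 1
    while True:
        n = math.ceil(low * 2**k)
        if Fraction(n, 2**k) < low + width:
            return format(n, f"0{k}b")
        k += 1

def decode(bits):
    """Replay the same predictions and pick the slice containing the value."""
    value = Fraction(int(bits, 2), 2 ** len(bits))
    low, width = Fraction(0), Fraction(1)
    history = []
    while True:
        cum = Fraction(0)
        for s, p in predict(history):
            start = low + width * cum
            if start <= value < start + width * p:
                low, width = start, width * p
                break
            cum += p
        if s == "<eos>":
            return history
        history.append(s)

tokens = list("abacab")
bits = to_bits(*encode(tokens))
assert decode(bits) == tokens  # round trip only works with identical predictions
```

If `predict` returned even slightly different probabilities on the decoding side, the slice boundaries would shift and the decoder could pick a different token, after which every later prediction would diverge as well.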
Discussed (once more) in a neighbor thread: https://news.ycombinator.com/item?id=42549083