by gliptic
536 days ago
The model gives you a probability distribution over tokens. You could feed that directly to an arithmetic coder, but there are ways to convert it into a distribution over, e.g., the next byte instead, which would improve efficiency further by removing the redundancy of alternative token encodings (the same text can usually be tokenized in more than one way, and the model spends probability on all of them). ts_zip does this, and the README says this works similarly to ts_zip. EDIT: Hm, or maybe ts_zip just uses the token probabilities directly. I thought it was slightly more efficient about it. "The language model predicts the probabilities of the next token. An arithmetic coder then encodes the next token according to the probabilities."
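To make the redundancy point concrete, here is a minimal sketch with a made-up toy model, not how ts_zip or any real compressor is implemented. The vocabulary, the probabilities, and the function names are all assumptions for illustration. It compares the ideal arithmetic-coding cost of one specific tokenization of a string against the cost when the probability is summed over every tokenization of that string, which is what a byte-level distribution effectively gives you.

```python
import math

# Hypothetical toy unigram "model": each token has a fixed probability,
# independent of context. "<eos>" marks the end of the message.
TOKEN_PROBS = {"a": 0.3, "b": 0.3, "ab": 0.2, "<eos>": 0.2}


def tokenization_prob(tokens):
    """Probability of one specific token sequence followed by <eos>."""
    p = TOKEN_PROBS["<eos>"]
    for tok in tokens:
        p *= TOKEN_PROBS[tok]
    return p


def string_prob(text):
    """Probability of the text summed over all of its tokenizations."""
    # reach[i] = total probability of emitting text[:i] via any token sequence
    reach = [0.0] * (len(text) + 1)
    reach[0] = 1.0
    for i in range(len(text)):
        if reach[i] == 0.0:
            continue
        for tok, p in TOKEN_PROBS.items():
            if tok != "<eos>" and text.startswith(tok, i):
                reach[i + len(tok)] += reach[i] * p
    return reach[-1] * TOKEN_PROBS["<eos>"]


def bits(p):
    """Ideal code length in bits for an event of probability p."""
    return -math.log2(p)


# Coding only the single-token encoding ["ab"] ignores the mass on ["a", "b"].
canonical = tokenization_prob(["ab"])
marginal = string_prob("ab")

print(f"token-level cost:  {bits(canonical):.3f} bits")
print(f"string-level cost: {bits(marginal):.3f} bits")
```

In this toy setup the single-token encoding has probability 0.2 * 0.2 = 0.04, while the string as a whole has 0.04 + 0.3 * 0.3 * 0.2 = 0.058, so coding against the summed distribution is cheaper. A real implementation would have to do this marginalization incrementally, per byte and with a context-dependent model, which is considerably more involved than this sketch.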