Hacker News
by eru 1 day ago
> If you use an LLM just to provide an estimation for the frequencies of tokens in an input data stream, [...]

Why would you use an LLM for that? The whole point of an LLM is to encode contextual probabilities. So basically: given this prefix of text, what are the probabilities of the next tokens? You can sample from this conditional probability distribution to create plausible text, or you can use it for lossless compression. The math is very similar.
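
To make the "same distribution, two uses" point concrete, here is a rough sketch. It is not an LLM: a toy character bigram model with add-one smoothing stands in for the conditional distribution, and all function names are made up for illustration. The same `next_probs` feeds both the sampler and the code-length estimate (the sum of -log2 p, which is what an ideal arithmetic coder would approach), and a context-free frequency model is included for comparison.

```python
import math
import random
from collections import Counter, defaultdict


def train_bigram(text):
    """Count how often each character follows each previous character."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(text, text[1:]):
        counts[prev][nxt] += 1
    return counts


def next_probs(counts, prev, alphabet):
    """P(next | prev), with add-one smoothing so no symbol gets probability zero."""
    c = counts.get(prev, Counter())
    total = sum(c.values()) + len(alphabet)
    return {ch: (c[ch] + 1) / total for ch in alphabet}


def sample(counts, alphabet, start, n, rng):
    """Use 1: generate text by repeatedly sampling from P(next | prev)."""
    out = [start]
    for _ in range(n):
        p = next_probs(counts, out[-1], alphabet)
        out.append(rng.choices(list(p), weights=list(p.values()))[0])
    return "".join(out)


def contextual_bits(counts, alphabet, text):
    """Use 2: ideal compressed size in bits, the sum of -log2 P(next | prev)."""
    return sum(
        -math.log2(next_probs(counts, prev, alphabet)[nxt])
        for prev, nxt in zip(text, text[1:])
    )


def frequency_bits(text):
    """Baseline: ideal size when only context-free symbol frequencies are used."""
    tail = text[1:]  # same symbols that contextual_bits encodes
    freq = Counter(tail)
    return sum(-math.log2(freq[ch] / len(tail)) for ch in tail)


if __name__ == "__main__":
    text = "ab" * 50
    alphabet = sorted(set(text))
    counts = train_bigram(text)
    print(sample(counts, alphabet, "a", 20, random.Random(0)))
    print(contextual_bits(counts, alphabet, text), frequency_bits(text))
```

On strongly patterned input like the repeated "ab" above, the contextual model needs far fewer bits than the plain frequency model, because the previous symbol makes the next one almost certain. An LLM plays the same role with a much longer prefix and a much better predictor.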