by semiinfinitely
129 days ago
An LLM can be used to losslessly compress a string to roughly the number of bits given by its next-token prediction loss over that string (the sum of -log2 p(token | prefix), plus a couple of bits of coder overhead), by feeding the model's predicted probabilities to an arithmetic coder. It's SOTA compression for the distribution of strings found on the internet. An insightful video on the topic: https://www.youtube.com/watch?v=dO4TPJkeaaU
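To make the idea concrete, here is a minimal sketch of arithmetic coding driven by a predictive model. Everything in it is an assumption for illustration: `ToyModel` is a hypothetical adaptive character-frequency model standing in for the LLM, the alphabet is fixed, exact `Fraction` arithmetic replaces the fixed-precision integer coder a real implementation would use, and the message length is passed to the decoder out of band. The point is only that the output length tracks the model's summed -log2 p, and that the decoder recovers the text by running the same model.

```python
import math
from fractions import Fraction

# Hypothetical fixed alphabet; characters outside it raise KeyError.
ALPHABET = "abcdefghijklmnopqrstuvwxyz ."


class ToyModel:
    """Adaptive character-frequency model (a stand-in for an LLM's next-token distribution)."""

    def __init__(self):
        self.counts = {ch: 1 for ch in ALPHABET}  # start every count at 1 (Laplace smoothing)

    def probs(self):
        total = sum(self.counts.values())
        return [(ch, Fraction(c, total)) for ch, c in self.counts.items()]

    def update(self, ch):
        self.counts[ch] += 1


def encode(text):
    """Return (bit string, ideal code length in bits under the model)."""
    model = ToyModel()
    low, width = Fraction(0), Fraction(1)
    ideal_bits = 0.0
    for ch in text:
        cum = Fraction(0)
        for sym, p in model.probs():
            if sym == ch:
                low += width * cum      # narrow the interval to this symbol's slice
                width *= p
                ideal_bits += -math.log2(p)
                break
            cum += p
        model.update(ch)
    # Emit the shortest binary fraction n / 2**k that lies in [low, low + width).
    k = 0
    while True:
        n = math.ceil(low * 2**k)
        if Fraction(n, 2**k) < low + width:
            bits = format(n, "b").zfill(k) if k else ""
            return bits, ideal_bits
        k += 1


def decode(bits, length):
    """Invert encode(); `length` is the number of characters to recover."""
    value = Fraction(int(bits, 2), 2 ** len(bits)) if bits else Fraction(0)
    model = ToyModel()
    low, width = Fraction(0), Fraction(1)
    out = []
    for _ in range(length):
        cum = Fraction(0)
        for sym, p in model.probs():
            # The first slice whose upper bound exceeds value is the one containing it.
            if low + width * (cum + p) > value:
                low += width * cum
                width *= p
                out.append(sym)
                model.update(sym)
                break
            cum += p
    return "".join(out)


if __name__ == "__main__":
    text = "the quick brown fox jumps over the lazy dog."
    bits, ideal = encode(text)
    print(len(bits), "bits emitted;", round(ideal, 2), "bits of model loss")
    print(decode(bits, len(text)) == text)
```

Swapping `ToyModel` for a model that predicts better shrinks `ideal_bits`, and the emitted bit count shrinks with it. That is why a strong LLM makes a strong compressor, as long as the decoder has access to the exact same model.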