by andreygrehov
240 days ago
I don’t know what I’m talking about (pure fantasy), but what if you train a model on compressed data and then perform inference on compressed data as well? Could this work, with the output also being compressed and then decompressed by the client?
|
Starting with already-compressed data doesn't necessarily mean fewer tokens. You can probably assume similar (or probably worse) entropy when expanding "dictionary words" from a compressed stream versus "tokens" from a plaintext stream.
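
If anyone wants to poke at this, here is a rough sketch of the comparison. It is an illustration only, under some loud assumptions: `zlib` stands in for whatever compressor you would actually use, whitespace-separated words stand in for real model tokens (a real subword tokenizer would give different counts), and `byte_entropy` and `compare` are made-up helper names. It measures the empirical per-byte entropy of the plaintext and of the compressed stream, next to the sizes.

```python
import math
import zlib
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Empirical Shannon entropy of a byte string, in bits per byte."""
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def compare(text: str) -> dict:
    """Compare plaintext and zlib-compressed views of the same text.

    Assumption: whitespace-separated words are a crude stand-in for
    model tokens; a real tokenizer would produce different counts.
    """
    raw = text.encode("utf-8")
    packed = zlib.compress(raw, 9)  # level 9 = strongest zlib compression
    return {
        "raw_bytes": len(raw),
        "compressed_bytes": len(packed),
        "raw_entropy": byte_entropy(raw),
        "compressed_entropy": byte_entropy(packed),
        "word_tokens": len(text.split()),
    }


if __name__ == "__main__":
    sample = "the quick brown fox jumps over the lazy dog " * 20
    for name, value in compare(sample).items():
        print(f"{name}: {value}")
```

The thing to look at is the entropy per byte on each side. A stream with fewer bytes but more information packed into each one does not obviously give a model less to predict, which is the point above about fewer bytes not necessarily meaning fewer (or easier) tokens.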