by ComplexSystems
333 days ago
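Good stuff. You could get much better bandwidth than this by tokenizing and using something like a Huffman or arithmetic code on token frequencies. As a simple example, if you set your tokens to be all English words (let's say there are between 500k and 1 million), a fixed-length index would cost about 19-20 bits per word, since log2 of 500k to 1 million is roughly 19 to 20. Coding on word frequencies instead should bring the average down to something like 9-10 bits per word. I am sure you could do much better than this as well.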
https://arxiv.org/abs/2306.04050
https://bellard.org/ts_zip/
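
To make the frequency-coding idea concrete, here is a rough sketch in Python that builds a Huffman code over whitespace-split words and compares its average cost with a fixed-length index. The function names (`huffman_code_lengths`, `bits_per_token`) and the lowercase-and-split tokenizer are my own assumptions for illustration, not something from the original post or the links above, and the numbers you get depend entirely on the text you feed in.

```python
import heapq
import math
from collections import Counter


def huffman_code_lengths(freqs):
    """Return {token: code length in bits} for a Huffman code over freqs."""
    if len(freqs) == 1:
        # A lone symbol still needs one bit to be emitted at all.
        return {tok: 1 for tok in freqs}
    # Heap entries are (weight, tiebreak, {token: depth so far}).
    # The unique tiebreak integer keeps the dicts from ever being compared.
    heap = [(w, i, {tok: 0}) for i, (tok, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        # Merging two subtrees pushes every leaf in them one level deeper.
        merged = {t: d + 1 for t, d in a.items()}
        merged.update({t: d + 1 for t, d in b.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def bits_per_token(text):
    """Return (fixed-length bits, Huffman average bits, entropy) per word."""
    tokens = text.lower().split()  # hypothetical tokenizer: words by whitespace
    freqs = Counter(tokens)
    total = sum(freqs.values())
    lengths = huffman_code_lengths(freqs)
    huffman = sum(freqs[t] * lengths[t] for t in freqs) / total
    entropy = -sum((c / total) * math.log2(c / total) for c in freqs.values())
    fixed = max(1, math.ceil(math.log2(len(freqs))))
    return fixed, huffman, entropy


if __name__ == "__main__":
    fixed, huffman, entropy = bits_per_token("a a a a b b c d")
    print(f"fixed={fixed} huffman={huffman:.2f} entropy={entropy:.2f}")
```

The Huffman average always lands within one bit of the unigram entropy, so an arithmetic coder on the same frequencies can only shave off that last fraction of a bit. The bigger wins come from better probability models, which is what the two links are about.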