|
|
|
|
|
by coder543
530 days ago
|
|
For compressing short (<100 bytes), repetitive strings, you could potentially train a zstd dictionary on your dataset, and then use that same dictionary for all rows. Of course, you’d want to disable several zstd defaults, like outputting the zstd header, since every single byte counts for short string compression. |
|