by pstch
2389 days ago
An interesting, though not surprising, thing about this is that compression algorithms can be more efficient on wider representations of numerically high code points (e.g., for some Korean corpora, using UTF-32 instead of UTF-8 improves LZMA compression by ~10%).
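For anyone who wants to check this on their own data, here is a minimal sketch using Python's standard `lzma` module. The function name `compressed_sizes` and the sample string are my own placeholders, not anything from a real benchmark. A short repeated sentence like this one compresses trivially in either encoding, so it will not reproduce the ~10% figure; substitute a real corpus to get a meaningful comparison.

```python
import lzma

def compressed_sizes(text: str) -> dict:
    """Return the LZMA-compressed size in bytes of `text` under each encoding."""
    sizes = {}
    # "utf-32-le" is used instead of "utf-32" so no byte-order mark is prepended.
    for encoding in ("utf-8", "utf-32-le"):
        raw = text.encode(encoding)
        sizes[encoding] = len(lzma.compress(raw, preset=9))
    return sizes

if __name__ == "__main__":
    # Placeholder text: replace with the contents of an actual Korean corpus.
    sample = "안녕하세요. 이것은 압축 실험을 위한 짧은 예시 문장입니다. " * 200
    for encoding, size in compressed_sizes(sample).items():
        print(f"{encoding}: {size} bytes after LZMA")
```

Hangul syllables take 3 bytes each in UTF-8 and 4 bytes each in UTF-32, so the UTF-32 input is larger before compression; the comparison that matters is the size after compression.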