by davidjohnstone
1579 days ago
To improve compression of a sorted list of words, you can replace the initial letters that each word shares with the word above it with spaces before compression, then add them back as an extra step after decompression. For example, if the previous word was "apple", the entry for "apply" will be " y" (edit: HN removes extra spaces, so this should be four spaces + "y"), and "apple" itself will probably already be entered as " le" (three spaces + "le"). In theory a compression algorithm could handle this automatically, but in practice doing it by hand gives better compression. This is conceptually similar to what OP does by storing the (numerical) difference between the words.

Also, if you have a list of numbers that aren't random, it generally compresses better if you turn it into a list of the differences between consecutive numbers.

A simple compression algorithm (miniLZO is apparently 6KB compiled) might be small enough, and save enough bytes, to make it worth it for OP.
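Here is a rough sketch of both tricks in Python, in case it helps. The function names are made up for illustration, zlib stands in for whatever compressor you would actually ship (it is not miniLZO), and it assumes the words are sorted and contain no spaces themselves.

```python
import zlib


def strip_shared_prefixes(words):
    """Replace the prefix each word shares with the previous word by spaces."""
    out = []
    prev = ""
    for word in words:
        n = 0
        limit = min(len(prev), len(word))
        while n < limit and prev[n] == word[n]:
            n += 1
        out.append(" " * n + word[n:])
        prev = word
    return out


def restore_shared_prefixes(lines):
    """Undo strip_shared_prefixes: leading spaces mean 'copy from the word above'."""
    out = []
    prev = ""
    for line in lines:
        rest = line.lstrip(" ")
        n = len(line) - len(rest)  # number of letters to copy from prev
        word = prev[:n] + rest
        out.append(word)
        prev = word
    return out


def to_deltas(numbers):
    """Turn a list of numbers into differences between consecutive entries."""
    out = []
    prev = 0
    for x in numbers:
        out.append(x - prev)
        prev = x
    return out


def from_deltas(deltas):
    """Undo to_deltas with a running sum."""
    out = []
    total = 0
    for d in deltas:
        total += d
        out.append(total)
    return out


def compressed_size(lines):
    """Size in bytes of the newline-joined lines after zlib compression."""
    return len(zlib.compress("\n".join(lines).encode("utf-8"), 9))
```

To check whether it pays off for a given word list, compare `compressed_size(words)` with `compressed_size(strip_shared_prefixes(words))`. The gain depends on the data, so measure it on the real list rather than assuming a win.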
If you want to skim, check out the EXAMPLE section toward the bottom.