by cettox
4362 days ago
As many have pointed out using the pigeonhole principle, it is not practical to build a compression index (a lookup table that maps actual data to addresses, preferably shorter than the sequences they stand for) out of every possible n-byte sequence in your data, because the index would be at least as large as the original data. The only way to get a smaller index is to look for recurrences and include only the most frequently recurring sequences, up to some limit (there is a tradeoff here, and an optimal limit for the compression ratio), leaving the other sequences uncompressed. Only this way can you achieve a compression ratio below 100%.
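To make the tradeoff concrete, here is a minimal Python sketch, not a real compressor. The assumptions are mine and not taken from any actual tool: fixed-size, non-overlapping n-byte chunks, a one-byte flag in front of every chunk, one-byte index addresses (so at most 256 entries), and the hypothetical names `build_index`, `compress`, `decompress` and `total_size`.

```python
from collections import Counter


def build_index(data, n, k):
    """Return the k most frequent non-overlapping n-byte chunks (k <= 256)."""
    chunks = [data[i:i + n] for i in range(0, len(data) - n + 1, n)]
    return [chunk for chunk, _ in Counter(chunks).most_common(k)]


def compress(data, index, n):
    """Replace indexed chunks by a short address; leave the rest as literals."""
    codes = {chunk: i for i, chunk in enumerate(index)}
    out = bytearray()
    for i in range(0, len(data), n):
        chunk = data[i:i + n]  # the final chunk may be shorter than n
        if chunk in codes:
            out += bytes([1, codes[chunk]])  # flag 1 + one-byte address
        else:
            out += b"\x00" + chunk           # flag 0 + raw bytes
    return bytes(out)


def decompress(blob, index, n):
    """Invert compress() given the same index and chunk size."""
    out = bytearray()
    i = 0
    while i < len(blob):
        if blob[i] == 1:
            out += index[blob[i + 1]]
            i += 2
        else:
            out += blob[i + 1:i + 1 + n]  # slicing stops at the end of blob
            i += 1 + n
    return bytes(out)


def total_size(data, n, k):
    """Size of the stored index plus the compressed body, in bytes."""
    index = build_index(data, n, k)
    return len(index) * n + len(compress(data, index, n))
```

With `data = b"abcd" * 100 + bytes(range(40))` (440 bytes) and `n = 4`, indexing only the single most frequent chunk gives `total_size(data, 4, 1) == 254`, while indexing all 11 distinct chunks gives 264, so adding the rare sequences costs more than it saves. For `bytes(range(256))`, where no chunk repeats, indexing everything gives 384 bytes, which is larger than the 256-byte input: the index alone is already as big as the data.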