|
my COmpREss algorithm CORE here it is: a sequence of bits that is not necessarily random can be made to look random - the bits are lined up with pseudorandom bits and each one is XORed in turn example data: 01011010 .. 01001000 pseudorandom bits: 10101110 .. 00100111 ... resulting bits: 11110100 .. 01101111 this is reversible and the pseudorandom bits take up no storage space in the compressed file since they can be replicated on demand "run": a run ends with the first zero bit encountered example: 01011010 is broken up into 4 runs (0, 10, 110, 10) probability of two runs taken together being of length two is .25 , three is .25 , four is .1875 , and so on ... for concreteness we make up some perfectly representative data using the following convention. i list the length of 2 runs taken together in this part. for instance, our ascii "Z" = 01011010 is broken up as 010 , 11010. now we have 3 , 5 to write in its place. note that this is ambiguous because 3 , 5 might equally represent 100 , 01110 (for just one example) "perfect data": 1,1,1,1,1,1,1,1,2,2,2,2,3,3,4,6 1,1,1,1,1,2,1,2,1,3,1,4,2,3,2,6 2,2,3,3,4,5,5,8 |