by mafuy 565 days ago
Not really a good point. If the order of bytes does not matter, then I can compress any file of your liking to O(log n) size :P
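
To make the joke concrete, here is a rough Python sketch of that "compressor". The function names are made up for illustration. It assumes byte order really is irrelevant, so all it keeps is a count per byte value: at most 256 counters, each needing O(log n) bits for an n-byte file.

```python
from collections import Counter


def order_free_summary(data: bytes) -> dict[int, int]:
    """Keep only how many times each byte value occurs (hypothetical helper)."""
    return dict(Counter(data))


def rebuild_any_order(counts: dict[int, int]) -> bytes:
    """Emit the bytes in sorted order; only 'correct' if order does not matter."""
    return b"".join(bytes([value]) * n for value, n in sorted(counts.items()))


data = b"abracadabra" * 1000
counts = order_free_summary(data)
rebuilt = rebuild_any_order(counts)

# Same multiset of bytes, but the original ordering is gone for good.
same_bytes = sorted(rebuilt) == sorted(data)
same_order = rebuilt == data
```

The rebuilt file has the same bytes as the input but not the same order, which is exactly why this is not compression in any useful sense.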

Wait, whose point are you saying is not good?

I'm saying order does matter and it's the only thing that matters about the separate files using this code.

I think the question is, if you remove the filenames entirely, how do you keep the parts ordered?

(Someone else suggested sorting them by file size.)

You have to be storing them outside a traditional filesystem to not have filenames, so the way you keep them ordered depends on what your storage mechanism is.

For example, you could store them in a Set object in many programming languages, one that preserves insertion order. Or you could be extracting them one by one from a tar file that has blank filenames stored in it.

For the Set example, where would the insertion order come from? For the tar file, the tar file would be larger than the input file it's supposed to be "compressing".

> For the Set example, where would the insertion order come from?

It would come from however the files were transferred from competitor computer to verifier computer.

> For the tar file, the tar file would be larger than the input file it's supposed to be "compressing".

It sure would be! I don't see how that's relevant to the filename discussion though?

The whole point of the filename discussion is that it's a trick to "compress" the data, such that the sum of the file sizes of the input files is smaller than the file size of the decompressed output file.

Neither of your ideas works with this.

If you just transfer the files in order, you still need to delineate the start and end of each file, which takes more space than the byte you are removing. The same goes for the tar file.
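
A rough Python sketch of that accounting, under my own assumptions: the "compressor" splits the input on one byte value (here `5`), and the parts are sent as a single stream with a 4-byte length prefix per part. The helper names and the framing format are hypothetical, not anything from the actual challenge.

```python
def split_on_byte(data: bytes, sep: int) -> list[bytes]:
    """Drop every occurrence of `sep` and return the pieces between them."""
    return data.split(bytes([sep]))


def frame(parts: list[bytes]) -> bytes:
    """Concatenate parts, each preceded by a 4-byte big-endian length (assumed format)."""
    return b"".join(len(p).to_bytes(4, "big") + p for p in parts)


def unframe(stream: bytes) -> list[bytes]:
    """Inverse of frame(): recover the parts in their original order."""
    parts, i = [], 0
    while i < len(stream):
        n = int.from_bytes(stream[i:i + 4], "big")
        parts.append(stream[i + 4:i + 4 + n])
        i += 4 + n
    return parts


data = b"12345678905432150005"
sep = ord("5")
parts = split_on_byte(data, sep)
framed = frame(parts)

payload = sum(len(p) for p in parts)
saved = len(data) - payload        # one byte per removed separator
overhead = len(framed) - payload   # four bytes per part for the length prefix
```

Here `saved` is 4 and `overhead` is 20, so the framed stream is bigger than the original. Shrinking the prefix does not rescue it: any byte-aligned length prefix costs at least one byte per part, and there is always one more part than there are removed separators.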