by Dylan16807 564 days ago
> For the Set example, where would the insertion order come from?

It would come from however the files were transferred from competitor computer to verifier computer.

> For the tar file, the tar file would be larger than the input file it's supposed to be "compressing".

It sure would be! I don't see how that's relevant to the filename discussion though?


The whole point of the filename discussion is that it's a trick to "compress" the data, such that the sum of the sizes of the input files is smaller than the size of the decompressed output file.

Neither of your ideas works with this.

If you just transfer the files in order, you need to delineate the start and end of each file, which takes more space than the byte you are removing. The same goes for the tar file.
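
To make the framing argument concrete, here is a minimal sketch in JavaScript. It assumes a hypothetical one-byte length prefix per blob; the `frame` helper and the sample string are made up for illustration and are not from the OP's script. Splitting on `5` saves one byte per separator, but storing the boundaries in-band costs at least one byte per blob.

```javascript
// Hypothetical sample data; any string containing the byte '5' would do.
const original = "12345678905512345";

// The trick: drop every '5' and keep the pieces as separate blobs.
const pieces = original.split("5");

// Total size of the blobs alone: one byte saved per removed separator.
const payloadBytes = pieces.reduce((sum, p) => sum + p.length, 0);

// Assumed framing: a 1-byte length prefix before each blob
// (only valid for blobs shorter than 256 bytes).
function frame(parts) {
  return parts.map((p) => String.fromCharCode(p.length) + p).join("");
}

const framed = frame(pieces);

console.log("original:", original.length);
console.log("blobs only:", payloadBytes);
console.log("single framed stream:", framed.length);
```

Under this assumed framing, k blobs remove k - 1 separators but add k prefix bytes, so the framed stream is one byte longer than the original. Other framings (tar headers, for instance) cost far more per file.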

> The whole point of the filename discussion is that it's a trick to "compress" the data

I think you've fundamentally misunderstood my point.

You seem to think I'm trying to make it work without a trick, but I'm not doing that. I'm saying yes there is a trick, but the trick is not based around filenames. The trick needs a sequence of variable-size blobs of bytes, and there are a lot of ways to maintain a sequence.

The trick in the OP is absolutely based around filenames, as a source of ordering the input files. I agree that you can use the same idea if you can order the files a different way, but I don't understand why that is significant.

Because it seems very unfair to call it "filename shenanigans". The filenames are only there to point the script at the right file. Filename shenanigans would be something like putting actual bytes into the filename.

If you patched `cat` to ignore filename and just spit out each file as given, the script would still work without a single change. If you slightly changed the script to loop over the results of `ls`, it could still be compatible with scrambled filenames.

A script that didn't cheat would also be using filenames to a similar level.

In other words the filenames are a completely fair implementation detail. And that detail can be swapped out without changing the trick in any meaningful way.

The trick is based on having a series of variable-sized blobs of bytes. That's all it needs. If I use JavaScript instead of sh, and my decompressor is `[...s].join('5')`, I'm using the same trick.
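
A minimal sketch of that trick in JavaScript, assuming `s` is any ordered sequence of blobs (here a plain array). The `compress` and `decompress` names and the sample string are hypothetical. No filenames appear anywhere; the separators are recovered from the blob boundaries and their order alone.

```javascript
// "Compressor": split the data on the byte '5', discarding each separator.
function compress(data) {
  return data.split("5");
}

// "Decompressor": the expression from the comment above, with s as an
// ordered sequence of blobs.
function decompress(s) {
  return [...s].join("5");
}

const data = "a5bc5def"; // hypothetical sample input
const s = compress(data);
const totalBlobBytes = s.reduce((sum, blob) => sum + blob.length, 0);

console.log("round-trips:", decompress(s) === data);
console.log("bytes saved:", data.length - totalBlobBytes);
```

The bytes "saved" always equal the number of separators removed, which is one less than the number of blobs.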

> If you patched `cat` to ignore filename and just spit out each file as given, the script would still work without a single change. If you slightly changed the script to loop over the results of `ls`, it could still be compatible with scrambled filenames.

This isn't true! If you scrambled the filenames, the files would be put together in the wrong order and the result would be incorrect. You would also need to separately transmit the order in which the files are put together, and that, together with the size of the files themselves, would be greater than the size of the output.

The key thing here is that the trick works by storing the information of how the blobs are ordered out-of-band. In the OP, that out-of-band place to store the blob order is filename. In your JS example of `[...s].join('5')`, where does the order of [...s] come from? It's not something you can hand-wave away, it's the key thing that makes the trick work.
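
A small sketch of why the order matters, assuming the same split-on-`5` scheme (the sample data and variable names are hypothetical). Joining the same blobs in a different order produces a different output, so the order has to be preserved somewhere, whether by filenames or by some other means.

```javascript
const input = "a5bc5def";       // hypothetical sample input
const parts = input.split("5"); // ["a", "bc", "def"]

// Same blobs, different order (as if the ordering information were lost).
const scrambled = [parts[2], parts[0], parts[1]];

const restored = parts.join("5");
const wrong = scrambled.join("5");

console.log("original order matches:", restored === input);
console.log("scrambled order matches:", wrong === input);
```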

> This isn't true! If you scrambled the filenames

I said "could" because you'd have to either do a limited scramble or hotwire ls to use the right order despite the scrambling. Or sort by date or inode, probably.

> The key thing here is that the trick works by storing the information of how the blobs are ordered out-of-band.

Yes. That is the key, not the filenames.

> In the OP, that out-of-band place to store the blob order is filename.

It is, but the actual use of filenames is not a shenanigan, and the blob order could be easily accomplished without any particular filenames.

> In your JS example of `[...s].join('5')`, where does the order of [...s] come from? It's not something you can hand-wave away, it's the key thing that makes the trick work.

It comes from the process of loading the blobs onto the computer. I'm not trying to hand-wave it, I'm saying it doesn't need filenames or anything resembling filenames. Maybe it came from a tar. Maybe I sent each file in a separate email. All that matters is having an order, and having an order happens by default when you have multiple files. As long as you don't go out of your way to reorder things, the trick works.