by predakanga 4573 days ago
Unfortunately it's not as simple as that - when asking each other for data, the individual peers ask for a particular 'piece' of the torrent, where that piece isn't relative to a given file, but to the torrent as a whole.

The files are concatenated into one long stream, and the piece number is an index to that, with no guarantees about alignment.

For instance, if you have a torrent (we'll call it 'X') with three files: the 4mb file 'a', the 3mb file 'b' and the 1mb file 'c', and two separate torrents ('Y' and 'Z') describing files 'b' and 'c' separately, then the pieces would map something like this:

'Y' piece 1 -> 'X' piece 17
'Z' piece 1 -> 'X' piece 29

That's an absolute best case scenario though - in most cases, file sizes aren't quite as perfect as that (each being a multiple of the default piece size, 256kb). If 'b' just happened to be 1373kb, or anything else that wasn't a multiple of 256kb, then every file after it starts partway through a piece, so its pieces aren't addressable from other torrents.
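The mapping described above can be sketched in a few lines of Python. This is only an illustration under the comment's own assumptions (256kb pieces, files laid end to end); `FileSpan` and `map_files_to_pieces` are hypothetical names, not part of any BitTorrent library, and pieces are numbered from 1 to match the example (the wire protocol counts from 0).

```python
from typing import List, NamedTuple, Tuple

KB = 1024
PIECE_SIZE = 256 * KB  # piece size assumed in the comment above


class FileSpan(NamedTuple):
    name: str
    first_piece: int      # 1-based, matching the numbering in the comment
    last_piece: int       # 1-based, inclusive
    offset_in_first: int  # bytes into first_piece where the file starts


def map_files_to_pieces(files: List[Tuple[str, int]],
                        piece_size: int = PIECE_SIZE) -> List[FileSpan]:
    """Lay non-empty files end to end and report which pieces each one touches."""
    spans = []
    start = 0
    for name, size in files:
        end = start + size  # exclusive end offset in the stream
        spans.append(FileSpan(
            name,
            start // piece_size + 1,
            (end - 1) // piece_size + 1,
            start % piece_size,
        ))
        start = end
    return spans


# Best case: every size is a multiple of the piece size.
aligned = map_files_to_pieces([("a", 4096 * KB), ("b", 3072 * KB), ("c", 1024 * KB)])
# Same torrent, but 'b' is 1373kb.
unaligned = map_files_to_pieces([("a", 4096 * KB), ("b", 1373 * KB), ("c", 1024 * KB)])

for span in aligned + unaligned:
    print(span)
```

In the aligned layout 'b' starts at piece 17 and 'c' at piece 29, both at offset 0. With the 1373kb 'b', 'c' starts 93kb into piece 22, so none of its pieces line up with the pieces of 'Z'.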


Why not?

You just have at most two blocks of additional overhead.

You would have to know where the file begins and ends within the downloaded blocks, but that's already in the torrent file.

Because the hashes that are stored in the .torrent are computed over that unaligned data.

In practice, what this means is that you can't verify that two files of the same name and size but at different alignments within the consolidated data stream are identical; you can't compare hashes, can't do anything without first downloading. This opens the door to mass poisoning of swarms without even having to enter them in the first place.
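A toy demonstration of the hash problem, assuming SHA-1 per piece over the concatenated stream as in classic torrents. The 4-byte piece size and the file contents are made up purely to keep the example small, and `piece_hashes` is a hypothetical helper.

```python
import hashlib
from typing import List


def piece_hashes(files: List[bytes], piece_size: int) -> List[bytes]:
    """Concatenate the files and hash the stream one fixed-size piece at a time."""
    stream = b"".join(files)
    return [hashlib.sha1(stream[i:i + piece_size]).digest()
            for i in range(0, len(stream), piece_size)]


PIECE = 4                 # toy piece size, far smaller than any real torrent
shared = bytes(range(8))  # the file that appears in more than one torrent

standalone = piece_hashes([shared], PIECE)
aligned = piece_hashes([b"AAAAAAAA", shared], PIECE)  # preceding file is 2 full pieces
misaligned = piece_hashes([b"AAAAA", shared], PIECE)  # preceding file is 5 bytes

print(aligned[2:] == standalone)          # True
print(set(standalone) & set(misaligned))  # set()
```

When the preceding file fills whole pieces, the shared file's hashes show up unchanged. Shift it by a single byte and not one piece hash is in common, even though the file contents are identical.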

There are potential solutions (including providing a broader hash per-file, as opposed to per-piece), but my statement was only that it's not that simple, not that it's impossible.
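As a rough sketch of the per-file idea, assuming a hypothetical extra metadata field that stores one SHA-1 digest per file (`file_digests` and the layout here are illustrative, not the .torrent format discussed in this thread):

```python
import hashlib
from typing import Dict, List, Tuple


def file_digests(files: List[Tuple[str, bytes]]) -> Dict[str, str]:
    """Hash each file on its own, ignoring where it sits in the piece stream."""
    return {name: hashlib.sha1(data).hexdigest() for name, data in files}


payload = bytes(range(8))

# The same file 'b' behind a 5-byte file in one torrent, and alone in another.
torrent_x = file_digests([("a", b"AAAAA"), ("b", payload)])
torrent_y = file_digests([("b", payload)])

print(torrent_x["b"] == torrent_y["b"])  # True
```

The digest no longer depends on alignment, so two torrents can be compared before downloading anything. The trade-off is that a whole-file hash can only be checked once the entire file has arrived, so it complements per-piece hashes rather than replacing them.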

Why do you want to be completely backwards compatible with classic torrents? Torrent2 can drop some features of classic torrenting, like folder structure, and mandate that each "subtorrent" is basically a single Torrent1 containing only one file and no folder structure.