Hacker News
by 0xb100db1ade 2622 days ago
I love the idea of IPFS, but I can't think of a use case not covered by torrents.

Would someone mind enlightening me regarding what sets IPFS apart from torrents?

5 comments

One practical difference is that all the parts of a "collection" are individually addressable in IPFS.

For example, unlike torrents, you can seed a collection like "My Web Show (All Seasons)" and add new files as new episodes become available. With torrents, you have to repackage them as new torrent files. IPFS also then encourages file canonicalization instead of everyone seeding their own copy of a file.

This is an oversimplification, I think. In IPFS, to add a file to a folder you need to rebuild that folder, which changes its address. You still need to get people to access the new address to see any updates. IPNS makes that fairly easy, but a similar technology could be made for BitTorrent.
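A toy sketch of why adding a file changes the folder's address. This is not IPFS's real object format; `file_address` and `folder_address` are made-up helpers that just hash a sorted listing of child hashes.

```python
import hashlib
from typing import Dict


def file_address(content: bytes) -> str:
    """Toy file address: the SHA-256 of the file's bytes."""
    return hashlib.sha256(content).hexdigest()


def folder_address(files: Dict[str, bytes]) -> str:
    """Toy folder address: hash of the sorted (name, child address) listing."""
    listing = "\n".join(
        f"{name} {file_address(data)}" for name, data in sorted(files.items())
    )
    return hashlib.sha256(listing.encode()).hexdigest()


season = {"ep1.mp4": b"episode one", "ep2.mp4": b"episode two"}
old_root = folder_address(season)

# Adding an episode means building a new listing, hence a new folder address.
updated = {**season, "ep3.mp4": b"episode three"}
new_root = folder_address(updated)

print(old_root != new_root)  # True: the folder address changed
print(file_address(season["ep1.mp4"]) == file_address(updated["ep1.mp4"]))  # True
```

The existing files keep their addresses; only the folder's address moves, which is why some pointer to "the latest folder" is still needed.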

I think what makes IPFS interesting is that all files are like torrents and all folders are like torrents of torrents.

And since each torrent is a hash of the file underneath it, if 100 people individually add files or folders that contain identical chunks, then without explicitly doing anything they are also helping each other share those files.

In traditional torrents, the files are concatenated and only then divided into chunks. So if I take an existing torrent and add a single 16-byte file to the beginning, there is a good chance the new torrent and the old one will have no hashes in common.
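A small sketch of that effect, under toy assumptions: a 64-byte piece size (real torrents use much larger pieces), made-up file contents generated with SHAKE-256, and a hypothetical `piece_hashes` helper. It compares "concatenate, then cut" with hashing each file on its own.

```python
import hashlib

PIECE = 64  # toy piece size in bytes; an assumption for the demo


def piece_hashes(data: bytes, size: int = PIECE) -> set:
    """Cut data into fixed-size pieces and return the set of piece hashes."""
    return {
        hashlib.sha1(data[i:i + size]).hexdigest()
        for i in range(0, len(data), size)
    }


# Deterministic filler standing in for two real files.
files = [hashlib.shake_256(b"a").digest(1000), hashlib.shake_256(b"b").digest(1500)]
tiny = b"x" * 16  # the new 16-byte file placed first

# Concatenate-then-cut: every piece boundary after the new file shifts by 16 bytes.
old_concat = piece_hashes(b"".join(files))
new_concat = piece_hashes(tiny + b"".join(files))

# Per-file hashing: each file is cut on its own, so old files are unaffected.
old_per_file = set().union(*(piece_hashes(f) for f in files))
new_per_file = set().union(*(piece_hashes(f) for f in [tiny] + files))

shared_concat = old_concat & new_concat
shared_per_file = old_per_file & new_per_file

print(len(shared_concat))               # 0 pieces in common
print(shared_per_file == old_per_file)  # True: every old piece is reused
```

With repetitive data some pieces could still coincide by chance, which is why the original claim is only "a good chance".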

Update: There is apparently a "BitTorrent v2" protocol [1][2], which hashes each file separately and so allows identical files to be shared across torrents. It is still not implemented in major clients, like libtorrent [3].

[1] http://bittorrent.org/beps/bep_0052.html

[2] https://news.ycombinator.com/item?id=14951728

[3] https://github.com/arvidn/libtorrent/issues/2197

Not quite. They would need to use the same file and the same chunker. Two people can add the same identical file and share zero hashes. Files are broken down into chunks and those chunks are hashed into a Merkle tree.
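To illustrate with a toy example (fixed-size chunking only, with made-up chunk sizes of 256 and 300 bytes; `chunk_hashes` is a hypothetical helper, not an IPFS API):

```python
import hashlib


def chunk_hashes(data: bytes, size: int) -> set:
    """Split data into fixed-size chunks and return the set of chunk hashes."""
    return {
        hashlib.sha256(data[i:i + size]).hexdigest()
        for i in range(0, len(data), size)
    }


# The same 4096-byte file, added by three people.
data = hashlib.shake_256(b"same file").digest(4096)

alice = chunk_hashes(data, 256)  # chunker setting A
bob = chunk_hashes(data, 300)    # chunker setting B
carol = chunk_hashes(data, 256)  # same setting as Alice

print(len(alice & bob))  # 0: identical file, no chunk hashes in common
print(alice == carol)    # True: same file and same chunker
```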
I was at a place that tried to distribute their files via BitTorrent. I wasn't there for the initial implementation, but I dealt with it once it was in use.

The data was immutable, so we didn't have that use-case. The tracker software we were using (one of the often used open source C++ ones) seemed to handle a couple hundred torrents just fine, but couldn't handle tens of thousands. Even if only a few were active. I'm not sure if it was excessive RAM or high CPU, but they built a wrapper tool to expire and re-add torrents as needed. I think technically it was limiting the number of seeds (from the central server) for different torrents.

There was also a lot of time/overhead in initiating a new download. This was exacerbated by the kludge mentioned above. Client would add the torrent, you would trigger a re-seed, then the client would wait awhile before checking again and finding the seed. Often this dance took much longer than the download itself.

Think of torrents that you can update. You have the magnet link for the version you are downloading, but you also have the magnet link for the current version, so the uploader can update at any time and you would receive the update. If both versions share some pieces, people can share them across both torrents, and with any other torrent that happens to have a piece with the same hash.
I would love to try that. Can you share a quick and easy breakdown of how I would publish that?
ipfs add -r mydir/

(Add a file to mydir/)

ipfs add -r mydir/

That's it, two different hashes, different contents, but intelligently deduplicated so you only need to download the diff if you already have the files in the former.
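A rough model of "only download the diff", assuming fixed 256-byte blocks and ignoring the directory and file metadata blocks that a real node would also fetch. `blocks` and `dir_blocks` are made-up helpers.

```python
import hashlib
from typing import Dict

BLOCK = 256  # toy block size; an assumption for the demo


def blocks(data: bytes) -> Dict[str, bytes]:
    """Map block hash -> block bytes for one file."""
    return {
        hashlib.sha256(data[i:i + BLOCK]).hexdigest(): data[i:i + BLOCK]
        for i in range(0, len(data), BLOCK)
    }


def dir_blocks(files: Dict[str, bytes]) -> Dict[str, bytes]:
    """All data blocks of every file in a directory, keyed by hash."""
    store: Dict[str, bytes] = {}
    for data in files.values():
        store.update(blocks(data))
    return store


v1 = {
    "a.bin": hashlib.shake_256(b"a").digest(1024),
    "b.bin": hashlib.shake_256(b"b").digest(2048),
}
v2 = {**v1, "c.bin": hashlib.shake_256(b"c").digest(512)}  # one file added

local = dir_blocks(v1)   # what a peer holding the first version already has
wanted = dir_blocks(v2)  # what the second version consists of
missing = {h for h in wanted if h not in local}

print(len(wanted), len(missing))  # 14 2: only the new file's blocks are fetched
```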

That's not much different than changing my files, making a new torrent file with a small blocksize and having users use that.

What is the benefit here?

That the old seeders can seed the new stuff without knowing about your new torrent.
Main feature is automatic data-sharing between distributions. With torrents, everything is siloed, and data is only exchanged between peers of that torrent. IPFS doesn't care /why/ you're getting information or the link you found it from, just that it can find it by its hash.

Say you distribute "Julie's Webcast Complete Series" and somebody else distributes "Julie's Webcast - Episode 3, with Russian subtitles," peers and seeders from both distributions can share data for the shared content. Similarly, updating a dataset only requires downloading the new data.
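A sketch of the idea, assuming the episode is a separate, byte-identical file in both distributions (not re-muxed with the subtitles). The `providers` table and `announce` function are hypothetical stand-ins for the DHT's provider records; the peer names are made up.

```python
import hashlib
from collections import defaultdict
from typing import Dict

# content hash -> set of peers that announced they hold it
providers = defaultdict(set)


def announce(peer: str, files: Dict[str, bytes]) -> None:
    """Record that `peer` can serve each file, keyed only by content hash."""
    for data in files.values():
        providers[hashlib.sha256(data).hexdigest()].add(peer)


episode3 = b"episode 3 video bytes"

announce("complete-series-seeder", {
    "ep1.mp4": b"episode 1 video bytes",
    "ep3.mp4": episode3,
})
announce("russian-subs-seeder", {
    "ep3.mp4": episode3,
    "ep3.ru.srt": b"russian subtitles",
})

ep3_hash = hashlib.sha256(episode3).hexdigest()
print(sorted(providers[ep3_hash]))
# ['complete-series-seeder', 'russian-subs-seeder']
```

The lookup never asks which distribution a peer belongs to, only who has the hash.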

This is done automatically, through both per-file hashing and (optionally; I'm not sure of the current state) in-file block hashing.

> Julie's Webcast - Episode 3, with Russian subtitles

> peers and seeders from both distributions can share data for the shared content

So does IPFS have "plugins" for different archive/container formats so it can "see" that the underlying video/audio streams are identical between "Julie's Webcast - Episode 3.mp4" and "Julie's Webcast - Episode 3, with Russian subtitles.mkv"?

Otherwise container stream interleaving will play holy hell with any sort of "dumb" block hashing :(

Last I checked it was dumb. Possibly breaking block boundaries based on a rolling hash.
https://github.com/ipfs/go-ipfs-chunker

> go-ipfs-chunker provides the Splitter interface. IPFS splitters read data from a reader an create "chunks". These chunks are used to build the ipfs DAGs (Merkle Tree) and are the base unit to obtain the sums that ipfs uses to address content.

> The package provides a SizeSplitter which creates chunks of equal size and it is used by default in most cases, and a rabin fingerprint chunker. This chunker will attempt to split data in a way that the resulting blocks are the same when the data has repetitive patterns, thus optimizing the resulting DAGs.
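A toy comparison of the two approaches. This is not go-ipfs-chunker's code and not a Rabin fingerprint: it uses a plain polynomial rolling hash, and the window, mask, and block size are made-up parameters. It shows why content-defined boundaries survive an insertion at the front while fixed-size boundaries do not.

```python
import hashlib
from typing import List

WINDOW = 16     # bytes covered by the rolling hash (assumed)
MASK = 0x3F     # cut when the low 6 bits are zero, about every 64 bytes
BASE = 1000003  # multiplier for the polynomial hash (assumed)
MOD = 1 << 32


def fixed_chunks(data: bytes, size: int = 256) -> List[bytes]:
    """Equal-size chunks, in the spirit of a size splitter."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def rolling_chunks(data: bytes) -> List[bytes]:
    """Cut wherever the hash of the last WINDOW bytes matches the mask."""
    chunks, start, h = [], 0, 0
    top = pow(BASE, WINDOW - 1, MOD)  # weight of the byte leaving the window
    for i, byte in enumerate(data):
        if i >= WINDOW:
            h = (h - data[i - WINDOW] * top) % MOD  # drop the oldest byte
        h = (h * BASE + byte) % MOD                 # take in the new byte
        if i >= WINDOW - 1 and (h & MASK) == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def hashes(chunks: List[bytes]) -> set:
    return {hashlib.sha256(c).hexdigest() for c in chunks}


old = hashlib.shake_256(b"payload").digest(8192)
new = b"x" * 16 + old  # same data with 16 bytes inserted at the front

fixed_shared = hashes(fixed_chunks(old)) & hashes(fixed_chunks(new))
rolling_lost = hashes(rolling_chunks(old)) - hashes(rolling_chunks(new))

print(len(fixed_shared))  # 0: every fixed-size block shifted
print(len(rolling_lost))  # at most 1: only the chunk touching the insertion
```

Because each boundary depends only on the preceding 16 bytes, every chunk after the first cut point comes out identical in both versions.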

I think they should use the rolling-hash-based chunking by default.

https://github.com/ipfs/go-ipfs-chunker/issues/13

this is an implementation detail of the DHT client. if you have enough cooperating bit torrent clients set up to seed a sparse swarm like IPFS does, you could do the same thing.

which begs the question, why fork the DHT in the first place? there are BEP drafts that cover all of the features that IPFS (and DAT for that matter) bring to the table.

my guess: there isn't a lot of money in making yet another bit torrent client.

Yeah but you can't add an episode to the torrent later and have all existing peers seed the old episodes in the new torrent automatically.
you can only do that with IPFS keys that are aliased with IPNS, which is equivalent to a BEP-46 mutable DHT key.

http://www.bittorrent.org/beps/bep_0046.html
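The shared idea behind both is a stable name that points at whichever immutable hash is current. A minimal sketch, with loud assumptions: `Record` and `Resolver` are invented for illustration, the `name` string stands in for a public key, and signature verification (which real mutable records rely on) is left out entirely. Only the "higher sequence number wins" rule is modeled.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Record:
    name: str    # stable identifier (stand-in for a public key)
    seq: int     # sequence number, bumped by the publisher on each update
    target: str  # content hash the name currently points to


class Resolver:
    """Keeps the newest record per name; signature checks are omitted."""

    def __init__(self) -> None:
        self._latest: Dict[str, Record] = {}

    def publish(self, record: Record) -> bool:
        current = self._latest.get(record.name)
        if current is not None and record.seq <= current.seq:
            return False  # stale or replayed update, ignore it
        self._latest[record.name] = record
        return True

    def resolve(self, name: str) -> Optional[str]:
        record = self._latest.get(name)
        return record.target if record else None


resolver = Resolver()
resolver.publish(Record("my-web-show", 1, "hash-of-season-1-folder"))
resolver.publish(Record("my-web-show", 2, "hash-of-folder-with-new-episode"))
stale_accepted = resolver.publish(Record("my-web-show", 1, "hash-of-season-1-folder"))

print(resolver.resolve("my-web-show"))  # hash-of-folder-with-new-episode
print(stale_accepted)                   # False
```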

Websites?