by no_time 990 days ago
>If you listen to BitTorrent's DHT network, you'll build an index of everything shared on BitTorrent (over time),

Correct me if I'm wrong, but as far as I understand, passively listening on the DHT only gives you a list of infohashes of everything shared on BitTorrent. You'd actually have to reach out to your DHT peers to know what files the infohashes represent.

Wrapping back to grandparent's question of

>Also what happens if illegal content gets scooped up into the index?

I think this could get dicey if someone announces something very illegal like CP, and your crawler starts asking every peer that announced the infohash about its contents with this[0] protocol. This would put your IP into a pretty awful exclusive club of:

A, other crawlers

B, actual people wanting to download said CP

[0]: https://www.bittorrent.org/beps/bep_0009.html

2 comments

> Correct me if I'm wrong, but as far as I understand, passively listening on the DHT only gives you a list of infohashes of everything shared on BitTorrent. You'd actually have to reach out to your DHT peers to know what files the infohashes represent.

Yes, you're correct! I should have stated that: you still need to resolve the metadata from peers that have the torrent for a given infohash. That's a separate operation from downloading the file's content.
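To make the "infohashes only" point concrete, here is a rough sketch of what a passive listener can pull out of an incoming DHT query. It assumes the KRPC message layout from BEP 5; the function names `bdecode` and `infohash_from_query` are made up for illustration, and the sample packet is the `get_peers` example from that spec. Note that nothing in the message names a file.

```python
def bdecode(data: bytes, i: int = 0):
    """Decode one bencoded value at offset i; return (value, next_offset)."""
    c = data[i:i + 1]
    if c == b"i":                      # integer: i<digits>e
        end = data.index(b"e", i)
        return int(data[i + 1:end]), end + 1
    if c == b"l":                      # list: l<items>e
        i += 1
        items = []
        while data[i:i + 1] != b"e":
            item, i = bdecode(data, i)
            items.append(item)
        return items, i + 1
    if c == b"d":                      # dict: d<key><value>...e
        i += 1
        result = {}
        while data[i:i + 1] != b"e":
            key, i = bdecode(data, i)
            value, i = bdecode(data, i)
            result[key] = value
        return result, i + 1
    if c.isdigit():                    # byte string: <length>:<bytes>
        colon = data.index(b":", i)
        length = int(data[i:colon])
        start = colon + 1
        return data[start:start + length], start + length
    raise ValueError(f"invalid bencode at offset {i}")


def infohash_from_query(packet: bytes):
    """Return the 20-byte infohash from a get_peers/announce_peer query, else None."""
    try:
        msg, _ = bdecode(packet)
    except (ValueError, TypeError, RecursionError):
        return None                    # malformed packet: ignore it
    if not isinstance(msg, dict) or msg.get(b"y") != b"q":
        return None                    # not a query
    if msg.get(b"q") not in (b"get_peers", b"announce_peer"):
        return None
    args = msg.get(b"a")
    if not isinstance(args, dict):
        return None
    infohash = args.get(b"info_hash")
    if isinstance(infohash, bytes) and len(infohash) == 20:
        return infohash
    return None


# get_peers example packet from BEP 5
sample = (b"d1:ad2:id20:abcdefghij01234567899:info_hash20:mnopqrstuvwxyz123456e"
          b"1:q9:get_peers1:t2:aa1:y1:qe")
print(infohash_from_query(sample).hex())
```

All you end up with is 20 opaque bytes per torrent. Turning those into file names means connecting to peers and asking for the metadata, which is the separate step mentioned above.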

> Would this get hashes of items shared on private trackers too?

No, because private trackers require that all uploaded torrents have DHT, PEX, and LPD disabled. This is usually done with a single tickbox that says “Make torrent private” in the client.

Of course, respecting these options in the torrent file is still up to the client. This is one of the reasons why all private trackers have a client whitelist too.
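For reference, as far as I know that tickbox just sets `private=1` inside the torrent's info dictionary (BEP 27). A minimal sketch, with made-up field values and hypothetical helper names (`bencode`, `infohash`, `is_private`), showing that the flag is part of what gets hashed into the infohash and that honoring it is just a check the client chooses to do:

```python
import hashlib


def bencode(value) -> bytes:
    """Bencode ints, byte strings, lists, and dicts with bytes keys."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Keys must be emitted in sorted order.
        body = b"".join(bencode(k) + bencode(value[k]) for k in sorted(value))
        return b"d" + body + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def infohash(info: dict) -> bytes:
    """SHA-1 of the bencoded info dictionary (v1 torrents)."""
    return hashlib.sha1(bencode(info)).digest()


def is_private(info: dict) -> bool:
    """True if the info dictionary carries the private flag."""
    return info.get(b"private") == 1


# Illustrative info dict; the values are placeholders, not a real torrent.
public_info = {
    b"name": b"example.txt",
    b"length": 1024,
    b"piece length": 16384,
    b"pieces": b"\x00" * 20,
}
private_info = {**public_info, b"private": 1}

print(is_private(public_info), is_private(private_info))
print(infohash(public_info).hex())
print(infohash(private_info).hex())
```

Since the flag lives inside the hashed dictionary, stripping it gives you a different infohash and therefore a different swarm. What the protocol can't do is stop a client from ignoring the flag and announcing to the DHT anyway, hence the whitelists.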