

by retonato 1331 days ago

If, despite all this, you still decide to go forward, don't forget to import the torrent data that is already available on the internet (there are dozens of dumps scattered around). That way you will have a MUCH larger database than DHT scraping alone would give you. This is a good place to start: https://archive.org/details/torrent_metadata_archive_sample
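A rough sketch of what that import could look like, assuming each dump and the DHT scraper can be read as a stream of dicts with an `infohash` field (40 hex characters, i.e. a v1 SHA-1 hash). The field names `name`, `uploaded` and `first_seen` and the function names are made up for illustration; real dumps come in all kinds of formats and need their own parsers.

```python
from typing import Dict, Iterable, Optional

Record = Dict[str, Optional[str]]


def normalize_infohash(value: str) -> str:
    """Lower-case a hex infohash and reject anything that is not 40 hex chars."""
    h = value.strip().lower()
    if len(h) != 40 or any(c not in "0123456789abcdef" for c in h):
        raise ValueError(f"not a 40-character hex infohash: {value!r}")
    return h


def merge_sources(*sources: Iterable[Record]) -> Dict[str, Record]:
    """Merge records from several sources, keyed by infohash.

    Earlier sources win: a field is only filled in from a later source
    when it is still missing (absent or None) in the merged entry.
    """
    merged: Dict[str, Record] = {}
    for source in sources:
        for record in source:
            key = normalize_infohash(record["infohash"])
            entry = merged.setdefault(key, {"infohash": key})
            for field, value in record.items():
                if field == "infohash" or value is None:
                    continue
                if entry.get(field) is None:
                    entry[field] = value
    return merged


# Hypothetical sample data: one record from a dump, one from DHT scraping.
dump = [{"infohash": "A" * 40, "name": "example", "uploaded": "2015-01-01"}]
dht = [{"infohash": "a" * 40, "name": None, "first_seen": "2021-03-01"}]
merged = merge_sources(dump, dht)
```

Putting the dumps first means their historical fields are kept, and the DHT records only add what the dumps lack.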

Creators of such sites (including me) tend to focus too much on the total number of torrents, regardless of whether they are active or dead. Regular users are mainly interested in active torrents.

Plus they want to see the current number of seeders/leechers, which is very difficult to keep up-to-date for a large database.

Plus they want to see a torrent's creation/upload date, which you cannot get from DHT (you can record the day you found a torrent, but that only works for newer torrents, not for historical ones).
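Recording the day you found a torrent is simple enough. A minimal sketch with SQLite, assuming a single `torrents` table keyed by infohash (the table layout and the function names are my own, not from any existing project):

```python
import sqlite3
from datetime import date


def open_index(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the index database with a first_seen column."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS torrents ("
        " infohash TEXT PRIMARY KEY,"
        " name TEXT,"
        " first_seen TEXT NOT NULL)"
    )
    return db


def record_sighting(db: sqlite3.Connection, infohash: str, name: str,
                    seen_on: date) -> str:
    """Store the torrent if it is new and return its stored first_seen date.

    INSERT OR IGNORE leaves an existing row untouched, so the date of the
    first sighting is never overwritten by later sightings.
    """
    key = infohash.lower()
    db.execute(
        "INSERT OR IGNORE INTO torrents (infohash, name, first_seen)"
        " VALUES (?, ?, ?)",
        (key, name, seen_on.isoformat()),
    )
    db.commit()
    row = db.execute(
        "SELECT first_seen FROM torrents WHERE infohash = ?", (key,)
    ).fetchone()
    return row[0]


db = open_index()
first = record_sighting(db, "AB" * 20, "example", date(2021, 3, 1))
again = record_sighting(db, "ab" * 20, "example", date(2022, 1, 1))
```

This only tells you when your crawler first saw the torrent. For anything older than your crawler, the date has to come from somewhere else, such as the imported dumps.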


retonato 1331 days ago

Of course, you can just provide code for users to run on their own computers, but don't expect that anyone will really use it (maybe just a few people here and there, and I really mean it). Anyone hardcore enough to run something on their own computer to find torrents will just use Jackett (https://github.com/Jackett/Jackett). It can search through a huge number of torrents, far more than any local DHT scraper/search engine can provide.
