Hacker News
by input_sh 1004 days ago
Mirroring libgen is definitely within reach: it's "just" 50 or so terabytes, with torrents freely available for bulk downloading.

Realistically, maybe only 10% of that is actually useful, but getting to that 10% is gonna be very labour-intensive. You'd have to do a lot of cleanup: different formats, duplicate uploads, different editions of the same book, scanned PDFs, and whatnot. Meanwhile, big players with their own ebook stores (Amazon, Google, Apple) already have proper metadata, a common format to work with, and far fewer duplicates.
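
To give a rough idea of what the duplicate-upload part of that cleanup could look like, here is a minimal Python sketch. It assumes each upload is available as a (title, author, file bytes) tuple. That record shape and the function names are made up for illustration and are not libgen's actual schema. It drops byte-identical files by SHA-256 and then groups what is left by normalized title and author, so a human (or a smarter tool) can review likely duplicates and editions.

```python
import hashlib
import re
import unicodedata
from collections import defaultdict


def content_hash(data: bytes) -> str:
    """Hex SHA-256 of the file contents; identical files share a digest."""
    return hashlib.sha256(data).hexdigest()


def normalize(text: str) -> str:
    """Lowercase, strip accents, and collapse punctuation/whitespace runs."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9]+", " ", no_accents.lower()).strip()


def group_candidates(records):
    """Group uploads that probably belong to the same book.

    `records` is an iterable of (title, author, data) tuples (hypothetical
    shape). Byte-identical uploads are kept once; the rest are grouped by
    (normalized title, normalized author). Returns {key: [digest, ...]}.
    """
    seen = set()
    groups = defaultdict(list)
    for title, author, data in records:
        digest = content_hash(data)
        if digest in seen:
            continue  # exact duplicate upload, skip it
        seen.add(digest)
        groups[(normalize(title), normalize(author))].append(digest)
    return dict(groups)
```

This only catches the easy cases. Scanned PDFs of the same book, or the same text in EPUB and MOBI, hash differently and may carry inconsistent titles, which is where the labour-intensive part comes in.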

1 comment

Isn't there some kind of standard for publication metadata? One that would let you uniquely identify a publication and then track its different editions as children of the "original" publication? If not, maybe we should create one and make it freely available?
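
Just to make the idea concrete, something like the following Python sketch is what I have in mind. It is not an existing standard: the `Work` and `Edition` names, the fields, and the ID scheme are all assumptions made up for illustration. The "original" publication gets a unique ID, and each edition is a child whose ID is derived from the parent's.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Edition:
    edition_id: str   # derived from the parent work's ID
    label: str        # e.g. "2nd edition" (free-form, hypothetical field)
    year: int
    file_format: str  # e.g. "pdf", "epub"


@dataclass
class Work:
    work_id: str      # unique identifier for the "original" publication
    title: str
    authors: list[str]
    editions: list[Edition] = field(default_factory=list)

    def add_edition(self, label: str, year: int, file_format: str) -> Edition:
        """Attach a new edition as a child of this work and return it."""
        edition_id = f"{self.work_id}/{len(self.editions) + 1}"
        edition = Edition(edition_id, label, year, file_format)
        self.editions.append(edition)
        return edition


def new_work(title: str, authors: list[str]) -> Work:
    """Create a work with a random unique ID (UUID4 is an arbitrary choice)."""
    return Work(uuid.uuid4().hex, title, list(authors))
```

Usage would be along the lines of `work = new_work("Some Title", ["Some Author"])` followed by `work.add_edition("1st edition", 1990, "pdf")`, after which every file can be tagged with the edition ID and traced back to its parent work.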