Hacker News
by Booktrope 1582 days ago
Still the supreme source of free public domain ebooks

https://www.gutenberg.org/

1 comments

The quality of transcription on Gutenberg is rough, especially for older transcriptions. Standard Ebooks editions are much higher quality, but the selection is limited because of how much effort each one takes to produce.

I processed two books from Gutenberg for SE (The Devil's Dictionary and a shorter sci-fi novel), and both took quite a bit of work to bang into shape: about half the work was metadata enhancement, and half was proofreading and correcting.

EDIT: After comparing, it's definitely just the raw Gutenberg scan with formatting. You can see a big batch of fixed typos that weren't applied here: https://github.com/standardebooks/ambrose-bierce_the-devils-...

Perhaps, but Gutenberg has been adding a huge amount of good-quality content lately, thanks to their Distributed Proofreaders community. Older content will soon be a small fraction of the total, and much of it will be picked up and updated to current standards.
The DP stuff is better, for sure. The first book I did was DP and it was far less 'buggy' than the older one. The issues were mostly formatting.