by thrdbndndn 1353 days ago
This is probably a total digression, but I wish IA would organize their digital library slightly better.

One day I was checking some manga books by ISBN on IA just out of curiosity. And for some reason, it put the ISBNs for all the volumes of a manga into one single entry (https://archive.org/details/isbn_1919979003907, check "ISBN" metadata section) and unsurprisingly, the actual content is only one volume, vol.43 (not even vol.1!). I have a feeling other volumes may exist somewhere there, but there is no way to search for them.

This isn't a one-off occurrence either; it reflects my experience of trying to find specific items there, especially non-English books.
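For anyone who wants to spot entries like this programmatically, here is a rough sketch. It assumes the record has the shape returned by the public `https://archive.org/metadata/<identifier>` endpoint, where a field with one value is a string and a field with several values is a list. The helper names and the threshold of two ISBNs (one ISBN-10 plus one ISBN-13) are my own assumptions, not anything IA defines.

```python
from typing import Any


def isbns_of(record: dict) -> list:
    """Return the ISBNs listed in an item's metadata as a list of strings.

    `record` is assumed to look like the JSON from archive.org/metadata/<id>:
    a dict with a "metadata" key whose "isbn" field is a string or a list.
    """
    value: Any = record.get("metadata", {}).get("isbn", [])
    if isinstance(value, str):
        return [value]  # single-valued field
    return [str(v) for v in value]  # multi-valued field


def looks_merged(record: dict, max_isbns: int = 2) -> bool:
    """Heuristic: more ISBNs than one book plausibly has (arbitrary cutoff)."""
    return len(isbns_of(record)) > max_isbns
```

Fetching the JSON for a batch of identifiers and passing each record through `looks_merged` would give a list of candidates to look at by hand; it would not tell you where the other volumes ended up.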

4 comments

On a given day I'm moving tens of thousands of items around to make them easier to find. I'm sure I'll get to your section sooner or later.
Are you involved with IA? I'm actually really interested in what your day to day looks like, could you share more?
Jason's day-to-day is pretty well covered in his Twitter account: https://twitter.com/textfiles
textfiles is Jason Scott[0]

[0] https://en.wikipedia.org/wiki/Jason_Scott

And since we are there, K. Savetz (submitter) is "manager of special collections at Internet Archive".
Thank you for your service.
Every day is a joy.
A lot of the time the metadata accuracy is up to the original uploader. IA's upload system doesn't magically fill in all the metadata details for an item.
It also doesn't allow others to update metadata or even submit changes for review.

Wikidata has a property for Internet Archive ID, so it wouldn't be conceptually hard to construct a parallel metadata store there, but it would involve hundreds of millions of triples so it's definitely "hard" in other senses.
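To make the lookup side of that concrete, here is a minimal sketch that builds a SPARQL query for the Wikidata item linked to a given Internet Archive identifier. It assumes the property in question is P724 (worth double-checking on Wikidata before relying on it), and `wikidata_query_for` is a name I made up; the function only builds the query string and does not send it anywhere.

```python
def wikidata_query_for(identifier: str) -> str:
    """Build a SPARQL query finding Wikidata items with this IA identifier.

    Assumes P724 is the "Internet Archive ID" property.
    """
    # Escape backslashes and double quotes so the identifier stays a valid
    # SPARQL string literal.
    escaped = identifier.replace("\\", "\\\\").replace('"', '\\"')
    return f'SELECT ?item WHERE {{ ?item wdt:P724 "{escaped}" . }}'


print(wikidata_query_for("isbn_1919979003907"))
```

The resulting string could be pasted into the Wikidata Query Service. Writing corrected metadata back as statements is the part that runs into the hundreds-of-millions-of-triples problem.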

While I also wish the Archive were more precise (e.g. in the "Author" and "Year of publication" fields), I suggest that you check their RSS feeds to see how staggeringly high the rate of uploads is. That the uploading is "frenetic" (in a good way, of course) reveals where the focus is. Re-assessing and fixing the records would probably need a parallel team.

I would gladly help with that. I have never checked, but maybe one can volunteer.

I agree. I have wondered how successful and easy it would be to create a "front end" site that does a better job of searching and organizing archive.org.
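As a starting point for such a front end, here is a small sketch that builds a query URL for archive.org's advanced search endpoint. The parameter names (`q`, `fl[]`, `rows`, `output`) are the ones I understand that endpoint to accept; the `search_url` helper and its defaults are my own invention, and the sketch only constructs the URL without fetching anything.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://archive.org/advancedsearch.php"


def search_url(query: str, fields=("identifier", "title"), rows: int = 50) -> str:
    """Build an advanced-search URL that asks for JSON results.

    `query` uses the site's Lucene-style syntax, e.g. "title:manga".
    `fields` lists the metadata fields to return for each hit.
    """
    params = [("q", query)]
    params += [("fl[]", field) for field in fields]  # one fl[] per field
    params += [("rows", str(rows)), ("output", "json")]
    return SEARCH_ENDPOINT + "?" + urlencode(params)


print(search_url("title:manga", fields=("identifier", "title", "isbn"), rows=5))
```

A front end would still be limited by whatever metadata the uploader supplied, so better search alone would not untangle entries whose fields are wrong to begin with.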