by wayathr0w 986 days ago
If you liked the comment-length analysis of OCLC & want more, there's a whole essay on the subject. [1]

>But one of the ironies of the scraping is that it's not going to be immediately helpful to the libraries who are unable to afford to participate in Worldcat. This is because the scrape didn't (and quite possibly never could have) capture the data in MARC format, which is what most library catalog software uses. While MARC records could be cross-walked from the JSON, they will undoubtedly omit some data elements found in the original MARC.

While it would have been ideal to get all the data in MARC & as many other formats as possible, I wonder how true this is worldwide - many libraries don't use MARC or have a digital catalog at all. Maybe there are some ways the data could be processed that make it easier to integrate into such places, but of course local needs/desires will vary widely.
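
To make the "cross-walked from the JSON" idea concrete, here's a rough sketch of what such a mapping could look like. The JSON key names and the sample record are made up for illustration (I don't know what the scraped records actually look like); the tags (020, 100, 245, 260) are standard MARC 21 bibliographic fields. It only produces a text listing of fields, not a real binary MARC file.

```python
# Hypothetical mapping from scraped JSON keys to (MARC tag, subfield code).
# The JSON key names are assumptions; the tags are standard MARC 21 fields.
CROSSWALK = {
    "title": ("245", "a"),      # title statement
    "author": ("100", "a"),     # main entry, personal name
    "isbn": ("020", "a"),       # ISBN
    "publisher": ("260", "b"),  # publication info: publisher name
    "year": ("260", "c"),       # publication info: date
}


def crosswalk(record):
    """Map one JSON record (a dict) to MARC-style fields.

    Returns (fields, unmapped), where fields maps a tag to a list of
    (subfield code, value) pairs and unmapped lists the JSON keys that
    the mapping does not know how to place.
    """
    fields = {}
    unmapped = []
    for key, value in record.items():
        if key not in CROSSWALK:
            unmapped.append(key)
            continue
        tag, code = CROSSWALK[key]
        fields.setdefault(tag, []).append((code, str(value)))
    return fields, sorted(unmapped)


def to_text(fields):
    """Render fields as one line per tag, e.g. '245 $aSome title'."""
    lines = []
    for tag in sorted(fields):
        subfields = "".join(f"${code}{value}" for code, value in sorted(fields[tag]))
        lines.append(f"{tag} {subfields}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up sample record, not real scraped data.
    sample = {
        "title": "An example title",
        "author": "Doe, Jane",
        "publisher": "Example Press",
        "year": 2008,
        "holdings_count": 12,
    }
    fields, unmapped = crosswalk(sample)
    print(to_text(fields))
    print("not mapped:", unmapped)
```

The `unmapped` list only shows JSON keys this toy mapping drops. The loss the quote describes runs the other way: anything that was in the original MARC but never made it into the JSON can't be recovered by any crosswalk.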

[1] https://core.ac.uk/download/pdf/11883899.pdf - it was also published in this book: https://archive.org/details/radicalcatalogin0000unse

1 comment

> While it would have been ideal to get all the data in MARC & as many other formats as possible, I wonder how true this is worldwide - many libraries don't use MARC or have a digital catalog at all. Maybe there are some ways the data could be processed that make it easier to integrate into such places, but of course local needs/desires will vary widely.

Indeed, MARC is not universal (and for that matter, it wouldn't surprise me if at this point the majority of records in Worldcat were _not_ derived from MARC sources), and there are certainly non-MARC library catalog platforms out there. That said, as the growth of Koha shows, for better or worse MARC has become close to a global baseline for a lot of libraries.

Worse, definitely worse.