by bnewbold
1501 days ago
Hi, I'm the maintainer of scholar.archive.org. It looks like we are missing a bunch of your public papers, such as those published here:
http://park.itc.u-tokyo.ac.jp/eigo/publication_en.html
Both Unpaywall and scholar.archive.org work best with papers that have persistent identifiers like DOIs, PMIDs, DOAJ article IDs, or dblp records. Unpaywall currently works with Crossref DOIs exclusively. With scholar.archive.org (and fatcat.wiki, the backing catalog), it is possible to submit metadata records directly, but it can be laborious, and it would be better for everybody if this process were automated. Processing OAI-PMH feeds or extracting bibliographic metadata from HTML metadata would probably improve our coverage, and we are hoping to roll out that kind of scraping eventually. But it has been a challenge to clean and de-duplicate metadata at that scale.
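To make the HTML-metadata idea concrete, here is a rough sketch of what that kind of extraction could look like. It is only an illustration, not how scholar.archive.org actually works: it assumes a page exposes the `citation_*` meta tags that Google Scholar documents (`citation_title`, `citation_author`, and so on), and the names `CitationMetaParser` and `extract_citation_meta` are made up for this example. It uses only the Python standard library.

```python
from html.parser import HTMLParser


class CitationMetaParser(HTMLParser):
    """Collects <meta name="citation_*" content="..."> tags from a page."""

    def __init__(self):
        super().__init__()
        # Maps a tag name such as "citation_author" to a list of values,
        # because some fields (authors, for example) may repeat.
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        content = attr.get("content")
        if name.startswith("citation_") and content:
            self.fields.setdefault(name, []).append(content.strip())


def extract_citation_meta(html_text):
    """Return a dict of citation_* meta fields found in html_text."""
    parser = CitationMetaParser()
    parser.feed(html_text)
    return parser.fields
```

A page with no such tags simply yields an empty dict, which is the case for many small society sites; the cleaning and de-duplication of whatever does come back is the harder part and is not attempted here.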
Just for reference for anyone else reading this, here is an excerpt from an e-mail I sent you in March 2021, after IA Scholar was first mentioned on HN:
“I contacted the people at [a large Japanese academic library]. ... I showed them your HN post [1] about the data you've already collected through J-STAGE, and, contrary to my own impression, they said you have probably already captured most of the metadata for Japanese academic journals that would be easily available. They also pointed out that J-STAGE includes a fair amount of publications from the humanities side of things, also contrary to my own impression.
“The main sticking point, they said, is journals that are published by universities or academic societies and have not been listed on J-STAGE. Many of those journals have never been digitized, they said, and those that are available in digital form are likely to be available only on those universities’ or societies’ individual websites. The library people didn’t know of any aggregators or indexes for such sites. The only way to find them, they suggested, would be for someone to hunt for the sites by hand.
“Over the years, I myself have been involved with the publication of several such journals and have set up websites for a couple, too. The ones published by departments at [a particular Japanese university] are included in [the university’s online repository] but not yet, it seems, on J-STAGE. A couple published by small academic societies are available only on those societies' websites. [Addendum: The Japanese academic societies I have been involved with, mostly in the humanities, would have difficulty getting DOIs or other persistent identifiers for the papers they publish; it would take some effort even to convince them of the necessity. They are volunteer-run organizations, and just maintaining their websites is often a challenge for them.]
“Yet another impression of mine (also perhaps wrong) is that a higher percentage of academic research in Japan is published through such journals than in the U.S. It would be very valuable to have that research findable through IA Scholar, but the barriers to collecting it seem high.”
[1] https://news.ycombinator.com/item?id=26408897