by xaa 4214 days ago
The data needed for searching (titles, abstracts, and other metadata), at least for biology-related articles, is already available as an XML dump from NIH. The public interface to it is PubMed, so you can already search for Nature articles there, just not within their full text.
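
For what it's worth, here is a rough sketch of that kind of metadata search done programmatically. It assumes NCBI's E-utilities `esearch` endpoint with JSON output, which is not something I mentioned above, and the function names are made up for illustration. The `[ta]` and `[tiab]` tags restrict the query to the journal name and to title/abstract.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the NCBI E-utilities esearch endpoint, queried against PubMed.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_search_url(journal, words, retmax=20):
    """Build an esearch URL for `words` in titles/abstracts of one journal."""
    # [ta] = journal field, [tiab] = title/abstract field (no full text).
    term = f'"{journal}"[ta] AND {words}[tiab]'
    params = {"db": "pubmed", "term": term, "retmode": "json", "retmax": retmax}
    return ESEARCH + "?" + urlencode(params)


def extract_pmids(payload):
    """Pull the list of PubMed IDs out of a decoded esearch JSON response."""
    return payload["esearchresult"]["idlist"]


def search(journal, words, retmax=20):
    """Run the query over the network and return matching PubMed IDs."""
    with urlopen(build_search_url(journal, words, retmax)) as resp:
        return extract_pmids(json.load(resp))
```

Something like `search("Nature", "CRISPR")` would return PubMed IDs, which you would still have to resolve to records separately. It only ever sees metadata, which is the limitation I mean.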

I have actually tried to do what you are suggesting, so that I could bulk-download PDFs (I have an institutional subscription). The problem is that the URL scheme is different for every journal, and some sites use JavaScript that makes automated downloading hard. Since there are thousands of journals, it's very tedious to come up with a generalized solution.
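
To give a feel for why this doesn't generalize, here is a minimal sketch of the per-journal approach. The journal keys and URL patterns are entirely hypothetical (placeholder `.example` domains), not real publisher layouts. The point is that every journal needs its own hand-written entry.

```python
from typing import Callable, Dict, Optional

# Hypothetical URL patterns: each journal needs its own hand-written rule.
PDF_URL_BUILDERS: Dict[str, Callable[[str], str]] = {
    "journal-a": lambda article_id: f"https://journal-a.example/articles/{article_id}.pdf",
    "journal-b": lambda article_id: f"https://journal-b.example/content/{article_id}/full.pdf",
}


def pdf_url(journal_key: str, article_id: str) -> Optional[str]:
    """Return the PDF URL for a known journal, or None if no rule exists."""
    builder = PDF_URL_BUILDERS.get(journal_key)
    return builder(article_id) if builder else None
```

With thousands of journals, that table is most of the work, and it does nothing for sites that only expose the PDF link after running JavaScript.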