Interesting. Tools like Zotero seem to have sorted out PDF fetching (and they can fetch metadata and abstracts even without institutional access to the PDF). Did you try building the fetching on top of that?
I meant for point 1. Zotero will accept a DOI or arXiv link (among others) and download the public metadata (authors, journal, abstract) for you, so you don't need to build anything for that part. The workflow would be: the AI cites a paper, you copy the DOI into Zotero, and you analyze the info Zotero returns.
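If you wanted to script that instead of pasting DOIs by hand, something like the rough sketch below might work. It assumes you run Zotero's separate translation-server locally on its default port (1969) and POST the identifier as plain text to its `/search` endpoint; I haven't wired this into your setup, and the helper names (`extract_doi`, `build_search_request`, `summarize`, `fetch_metadata`) are just made up for illustration.

```python
import json
import re
import urllib.request

# Assumption: a Zotero translation-server instance is running locally
# on its default port. Adjust if yours lives elsewhere.
TRANSLATION_SERVER = "http://127.0.0.1:1969"

# Loose DOI matcher; good enough for pulling a DOI out of a citation string.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/\S+")


def extract_doi(text):
    """Return the first DOI-looking string in `text`, or None."""
    match = DOI_PATTERN.search(text)
    if match is None:
        return None
    # Trim punctuation that often trails a DOI at the end of a sentence.
    return match.group(0).rstrip(".,;)")


def build_search_request(identifier, server=TRANSLATION_SERVER):
    """Build the POST request that asks the server to look up an identifier."""
    return urllib.request.Request(
        f"{server}/search",
        data=identifier.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


def summarize(item):
    """Keep only the fields worth comparing against what the AI cited."""
    authors = []
    for creator in item.get("creators", []):
        parts = [creator.get("firstName"), creator.get("lastName")]
        full = " ".join(p for p in parts if p)
        # Institutional authors use a single "name" field instead.
        authors.append(full or creator.get("name", ""))
    return {
        "title": item.get("title"),
        "authors": authors,
        "journal": item.get("publicationTitle"),
        "abstract": item.get("abstractNote"),
    }


def fetch_metadata(identifier):
    """Look up an identifier and return a list of summarized items."""
    request = build_search_request(identifier)
    with urllib.request.urlopen(request, timeout=30) as response:
        items = json.load(response)
    return [summarize(item) for item in items]
```

You'd call `fetch_metadata(extract_doi(citation_text))` and then compare the returned title, authors and abstract with what the AI claimed. It needs the server running and network access, and there's no error handling for DOIs that don't resolve.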