by matt1
905 days ago
Great site, thanks for sharing. Can you explain how you're determining how many times a paper is cited? Obviously papers include a list of references, but extracting them accurately from the PDF is difficult in my experience (two column formats, ugh) - though the new HTML versions help. And even if you have a list, many authors just mention arXiv paper titles, not their ids, making identifying specific references tricky.
I just extract the titles and look up their respective ids.
The real challenge was doing that at scale: in CS alone there are well over half a million papers.
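
For illustration, here is a minimal sketch of that kind of title-to-id lookup. It is not the site's actual code: the function names, the normalization rules, and the two sample papers are assumptions made up for the example. The idea is to normalize every known title once into a dictionary, so each extracted reference title becomes a constant-time lookup instead of a scan over half a million papers.

```python
import re
import unicodedata


def normalize_title(title: str) -> str:
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9]+", " ", text.lower())
    return text.strip()


def build_title_index(papers):
    """Map normalized title -> arXiv id (hypothetical helper).

    `papers` is an iterable of (arxiv_id, title) pairs. If two papers
    normalize to the same title, the first one seen wins.
    """
    index = {}
    for arxiv_id, title in papers:
        index.setdefault(normalize_title(title), arxiv_id)
    return index


def resolve_references(ref_titles, index):
    """Return the arXiv id for each reference title, or None if unknown."""
    return [index.get(normalize_title(t)) for t in ref_titles]


# Sample data for the example only.
papers = [
    ("1706.03762", "Attention Is All You Need"),
    ("1810.04805", "BERT: Pre-training of Deep Bidirectional "
                   "Transformers for Language Understanding"),
]
index = build_title_index(papers)
resolved = resolve_references(
    ["Attention is all you need.", "Some paper not on arXiv"], index
)
```

Exact matching on normalized titles misses references with typos or truncated titles, so a real pipeline would probably need a fuzzy fallback for the titles that return `None`.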