Hacker News
by cess11 406 days ago
You could just cut out the href values with grep and sed, or a bit of scripting; '.pdf' seems to occur only on those links.
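
For the "bit of scripting" route, something like the following might do. It's only a rough sketch: the page itself isn't shown here, so the sample HTML is made up, `extract_pdf_links` is a hypothetical name, and it assumes the hrefs are quoted. A regex is not a real HTML parser, but if '.pdf' really only shows up on those links it should be enough.

```python
import re

# Matches href="..." or href='...' where the value ends in .pdf,
# optionally followed by a query string or fragment. Case-insensitive.
HREF_PDF = re.compile(
    r'''href\s*=\s*["']([^"']*\.pdf(?:[?#][^"']*)?)["']''',
    re.IGNORECASE,
)


def extract_pdf_links(html: str) -> list[str]:
    """Return the href values that point at .pdf files, in document order."""
    return HREF_PDF.findall(html)


if __name__ == "__main__":
    # Invented sample markup, only for illustration.
    sample = (
        '<a href="/docs/a.pdf">A</a> '
        '<a href="/about.html">About</a> '
        "<a href='b.PDF'>B</a>"
    )
    for link in extract_pdf_links(sample):
        print(link)
```

Pipe the saved page into it, or call `extract_pdf_links` on whatever you fetched, and feed the resulting list to your downloader of choice.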

I'd keep it simple like that until I need to do periodic comparisons, i.e. until I actually need scrapers and am prepared to build what's needed to automatically watch and process the directories where the scrapers put the files.