by zarzavat
1411 days ago
Python is my workhorse when I need to scrape something from a site that is relaxed about scraping (most are). I have my own library of helper functions that I've built up over the years. In simple cases I just regex out what I need; if I need a full DOM, I use JSDOM/node. For sites that are "difficult", I remote-control a real browser, GUI and all. I don't use headless Chrome because if there's, e.g., a captcha, I want to be able to fill it in manually.
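
For the "just regex out what I need" case, a minimal sketch might look like the following. The sample markup, the field names, and the `extract_items` helper are all made up for illustration; they are not from my helper library, and a real page would be fetched first and would need its own pattern.

```python
import re

# Hypothetical sample page. In practice this would be the fetched response body.
SAMPLE_HTML = """
<ul>
  <li class="item"><a href="/story/1">First story</a> <span class="score">120 points</span></li>
  <li class="item"><a href="/story/2">Second story</a> <span class="score">45 points</span></li>
</ul>
"""

# One pattern per record: link target, link text, and the numeric score.
# This assumes the markup is regular; anything messier is a job for a real DOM.
ITEM_RE = re.compile(
    r'<a href="(?P<url>[^"]+)">(?P<title>[^<]+)</a>\s*'
    r'<span class="score">(?P<score>\d+) points</span>'
)


def extract_items(html):
    """Return one dict per matched record, in document order."""
    return [
        {"url": m["url"], "title": m["title"], "score": int(m["score"])}
        for m in ITEM_RE.finditer(html)
    ]


if __name__ == "__main__":
    for item in extract_items(SAMPLE_HTML):
        print(item)
```

This only holds up while the site's markup stays stable and simple, which is why anything more involved goes to a full DOM instead.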