Hacker News
by Crier1002 693 days ago
I often browse the web for fun (but in a careful way), and I can totally see how regularly changing the CSS class names in a site’s HTML would break a bunch of the XPath/CSS selectors in a crawler. It’d seriously be a nightmare for me if site owners could just flip a 'switch' and change the class names that easily.
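
To make the breakage concrete, here is a rough sketch using only Python's standard library. The two page snapshots, the class names, and both helper functions are made up for illustration; a real crawler would probably use a proper HTML parser. It only shows that a class-based selector stops matching after a rename, while a purely structural path keeps working until the layout itself changes.

```python
import xml.etree.ElementTree as ET

# Two hypothetical snapshots of the same page: only the class names differ.
PAGE_BEFORE = (
    '<html><body><div class="product">'
    '<span class="price">19.99</span></div></body></html>'
)
PAGE_AFTER = (
    '<html><body><div class="a91f">'
    '<span class="x7c2">19.99</span></div></body></html>'
)


def price_by_class(markup):
    """Look the price up by class name (breaks when classes are renamed)."""
    root = ET.fromstring(markup)
    node = root.find(".//span[@class='price']")
    return node.text if node is not None else None


def price_by_structure(markup):
    """Look the price up by position in the tree (ignores class names)."""
    root = ET.fromstring(markup)
    node = root.find("./body/div/span")
    return node.text if node is not None else None


if __name__ == "__main__":
    for label, page in (("before", PAGE_BEFORE), ("after", PAGE_AFTER)):
        print(label, price_by_class(page), price_by_structure(page))
```

The structural path is not a real fix, since it breaks as soon as the site adds or moves a wrapper element, but it shows why class names alone are a fragile anchor.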
1 comment

If you’re scraping for text, you can always render the page as an image and do OCR on it, worst case.