by jawerty
1409 days ago
I’ve built a lot of tools that rely on web scraping, most recently https://GitHub.com/Jawerty/myAlgorithm and https://metaheads.xyz. I think the more control you have over the tools, the better: if you know your way around CSS selectors and Selenium, you can do anything in web scraping. Selenium can seem hefty, but there are plenty of ways to optimize for resource intensity; look up Selenium Grid. Overall, don’t be afraid of browser automation; you can always find a way to optimize. The real difficulty is the freshness of the HTML. You can fix this by being smart about timestamps and caching. If you keep re-scraping the same data over and over… don’t do that. Also, if your application has a frontend that depends on scraped data, NEVER use your scraping routines as a direct feed; store the data whenever you scrape.
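
To make the timestamp and caching point concrete, here is a rough sketch of one way it could look. It is not taken from either project above; `ScrapeStore`, `get_page`, the table layout, and the one-hour default are all hypothetical names and values picked for illustration. The idea is that every scrape is written to a store along with the time it was fetched, and a page is only re-scraped once the stored copy is older than some maximum age.

```python
import sqlite3
import time
from typing import Callable, Optional, Tuple


class ScrapeStore:
    """Hypothetical store: keeps each scraped page plus its fetch time."""

    def __init__(self, path: str = ":memory:") -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS pages ("
            "url TEXT PRIMARY KEY, html TEXT NOT NULL, fetched_at REAL NOT NULL)"
        )

    def read(self, url: str) -> Optional[Tuple[str, float]]:
        # Returns (html, fetched_at), or None if the URL was never scraped.
        return self.db.execute(
            "SELECT html, fetched_at FROM pages WHERE url = ?", (url,)
        ).fetchone()

    def write(self, url: str, html: str, fetched_at: float) -> None:
        # Overwrites any older copy of the same URL.
        self.db.execute(
            "INSERT OR REPLACE INTO pages (url, html, fetched_at) VALUES (?, ?, ?)",
            (url, html, fetched_at),
        )
        self.db.commit()


def get_page(
    store: ScrapeStore,
    url: str,
    fetch: Callable[[str], str],
    max_age_seconds: float = 3600.0,  # assumed freshness window
    now: Optional[float] = None,
) -> str:
    """Return stored HTML if it is fresh enough, otherwise scrape and store."""
    now = time.time() if now is None else now
    cached = store.read(url)
    if cached is not None and now - cached[1] < max_age_seconds:
        return cached[0]  # still fresh: skip the scrape entirely
    html = fetch(url)  # the expensive step, e.g. a browser session
    store.write(url, html, now)
    return html
```

In this sketch `fetch` is whatever does the actual scraping; with Selenium it could be a small function that calls `driver.get(url)` and returns `driver.page_source`. A frontend would then only ever call `store.read(...)`, so it never waits on, or breaks because of, a live scrape.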