by xarball
2846 days ago
Why would you switch from Selenium to Beautiful Soup halfway through the job and force your program to re-request the same page from the web server? Selenium already has access to the entire DOM and to the JavaScript session loaded in a running web browser, which gives it far more power for data mining than Beautiful Soup has. It looks like they just want selectors, but these directions seem to completely miss that Selenium's API already provides them. Just search the WebDriver documentation for 'find_element_by_': https://selenium-python.readthedocs.io/api.html

I use Selenium for all my web crawling precisely because I would rather have one crawler with the full backing of a modern web browser than corner myself into lacking something as crucial as a JavaScript engine halfway through implementing a bot. The bot is, after all, hooking into what is basically an end-user interface sitting on top of all that.

The most obvious benefit of Selenium to me is that, with all of that available, my interactions with a web server look more like a real user's and fly under the radar a little more. Treating websites as a whole package also tends to mean less work on my part (though more RAM, yes!).
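To make the selector point concrete, here is a rough sketch of pulling links straight out of the page the browser has already loaded, with no second request. It assumes a Selenium 4 style driver, where the older 'find_element_by_' helpers are spelled `find_elements(By.CSS_SELECTOR, ...)`. The names `extract_links` and `crawl` are made up for illustration, and Firefox is just one possible browser choice.

```python
from typing import Any, Dict, List

# Same string value as selenium.webdriver.common.by.By.CSS_SELECTOR,
# spelled out so the helper has no import-time dependency on selenium.
CSS_SELECTOR = "css selector"


def extract_links(driver: Any, selector: str = "a") -> List[Dict[str, str]]:
    """Collect text and href from elements in the page the browser already loaded."""
    links = []
    for element in driver.find_elements(CSS_SELECTOR, selector):
        href = element.get_attribute("href")
        if href:  # skip anchors that have no target
            links.append({"text": element.text.strip(), "href": href})
    return links


def crawl(url: str) -> List[Dict[str, str]]:
    """Load the page once in Firefox and query the live DOM.

    Hypothetical wrapper: assumes selenium and a working Firefox driver
    are installed on the machine.
    """
    from selenium import webdriver  # third-party, imported lazily

    driver = webdriver.Firefox()
    try:
        driver.get(url)  # the only request for this page
        return extract_links(driver, "a")
    finally:
        driver.quit()
```

Calling `crawl("https://example.com/")` would be the whole job: one page load, then selector queries against the same session. If you really do want Beautiful Soup's API, `driver.page_source` can be handed to it directly, which still avoids fetching the page a second time.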