by nsonha
2059 days ago
A lot of what selectorlib does (getting text or attributes) is achievable with XPath 1.0, which is built into browsers and testing tools. My scraping framework takes a dict of name -> XPath and returns a JSON object. This way the framework knows exactly what needs to be extracted and can stop loading the page as soon as all the information has been collected.
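To make the name -> XPath idea concrete, here is a rough sketch, not the actual framework. It uses Python's standard-library ElementTree, which only supports a subset of XPath, so the final `text()` or `@attr` step is handled by hand. The `extract` function, the `PAGE` markup, and the `SPEC` names are all made up for illustration, and the early-stop-while-loading part is not shown.

```python
import json
import xml.etree.ElementTree as ET


def extract(markup, spec):
    """Return {name: first match or None} for a dict of name -> XPath-like path.

    Hypothetical helper: each path must end in /text() or /@attr, because
    ElementTree itself cannot evaluate those final steps.
    """
    root = ET.fromstring(markup)  # assumes well-formed markup
    result = {}
    for name, path in spec.items():
        # Split off the last step; the rest is a path ElementTree can handle.
        element_path, _, last = path.rpartition("/")
        node = root.find(element_path)
        if last == "text()":
            result[name] = None if node is None else (node.text or "").strip()
        elif last.startswith("@"):
            result[name] = None if node is None else node.get(last[1:])
        else:
            raise ValueError(f"path must end in text() or @attr: {path}")
    return result


# Illustrative input only.
PAGE = """<html><body>
<h1>Example title</h1>
<a class="next" href="/page/2">Next</a>
</body></html>"""

SPEC = {
    "title": ".//h1/text()",
    "next_url": ".//a[@class='next']/@href",
}

print(json.dumps(extract(PAGE, SPEC)))
```

With a real XPath 1.0 engine (in a browser or a testing tool) the hand-rolled last step would not be needed, since the engine evaluates `text()` and `@attr` directly.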