by johndavi
4463 days ago
The long tail is tough, but rules are useful when you only need to work with a small number of sites and, as you point out, less "modern" ones. (News sites tend to be consistently manageable but, yes, smaller e-commerce players tend to adopt more modern techniques, as befits fashion-forward product lines, naturally.) Our (Diffbot) approach is to learn what news, product, and other pages look like, which obviates the rules management; we also fully execute JS when rendering. The web keeps evolving though, dang it. Tricky thing!