by AznHisoka
2463 days ago
I've been running a lot of non-JS crawlers for the past few years, and there were a few pages that kept pegging the CPU on my servers until they ground to a halt. When I dug into the source, I saw that the HTML was a convoluted mess of tables inside tables inside tables inside more tables, which made it incredibly time-consuming and CPU-intensive for my DOM parser to parse (at the time I was using Nokogiri, a Ruby gem). So Cloudflare could be serving these kinds of "fake" pages to bad bots. They could also be doing things like serving fake streaming audio that never ends, or anything else that makes the response look like a huge page that just needs time to load.
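For anyone hitting the same thing, here is a rough sketch of a cheap pre-check a crawler could run before handing a page to a full DOM parser. It is in Python rather than Ruby and uses only the standard library's streaming `html.parser`; the function names and the thresholds (`max_depth=20`, `max_chars=5_000_000`) are my own assumptions for illustration, not anything Nokogiri or Cloudflare defines.

```python
from html.parser import HTMLParser


class TableDepthProbe(HTMLParser):
    """Streams through HTML and records the deepest <table> nesting seen.

    No DOM tree is built, so memory use stays flat however deep the nesting goes.
    """

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.max_depth = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase.
        if tag == "table":
            self.depth += 1
            self.max_depth = max(self.max_depth, self.depth)

    def handle_endtag(self, tag):
        # Ignore stray closing tags so the counter never goes negative.
        if tag == "table" and self.depth > 0:
            self.depth -= 1


def max_table_depth(html):
    """Return the deepest level of nested <table> elements in the markup."""
    probe = TableDepthProbe()
    probe.feed(html)
    probe.close()
    return probe.max_depth


def looks_like_parser_trap(html, max_depth=20, max_chars=5_000_000):
    """Heuristic only: flag pages that are oversized or nest tables very deeply.

    Both thresholds are arbitrary placeholders; tune them for your own crawl.
    """
    if len(html) > max_chars:
        return True
    return max_table_depth(html) > max_depth
```

A crawler could call `looks_like_parser_trap(body)` and skip or deprioritize the page when it returns `True`. For the never-ending-stream case, a cap on download size and total request time is the analogous guard.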