by randomstring
1983 days ago
I would estimate that more than 90% of that traffic is bot traffic. I have run two web search engines in the past: search.netscape.com (pre-Google) and blekko.com. Robots accounted for more than 80% of traffic at Netscape (around 3M searches/day in 2000, IIRC) and definitely more than 80% at blekko, maybe 90% or more. Some traffic is obviously bot traffic (single source IP, common patterns, obvious bot user agents), and then there's the non-obvious bot traffic that looks random-ish but in aggregate is clearly bot traffic. For instance, way too many queries match the pattern "(mortgage|home loans) (zip|county|city|state)", even when they come from random IPs and user agents. At blekko, under high traffic, we would load-shed obvious bot traffic first and prioritize searches from humans.
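To make the heuristics above concrete, here is a rough Python sketch, not what Netscape or blekko actually ran. The request fields (`ip`, `ua`, `query`), the user-agent list, the per-IP limit, and all function names are assumptions for illustration. Only the query regex is taken verbatim from the comment.

```python
import re
from collections import Counter

# Regex copied literally from the comment; a real system would presumably
# expand the second group into actual place names (assumption).
QUERY_PATTERN = re.compile(
    r"(mortgage|home loans) (zip|county|city|state)", re.IGNORECASE
)
# Hypothetical list of user-agent substrings treated as "obvious" bots.
BOT_UA = re.compile(r"bot|crawler|spider|curl", re.IGNORECASE)


def obvious_bot(req, hits_per_ip, ip_limit):
    """Per-request signals: bot-like user agent, or one IP sending too much."""
    return bool(BOT_UA.search(req["ua"])) or hits_per_ip[req["ip"]] > ip_limit


def pattern_share(requests):
    """Aggregate signal: fraction of queries matching the suspicious pattern."""
    if not requests:
        return 0.0
    matches = sum(1 for r in requests if QUERY_PATTERN.search(r["query"]))
    return matches / len(requests)


def shed_load(requests, capacity, ip_limit=2):
    """Keep at most `capacity` requests, dropping obvious bots first."""
    hits_per_ip = Counter(r["ip"] for r in requests)
    # sorted() is stable, so arrival order is kept within each group;
    # False (likely human) sorts before True (obvious bot).
    ranked = sorted(
        requests, key=lambda r: obvious_bot(r, hits_per_ip, ip_limit)
    )
    return ranked[:capacity]
```

Note that `pattern_share` only flags a problem in aggregate: a single "home loans county" query from a normal-looking browser is not treated as a bot by `obvious_bot`, which matches the point that this kind of traffic is only visible across many requests.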
Would syndication search services, like DDG or Startpage, see a higher level of bot traffic than a crawler search engine? We don't know, and it could depend on several factors. Handling bot traffic is certainly an ongoing and important challenge.