by londons_explore 1815 days ago
Neevabot seems to deliberately 'misunderstand' robots.txt files that exclude it:

https://neeva.com/neevabot

They crawl anywhere Googlebot is allowed...

As a webmaster who explicitly allowed only Googlebot, I'm pretty annoyed to find another company's bot crawling my site too, doubling the server load (crawling accounts for about 30% of my compute budget for Googlebot alone; it turns out bots have far worse cache hit rates than real users). This 'dishonest' behaviour from Neeva has cost me ~$2000 so far... Will they be refunding me that from all their subscriptions?
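
For anyone wanting to check how a standards-following parser reads an "only Googlebot" policy, here is a rough sketch using Python's standard urllib.robotparser. The robots.txt contents below are an assumed example of such a policy, not my actual file, and example.com and the is_allowed helper are placeholders made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumed example policy: Googlebot may crawl everything, every other bot nothing.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if user_agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network fetch is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder host.
    for agent in ("Googlebot", "Neevabot"):
        print(agent, is_allowed(agent, "https://example.com/some/page"))
```

Under this assumed file, a compliant crawler identifying itself as anything other than Googlebot falls through to the wildcard group and should not fetch any page.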