by ninjin
40 days ago
I can report that Facebook does not respect robots.txt. Heck, I even mailed domain@fb.com with the specific IP ranges and log samples three times over a month and they did not even respond. Keeps on wasting my CPU cycles to this day by crawling massive development forks (I hope they choke on the data...):

  $ (cat /var/www/logs/access.log; zcat /var/www/logs/access.log*.gz) | grep 2a03:2880: | wc -l
  626396

About three hits per second for months now.
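
For anyone who wants to run the same count on their own logs, here is a minimal sketch of the one-liner above wrapped in a reusable function. The name `count_hits` is made up for illustration. It assumes your `gzip` copies uncompressed input through unchanged when given `-cdf` (GNU gzip documents this behavior), so the plain and rotated `.gz` logs can be read in one pass. It matches the prefix as a fixed string anywhere on the line, same as the original `grep`.

```shell
# count_hits PREFIX FILE...
# Hypothetical helper: prints how many log lines contain PREFIX.
# Assumption: `gzip -cdf` passes non-gzip files through unchanged,
# so plain and compressed logs can be mixed in the file list.
count_hits() {
    prefix=$1
    shift
    # -c: write to stdout, -d: decompress, -f: pass plain files through
    # grep -F: fixed-string match, -c: print the count of matching lines
    gzip -cdf "$@" | grep -cF -- "$prefix"
}
```

Usage would look like `count_hits 2a03:2880: /var/www/logs/access.log*`. Dividing the result by the number of seconds the logs cover gives the average hit rate.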