by sourcecodeplz
816 days ago
They crawl all the time; their instances could go down with no problem, since there are still hundreds doing the same task. They consume waaaay too much traffic for the cloud to make sense financially. A hybrid approach is best in cases like this: use the cloud for client-facing interfaces and rent dedicated servers for the spiders. edit: even better, build your own data center instead of renting.
|
8 cores, 32 GB RAM, 2x 500 GB SSD for ~€40/month. It's an older CPU, but web crawlers don't spend much time crunching numbers anyway.