by CyberDildonics
2014 days ago
Maybe it would work to put a marker argument (like the requester's IP address as base64) in the URLs you serve when traffic looks like it might be snowballing, so you can see if it comes back at you. A request that carries the marker could then be sent a page with all the links taken out, or just be rate limited.
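A rough sketch of what that could look like, in Python with only the standard library. Everything named here is made up for illustration: the `src` parameter name, the `RETURN_LIMIT` threshold, and the in-memory counter are assumptions, not anything a real site is known to use. A real deployment would probably want to sign the marker so it can't be forged.

```python
import base64
from collections import Counter
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

MARKER_PARAM = "src"   # hypothetical query parameter name
RETURN_LIMIT = 3       # hypothetical: marked requests tolerated per IP

returns_seen = Counter()  # in-memory only; a real site would use shared storage


def encode_ip(ip: str) -> str:
    """URL-safe base64 of the IP, with padding stripped."""
    return base64.urlsafe_b64encode(ip.encode()).decode().rstrip("=")


def decode_ip(token: str) -> Optional[str]:
    """Reverse encode_ip; return None if the token is not decodable."""
    padded = token + "=" * (-len(token) % 4)
    try:
        return base64.urlsafe_b64decode(padded).decode()
    except ValueError:  # covers binascii.Error and UnicodeDecodeError
        return None


def tag_url(url: str, client_ip: str) -> str:
    """Add (or replace) the marker on a link served to client_ip."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != MARKER_PARAM]
    query.append((MARKER_PARAM, encode_ip(client_ip)))
    return urlunsplit(parts._replace(query=urlencode(query)))


def marker_ip(url: str) -> Optional[str]:
    """Return the IP stored in the marker, or None if there is no marker."""
    for key, value in parse_qsl(urlsplit(url).query):
        if key == MARKER_PARAM:
            return decode_ip(value)
    return None


def classify_request(url: str, requester_ip: str) -> str:
    """Return 'strip_links' once an IP keeps following links marked for it."""
    if marker_ip(url) == requester_ip:
        returns_seen[requester_ip] += 1
        if returns_seen[requester_ip] > RETURN_LIMIT:
            return "strip_links"
    return "normal"
```

The idea is that `tag_url` only gets applied to outgoing links while traffic looks suspicious, and `classify_request` decides whether the response should be the link-free page (or a rate-limit response instead).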
We do have various ways to combat these issues; like any website of sufficient size, we have pretty complex methods of detecting problematic traffic and assessing the risk of any given request or session. However, no solution is perfect, and with the number of broken crawlers we see, some will inevitably cause problems.
To be clear, we can adjust our code and block them; that’s not the issue. The issue is that I have to wake up at 3 AM to do it, and even once it’s blocked, dealing with that traffic can be expensive. This guy got his $72k bill forgiven, but don’t expect the websites on the other end to be so lucky. (Yes, yes, ingress bandwidth is often free, but it’s never that simple. Scaling up? Bezos takes a cut. More database traffic? Pay the Bezos tax. Replication of enormous logs to other providers? Bezos hungry!)
Negligence is negligence. If you get in a car and drive recklessly without proper training, even if you didn’t intend to hurt anyone, you’re not going to get a lot of sympathy when you mow down a pedestrian. Likewise, I have little sympathy for people who face enormous bills for abusing powerful tools.
That’s not to say cloud providers don’t have billing problems. The delays are unacceptable, and the budgeting tools are often unintuitive or, as was likely the case here, outright inadequate. But in no universe was deploying code that spun up a container for every URL encountered a good idea.
Should such a mistake result in a $72k bill? Eh, probably not. I doubt this person will make the same mistake again, even with the bill forgiven. Or maybe they’ll just blame Google and attempt the same thing on AWS.