AWS offers alarms and actions that will stop or terminate your service based on budget thresholds to solve exactly this problem. Why aren't these sufficient?
I don't have access to the dashboard anymore because I cancelled my AWS account rather than spend more time hunting for things that still needed to be unconfigured or disabled to stop the billing (the last straw was the spend analyzer telling me it would take 24 hours to show what was still costing me money), but:
Can I terminate based on cost? Like "I have spent $1,000 this month in AWS, something has gone wrong, just kill everything" (or at least the runaway service buckets)? Or is it just "oh, I forgot to terminate this particular EC2 instance once I was done with it; it'd be nice if I could set those rules up in advance"?
Step 1. Go to billing and create a monthly budget. Mine is $100.
Step 2. Create an alert: the first alert emails me when total AWS costs exceed 80% of the budget ($80).
Step 3. Create an action: I only have a single EC2 instance running a webserver that is always on. If my threshold is exceeded (say, a million people start downloading my pictures and my IO-OUT spikes), my action stops my EC2 instance via an IAM role action. Boom. Server goes offline instantly, without my having to log in (like if I'm sleeping, or drunk).
Done.
Sometimes I get an alert because my usual cost is $35/mo, and if a few domain renewals pop up that month, it will spike to $80. Hence the alert at $80 and the action at the $100 threshold.
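For what it's worth, the two-threshold setup in the steps above boils down to something like this. It is only a sketch of the logic, not of how AWS Budgets evaluates anything internally; the function name `triggered_steps` and the step labels are made up for illustration, and the defaults are just my $100 budget, 80% alert and 100% action.

```python
def triggered_steps(spend, budget=100.0, alert_pct=80.0, action_pct=100.0):
    """Return which budget steps fire for a given month-to-date spend.

    Hypothetical helper: the labels "email_alert" and "stop_instance"
    are illustrative names, not AWS identifiers.
    """
    steps = []
    # Alert fires once spend exceeds 80% of the budget ($80 by default).
    if spend > budget * alert_pct / 100:
        steps.append("email_alert")
    # Action fires once spend exceeds the full budget ($100 by default).
    if spend > budget * action_pct / 100:
        steps.append("stop_instance")
    return steps


if __name__ == "__main__":
    for spend in (35, 85, 120):  # usual month, renewal month, runaway month
        print(spend, triggered_steps(spend))
```

A usual $35 month triggers nothing, a domain-renewal month only sends the email, and only a real overrun stops the instance.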
And I can use any kind of metric: IO bandwidth from downloads, RDS bandwidth for too many queries, or, if I had elastic instances, a limit on the number of instances based on cost. It is completely flexible. You can terminate too, but I only have one instance; I don't use elastic pools to allocate dynamically.
I don't get all the fuss, it is quite a simple service. Maybe it doesn't scale well for huge operations and that is the problem, because I'm not a power user or a company?
Is step 3 literally "create an upfront action for the single EC2 service I have configured"? If so yes then the problem is scaling: 1 thing created by 1 person in 1 day in AWS is pretty easy to manage even without this feature. But take 1,000 things across many service buckets, where 1 of them is something like a runaway suspended machine in a region you can't find, which should have been terminated, and you don't know what needs to be terminated. You can't just click a button and see it; you have to go down each breadcrumb trail of billing buckets that look odd and jump between portions of the interface trying to cross-track it. That is where it turns into a disaster.
On the corporate side it's a project where a team tries to go through everything and hopefully people have stayed in their lane on things they configured in AWS so the SMEs can just check their stuff and find it quickly. On the personal side it's a lamentation there isn't just a "nuke all" button beyond permanently disabling your account completely.
It can be a nuke or a surgical scalpel, e.g., contour traffic rather than taking down your entire site. And it is scriptable: any IAM role can be programmed into an action.
> If so yes then the problem is scaling,
Come on, man: you can't bash AWS if you don't even know how it works!
I'm addressing all these sob stories of poor college students suddenly getting hit with $1000 bills for using Lambda the wrong way, not a Series B startup with $5MM in the bank, 20 employees, and a billion CPM on their webapp.
I don't mean "can you scale it down granularly or stop the service completely"; I mean can you say "when I hit $100, disable everything in this AWS account that will generate billing, without having to specify each thing individually in a rule": snapshots, backups, IPs, instances, etc. It's not a matter of knowing how these things work; it's a matter of finding what you're going to be billed for tomorrow because it is currently running. That's what's hard.
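To make the "finding" part concrete, here is a rough sketch of what hunting for leftover EC2 instances across regions looks like. Assumptions: `find_instances` and `make_ec2_client` are names I made up; the factory is expected to return something shaped like a boto3 EC2 client, whose `describe_instances` call returns `Reservations`, `Instances` and an optional `NextToken`. Stopped instances are included because their attached volumes can still be billed.

```python
from typing import Callable, Dict, Iterable, List


def find_instances(
    regions: Iterable[str],
    make_ec2_client: Callable[[str], object],
) -> Dict[str, List[str]]:
    """Map each region to the IDs of its running or stopped EC2 instances.

    make_ec2_client is a hypothetical factory: given a region name it
    returns an object with a boto3-style describe_instances method.
    """
    found: Dict[str, List[str]] = {}
    for region in regions:
        ec2 = make_ec2_client(region)
        kwargs = {
            "Filters": [
                {"Name": "instance-state-name", "Values": ["running", "stopped"]}
            ]
        }
        while True:
            page = ec2.describe_instances(**kwargs)
            for reservation in page.get("Reservations", []):
                for instance in reservation.get("Instances", []):
                    found.setdefault(region, []).append(instance["InstanceId"])
            token = page.get("NextToken")
            if not token:
                break
            # Follow pagination until the region is exhausted.
            kwargs["NextToken"] = token
    return found
```

With boto3 installed you would pass something like `lambda region: boto3.client("ec2", region_name=region)` as the factory. And this only covers EC2 instances: snapshots, backups, IPs and everything else each need their own version of this loop, which is the whole complaint.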
> Come on, man: you can't bash AWS if you don't even know how it works!
It is possible to understand how AWS works and still run into problems trying to scale AWS billing. This may not be apparent in a single EC2 instance setup, but that doesn't mean the reason you see the complaints so often is that everyone else is just an idiot.
In my case I didn't lose thousands or anything on my personal accounts, more like 40 bucks by the time I just closed the account rather than wait 24 hours to track down the last thing in the spend analyzer. It was a precanned product demo script for a cloud security product; the first install went wrong and needed to be cleaned up manually, but it was hard to tell what actually ended up staying vs. not, especially since I didn't define the architecture from the ground up manually.
Note this is separate from "I didn't know that if I clicked 'create 1000 GPU training instances' it would cost a lot", though that would also be covered by an upfront monthly limit, I suppose.
Alternatively: imagine how quickly the UI would be fixed if the difficulty in finding how to create a new billable service were switched with the difficulty of finding which billable service is causing overruns.