by anonymousab 377 days ago
I remember going out to dinner, years ago, with a fairly senior AWS billing engineer. An acquaintance of a coworker.

He looked completely surprised when I asked about runaway billing and why there weren't any simple options to cap a given resource to prevent those cases.

His response was that they didn't build that because none of their customers wanted anything like that, as far as he was aware.

4 comments

Disclaimer: I work at Google but not on cloud. Opinions my own.

I think the reason this doesn’t get prioritized is that large customers don’t actually want a “stop serving if I pass this limit” amount. If there’s a spike in traffic, they probably would rather pay the money to serve it. The customers that would want this feature are small-dollar customers, and from an economic perspective it makes less sense to prioritize this feature, since they’re not spending very much relative to customers who wouldn’t want this feature.

Maybe if there weren’t more feature requests to get prioritized this might happen, but the reality is that there are always more feature requests than time to implement them, and a feature request used almost exclusively by the smallest dollar customers will always lose to a feature for big-dollar customers.

I guess where it could potentially bring value is by:

Removing a major concern that prevents individuals / small customers from using GCP in the first place; so more of them do use it

That could then lead to value in two ways:

- They make small projects that go on to be large projects later, (e.g. a small app that grows / becomes successful, becomes a moneymaker)

- Or, they might then be more inclined to get their big corp to use GCP later on, if they've already been using it as an individual

But that's long term, and hard to measure / put a number on

Every large enterprise has insurmountable difficulty even imagining why customers would want something as bizarre as a "stop loss" on their spending...

... right up until it's their own bottom line that is at risk, and then like magic spending limits become a critical feature.

For example, Azure has no stop-loss feature for paid customers, but it does for the "free" Visual Studio subscriber credits. Because if some random dev with a VS subscription blows through $100K of GPU time due to a missing spending constraint, that's Microsoft's problem, not their own.

It's as simple as that.

As noted above, there is enough value here such that AWS implemented this several years ago. Said implementation is appropriate for both personal AWS accounts and large scale multi-account organizations.

Having implemented this on behalf of others several times, I'll share the common pain points:

- There's a long lead time. You need to enable Cost Explorer (24-48 hours). If you're trying for fine distinctions, activating tags as cost allocation tags is another 24 hours

- AWS cost data is a lagging indicator, so you need to be able to absorb a day of charges

- Automation support is poor, especially for organizations

- Organization budgets configured at the account level are misleading if you don't understand how they're configured

What's really wanted here is that AWS needs to commit to more timely cost data delivery such that you can create an hourly budget with an associated action.
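
To put a rough number on why the lag matters, here is a minimal sketch, not anything AWS provides. It assumes a constant hourly burn rate and a budget action that fires the moment the lagging cost data first shows the budget as exceeded. The function name, parameters, and dollar figures are all made up for illustration.

```python
def spend_when_cap_triggers(hourly_rate, budget, lag_hours):
    """Actual spend at the moment a lagging budget check first sees the budget exceeded.

    Hypothetical model: spend accrues at a constant `hourly_rate`, and the
    budget check only sees spend as it stood `lag_hours` ago.
    """
    if hourly_rate <= 0:
        raise ValueError("hourly_rate must be positive")
    hour = 0
    actual = 0.0
    while True:
        hour += 1
        actual += hourly_rate
        # Spend that the (lagging) cost data reports at this hour.
        visible = hourly_rate * max(0, hour - lag_hours)
        if visible >= budget:
            return actual


# Illustrative numbers only: a $100 budget and a runaway resource at $50/hour.
daily_lag = spend_when_cap_triggers(hourly_rate=50.0, budget=100.0, lag_hours=24)
hourly_lag = spend_when_cap_triggers(hourly_rate=50.0, budget=100.0, lag_hours=1)
print(daily_lag, hourly_lag)
```

Under these assumptions the overshoot is roughly the burn rate times the lag, which is why a budget fed by day-old data can't protect against a surprise that plays out in hours.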

> Said implementation is appropriate for both personal AWS accounts and large scale multi-account organizations.

Followed by a list of caveats that make it wholly irrelevant for an individual who is afraid of a surprise charge covering less than several days.

Yeah, right. Capping a resource, such a wild idea. Of course they won't implement it for the same reason bar owners don't put a cap on drinks.
Aren't bars actually required to cap drinks? It's usually phrased as having to refuse serving if you're visibly drunk, but still effectively a cap. That said, a big cloud bill doesn't make you intoxicated. The more I examine this analogy, the less it makes sense.
I don't know if the analogy works that well; the assumption is that you're making more money than you put in the more traffic you get. As a bar owner, the choice is between closing your bar for the month when you run out of beer or running to the supplier to bring more kegs.
I'm sure a lot of people at Amazon and Google are aware small customers want this and it's a feature they'd like to brag about, but it is much harder to implement a real-time quota on spend than a daily batched job for the money part + real-time resource-scoped quotas.
None of their Big Customers, they meant; the small ones who worry about this don't matter.