by m1keil 1407 days ago
GuardDuty is another example of the brilliance of AWS's pricing scheme and of how they manage to twist your arm into paying extra, which can add up to quite a lot by the end of the month.

When comparing EC2 to servers, nobody adds in the premiums for the extras: things like CloudTrail, Support, GuardDuty, CloudWatch. All of these have a variable cost that grows with usage and is very hard to predict ahead of time.

Just last week I discovered our GuardDuty bill had gone up from $15/month to $400/month. On closer inspection, the cause was a small script that made AWS API calls in a tight while loop on a couple of EC2 instances.

So if you choose to enable GD, make sure your monitoring is in place, enable GD gradually across the infra, and establish clear baselines and alerting for costs.
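One rough way to picture the "baselines and alerting" idea is a trailing-average check over daily costs. This is only a sketch: the function name `cost_spikes`, the 7-day window, the 3x threshold, and the sample figures are all assumptions made for illustration (the figures loosely mirror a jump from about $15/month to about $400/month). It is not how AWS billing or any AWS service computes anything.

```python
from statistics import mean


def cost_spikes(daily_costs, window=7, factor=3.0):
    """Return (day_index, cost, baseline) for days whose cost exceeds
    `factor` times the average of the preceding `window` days."""
    spikes = []
    for i in range(window, len(daily_costs)):
        baseline = mean(daily_costs[i - window:i])
        if baseline > 0 and daily_costs[i] > factor * baseline:
            spikes.append((i, daily_costs[i], baseline))
    return spikes


# Made-up daily costs in USD: roughly $0.50/day, then a jump to ~$13/day.
costs = [0.5, 0.5, 0.6, 0.5, 0.4, 0.5, 0.5, 0.5, 13.0, 13.5]

for day, cost, baseline in cost_spikes(costs):
    print(f"day {day}: ${cost:.2f} vs trailing baseline ${baseline:.2f}")
```

In practice the daily figures would come from whatever billing export you already have; the point is only that a baseline has to exist before a spike can be flagged.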

8 comments

It's quite annoying that you can't disable the parts of GuardDuty you don't get much value from. I think the CloudTrail monitoring is quite useful and the VPC flow log monitoring is basically useless (to me; I have other means of doing host- and network-based monitoring, and I'm sure there are a lot of people who get meaningful value out of it). I'd like to be able to turn off flow log monitoring and just use GuardDuty for CloudTrail monitoring, but that isn't an option. So I can either overpay for a bunch of extra things I don't get any value from or not enable GuardDuty at all.
Until you realize how much the alternatives cost. You think Palo Alto networks is cheap?
At least you can get a fixed cost if you go that route vs. your bills growing at an exponential rate. From what I have found, PA really wants you to commit to annual pricing with their VM appliances. If you go hourly you pay 73% more.
Is there a feature and price comparison between the native AWS offering and Palo Alto somewhere? Would be super handy for one of our projects right now. Thanks!
Do you think Palo Alto's offerings and GuardDuty are comparable?
The biggest “hidden” cost for me was Elastic Block Store IO charges! They started adding up quite quickly and I realized I had to think twice about doing IO-intensive calculations on EC2! I switched over to Lightsail (obviously these are for personal projects).
Just use gp3; you shouldn't need to request higher IO with it (except for DBs, where it's probably cheaper to use DynamoDB etc.).
Yeah, provisioned IOPS can add up quick. Never had justification to spend on them myself. If I need IO I use instances with attached storage, otherwise I purely use the GP SSDs.
Managing and occasionally slashing costs is why AWS consultants can make really good money - if you're into this kinda thing, it's worthwhile learning how to analyze and optimize AWS bills.
Well, before AWS, big companies tended to have the same problem, but instead of paying for many AWS features they paid for a thousand different products, and trying to make sense of all those bills was hard. I know some people who worked as consultants cleaning up the services a company was paying for but no longer needed. Paying for around a thousand different software products was the norm, not some unrealistically large number.
How did you debug it? We have this weird thing where the cost of GuardDuty varies on different days of the month but stays the same from month to month, and we can't figure out what causes it. Can you actually see the events that GuardDuty has processed?
Just make sure everything is multi-AZ, that way if an entire zone goes down you will still be fine /s
I hear you but how is this any different than any alternative approaches? I mean yes you could throw a server under your desk and it would be cheaper. Should you run a server in prod with sensitive workloads without all these extra security bells and whistles (on prem or in the cloud)? No. No you should not. Compared to alternative security appliances and offerings this is still a good deal.
> Should you run a server in prod with sensitive workloads without all these extra security bells and whistles (on prem or in the cloud)? No. No you should not.

Why not? Have people forgotten how to run servers in the past decade?

People haven’t forgotten, companies just don’t hire infra people.

“Fire all your ops people, do the cloud” has made it so that organizations have forgotten how to run servers.

Working in infra but not at supermassive scale really does feel like being the new COBOL programmer.

Na. At some point you scale to the point that $400/m is peanuts for the benefits you get.

We used it at a previous job and realised we were under constant attack, and it reduced our CPU usage by 15% thanks to fewer requests for the amount of traffic we were getting. No more random spikes.

Improved our overall security and ultimately reduced costs on the AWS bill.

But if you’re gonna switch it on and walk away then you’re not really using it.

Yes, of course $400/month is peanuts, and so is $8000/month for certain companies.

The point is not the absolute sum but how easy it is to spike your bill by 30 times.

The way GD works, you are pretty much guaranteed to overpay for it sooner or later. If you can afford it.. great.

One thing we noticed after switching it on was that the EC2 instances were being hit directly, so we moved those into a private security group only accessible to the load balancer. RDS got restricted. S3 buckets fixed. Coupled with WAF to block irregular activity, this resulted in the GD bill going down, not up.

This is no different from programming. PHP has some awful code out in the wild, it doesn't mean PHP is shit just because people write bad code.

The issue with AWS is that it's far too easy for people to just spin stuff up, and it works, and they don't look at what they are being billed for, don't analyze their infrastructure, don't optimize. They just throw servers, containers, etc. up into the wild, then when the bill comes:

"OH AWS BAD I got billed cos I just set it up and forgot about it, then when it worked they charged me for it, AWS is wrong, just go baremetal."

I think it is similarly easy to spin it the other way around. "AWS is just selling you the gun and the bullets, you are the one who is shooting yourself in the foot".

I don't think I said AWS is shit or that GD is worthless, after all, I use both by choice. Yet, I do not think that AWS are blameless when it comes to certain decisions of how to bill, how to present data and how to document some of their features.

For example, in order to discover that something is wrong with your GD billing, you must have CloudTrail in place, plus the appropriate infrastructure to query it. And even though AWS could easily alert you about a weird trend in your API calls (like a suspiciously high number of Describe* calls), they won't do it. They do it with Trusted Advisor when you have under-utilised EC2 instances (which requires a Business+ support plan per account).
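To make "query CloudTrail for a suspicious trend" concrete, here is a minimal sketch that tallies CloudTrail-style records by API call. It assumes you already have the records as Python dicts (CloudTrail log files are JSON with a `Records` array, and each record carries `eventSource` and `eventName`); the helper name `top_api_calls` and the sample data are made up for illustration.

```python
from collections import Counter


def top_api_calls(records, n=5):
    """Count records by (eventSource, eventName) and return the n most common."""
    counts = Counter(
        (r.get("eventSource", "unknown"), r.get("eventName", "unknown"))
        for r in records
    )
    return counts.most_common(n)


# Made-up sample standing in for parsed CloudTrail records.
sample = (
    [{"eventSource": "ec2.amazonaws.com", "eventName": "DescribeInstances"}] * 500
    + [{"eventSource": "sts.amazonaws.com", "eventName": "AssumeRole"}] * 12
    + [{"eventSource": "s3.amazonaws.com", "eventName": "ListBuckets"}] * 3
)

for (source, name), count in top_api_calls(sample):
    print(f"{count:6d}  {source}  {name}")
```

A single call dominating the tally, like the `DescribeInstances` entry in this fake sample, is the kind of pattern a tight polling loop would leave behind.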

Someone in the thread mentioned the need for an SCP in order to disable regions. Why should you have to go all the way to an SCP? Why can't we disable regions with the click of a button under the root account, like it's possible for some of the latest regions?

Is there something inherently wrong or purely evil in it? No. But I think the defaults can be better. I think AWS can improve their customers' default posture when it comes to audit and security without the need to decide between 10 different services with different billing plans and gotchas.
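For readers who haven't seen one, a region-restricting SCP usually takes roughly the following shape: a Deny statement conditioned on the `aws:RequestedRegion` key, with global services exempted via `NotAction`. This is a sketch only. The helper name `region_deny_scp`, the allowed regions, and the short exemption list are assumptions for illustration; a real policy needs a much longer, carefully reviewed exemption list.

```python
import json


def region_deny_scp(allowed_regions, exempt_actions):
    """Build an SCP-shaped policy document that denies requests made to
    any region outside `allowed_regions`, except for `exempt_actions`."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideAllowedRegions",
                "Effect": "Deny",
                "NotAction": sorted(exempt_actions),
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {
                        "aws:RequestedRegion": sorted(allowed_regions)
                    }
                },
            }
        ],
    }


# Illustrative values only; the exemption list here is deliberately incomplete.
policy = region_deny_scp(
    allowed_regions={"eu-west-1", "us-east-1"},
    exempt_actions={"iam:*", "organizations:*", "sts:*", "support:*"},
)
print(json.dumps(policy, indent=2))
```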

Have you checked out Cost Anomaly Detection[1]? It builds an ML model to alert on anomalous usage and resulting changes in billing.

[1] https://aws.amazon.com/aws-cost-management/aws-cost-anomaly-...

That's exactly my point, that is _yet another_ service you need to go through to get a clear picture of what is going on.

These products have their place, but they don't make sense until you reach a certain size.

Out of curiosity, did you have any data lakes on S3? Did you find optimization techniques for the same?
Nope, but we did realise we had some open buckets we hadn't known were open. Thankfully we didn't store sensitive information in there, despite having 2PB of files in there.