by logicfiction
1235 days ago
(Author here.) Yes, I believe you are correct with regard to tracking application utilization of, say, EC2 and other AWS resources. The post fails to mention that this system also tracks internal data dimensions like customer IDs, so we can use the same sampled data to estimate the cost of individual customers (and join that with customer tiers, and so forth). I'm also not sure whether that would allow us to attribute the cost of our datastore utilization, since those datastores are not AWS-hosted versions but ones we run ourselves. The traffic interception lets us say that Application A is using 75% of database cluster XYZ, and therefore that application/product group is most likely responsible for that share of the database's cost. The last thing I'll mention is that CloudTrail has the potential to be expensive on its own, I believe more so than storing the raw data in S3 for something like Athena to read. I don't think I'll be writing about it, but we've also done work this last year to trim down what we track in CloudTrail due to the cost of events (for example, tracking everything in S3 ends up being pretty expensive).
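To make the "Application A is using 75% of database cluster XYZ" idea concrete, here is a minimal sketch of count-based attribution in Python. It is not the code from the post: the record layout (`app`, `customer_id`), the function name `attribute_cost`, and the 1000.0 cluster cost are all hypothetical, and it assumes every sampled request counts equally.

```python
from collections import Counter

def attribute_cost(sampled_requests, cluster_cost, key="app"):
    """Split a cluster's cost across groups by their share of sampled requests.

    `key` is the dimension to group by (hypothetical field names such as
    "app" or "customer_id"); every sampled request is weighted equally.
    """
    counts = Counter(req[key] for req in sampled_requests)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {group: cluster_cost * n / total for group, n in counts.items()}

# Hypothetical sample: 75 intercepted requests from app A, 25 from app B.
samples = [{"app": "A", "customer_id": 1}] * 75 + [{"app": "B", "customer_id": 2}] * 25
shares = attribute_cost(samples, cluster_cost=1000.0)
print(shares)
```

Passing `key="customer_id"` groups the same samples by customer instead of by application, which is roughly the kind of per-customer estimate described above.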
Is this always true? Typically the shared resources you care about are CPU, memory, and disk. I would say an application issuing fewer but much heavier queries is using the shared resource more than an application that issues many really simple queries. And this doesn’t correlate much with disk usage, right?
There isn’t really a good solution to this. You can use a combination of query sampling and per-app databases to correlate this better.
Great post though, this is something we’ve been dealing with and experimenting with recently.
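The concern about a few heavy queries versus many simple ones can be illustrated with a small Python sketch that weights each sampled query by a cost proxy instead of counting it once. This is only an assumption about how query sampling might be used: the `duration_ms` field and the function name are hypothetical, and query duration is a rough stand-in for CPU and memory that says nothing about disk usage.

```python
from collections import defaultdict

def attribute_cost_weighted(sampled_queries, cluster_cost, weight_field="duration_ms"):
    """Split a cluster's cost across apps by summed per-query weight.

    `weight_field` is a hypothetical per-query cost proxy (here, duration).
    """
    weights = defaultdict(float)
    for query in sampled_queries:
        weights[query["app"]] += query[weight_field]
    total = sum(weights.values())
    if total == 0:
        return {}
    return {app: cluster_cost * w / total for app, w in weights.items()}

# Hypothetical sample: app A sends 90 cheap queries, app B sends 10 heavy ones.
queries = (
    [{"app": "A", "duration_ms": 1.0}] * 90
    + [{"app": "B", "duration_ms": 90.0}] * 10
)
weighted = attribute_cost_weighted(queries, cluster_cost=1000.0)
print(weighted)
```

By query count alone, app A would be assigned 90% of the cost; weighted by duration, app B carries most of it. Which proxy is closest to real resource usage depends on the datastore.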