by lifeisstillgood 4205 days ago
20 million different time series. I mean that is a lot.

If you have, say, 20,000 servers running, that is still 1,000 different time series per server. Memory, CPU, logins, logouts, customer selections... I struggle to get to those numbers.


It's actually closer to 1.2 billion different time series -- we report most metrics at one-minute granularity, but they're not all reporting at the same second (thank God), so on average we're getting up to 20M time series per second.

But this of course just makes the question more reasonable -- 1.2B different time series? Really?

Yup. We get a bunch of system telemetry, and a bunch of default application telemetry, without even getting traffic hitting the box, but that's a relatively small percentage of the overall volume. Developers LOVE metrics.

So imagine you want to measure requests to our API, and these are some tags you want to keep track of:

request type: 5 different types
result: 2 possible values (success, failure)
originating country: 50 countries
originating device type: 200 devices

And let's say you've got 1,000 instances reporting this data.

Suddenly you've got 5 * 2 * 50 * 200 * 1000

Oh look. Here's 100M different metrics.

And that's a relatively trivial example.
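
For anyone who wants to play with the numbers, here is a minimal Python sketch of that multiplication. The tag names are just labels I made up for the example above, and it assumes every tag combination actually shows up on every instance, so treat the result as an upper bound rather than a measured count.

```python
from math import prod

# Cardinality of each tag, taken from the example above.
# The key names are hypothetical labels, not real metric tags.
tag_cardinalities = {
    "request_type": 5,
    "result": 2,          # success, failure
    "country": 50,
    "device_type": 200,
}
instances = 1000

# Each distinct combination of tag values is its own time series.
series_per_instance = prod(tag_cardinalities.values())
total_series = series_per_instance * instances

print(f"{series_per_instance:,} series per instance")
print(f"{total_series:,} series across all instances")
```

Adding one more tag multiplies the total by that tag's cardinality, which is why these counts grow so quickly.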

1,000 metrics per server is quite reasonable. I work for a performance management company and we handle thousands of time series metrics per monitored server at one-second resolution.
What's the rough ratio of system-level, process-level and app-level metrics (i.e. total MB, MB per process and "a customer just signed up")?

How much traffic does that add up to? It seems a lot.

Above, copperlight quotes about 3%-5% of metrics being system-level. A pretty small number would be process-level, I'm guessing, with the vast majority being app-level.

At Netflix scale, the answer to pretty much any question is "a lot." :)