by roskilli 2419 days ago
Rob, co-founder and M3DB creator here. Uber collected billions of metric samples, and we had tens of billions of metrics in M3 at Uber. For reference, Netflix has not published any numbers higher than single-digit billions of time series. The system has run in production at Uber for several years now. Those are my thoughts on the matter, hah.

New Relic touts collecting trillions of data points per day.
According to https://eng.uber.com/m3/

> Released in 2015, M3 now houses over 6.6 billion time series. M3 aggregates 500 million metrics per second and persists 20 million resulting metrics-per-second to storage globally (with M3DB), using a quorum write to persist each metric to three replicas in a region.

So, if that's accurate, they're collecting one trillion data points every two seconds.

So we collected and aggregated more than 1 billion samples of metrics per second, which resulted in writing more than 30-40 million unique metric datapoints per second to storage. This resulted in more than 10 billion unique time series being stored (each with a very large number of distinct datapoints).

This was 3.6 trillion metric samples per hour or 2.5 trillion metric datapoints stored a day (after aggregating samples).
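
For anyone who wants to check that arithmetic, here is a minimal sketch. It assumes the lower bounds quoted above (1 billion samples per second collected, 30 million datapoints per second stored) and treats both rates as constant; the variable names are only illustrative.

```python
# Lower-bound rates quoted in the comment above (assumed constant over time).
SAMPLES_PER_SECOND = 1_000_000_000   # collected and aggregated
STORED_PER_SECOND = 30_000_000       # low end of the 30-40 million range

SECONDS_PER_HOUR = 60 * 60
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR

samples_per_hour = SAMPLES_PER_SECOND * SECONDS_PER_HOUR
stored_per_day = STORED_PER_SECOND * SECONDS_PER_DAY

print(f"{samples_per_hour / 1e12:.1f} trillion samples per hour")          # 3.6
print(f"{stored_per_day / 1e12:.2f} trillion datapoints stored per day")   # 2.59
```

At the 40 million per second end of the range, the same calculation gives about 3.46 trillion datapoints stored per day.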

No, they're collecting one BILLION (with a b) data points every two seconds. Gotta go to 2000 seconds (a little over half an hour) for the TRILLION.

With a 25:1 reduction/summarization before writing. If they're smart, they do that summarization on the way in, rather than at the back-end layer. That's about 1.2 billion data points written per minute, or roughly 1.7 trillion written per day!
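
A quick sanity check of the figures from the Uber post quoted earlier in the thread, as a rough sketch. It assumes the 500 million aggregated and 20 million persisted metrics per second are steady rates; the names are made up for illustration.

```python
# Rates from the quoted Uber engineering post (assumed steady).
AGGREGATED_PER_SECOND = 500_000_000
PERSISTED_PER_SECOND = 20_000_000

TRILLION = 10**12
SECONDS_PER_DAY = 24 * 60 * 60

# Time for the aggregation tier to see one trillion data points.
seconds_to_trillion = TRILLION // AGGREGATED_PER_SECOND

# Reduction from aggregated input to persisted output.
reduction_ratio = AGGREGATED_PER_SECOND // PERSISTED_PER_SECOND

persisted_per_minute = PERSISTED_PER_SECOND * 60
persisted_per_day = PERSISTED_PER_SECOND * SECONDS_PER_DAY

print(f"{seconds_to_trillion} s (~{seconds_to_trillion / 60:.1f} min) per trillion aggregated")
print(f"{reduction_ratio}:1 reduction before writing")
print(f"{persisted_per_minute / 1e9:.1f} billion persisted per minute")
print(f"{persisted_per_day / 1e12:.2f} trillion persisted per day")
```

That gives 2000 seconds (about 33 minutes) per trillion aggregated, a 25:1 reduction, 1.2 billion persisted per minute, and about 1.73 trillion persisted per day.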

Oops, don't know how I misread that! Thanks for the correction!