Hacker News new | ask | show | jobs
by GauntletWizard 845 days ago
I worked on Scuba, inside and outside of Meta (Interana), and yeah - It was expensive AF. I recommend focusing on metrics first. Use analytics logging sparingly, and understand the statistics of how metrics work, because without understanding those statistics you'll misread your events anyway.

This is not to say that wide events aren't worth it - For many things, something like Scuba or Bigquery are invaluable. There's ways to optimize. But we're talking about "One of AWS's largest machines" vs "A couple cores", and I suggest learning Prometheus first.

1 comments

> understand the statistics of how metrics work

Haha, since you worked on Scuba I’ll mention IMO this point was by far the biggest flaw of ODS. No one ever performed the metric rollups correctly. Average of averages? And at what granularity? ODS downsampled the older time series data but now perhaps you’re taking a percentile over a “max of maxes”. Except it only sometimes used that method of downsampling automatically.

And I seem to recall the labels “daily”, “weekly”, and “monthly” not being intuitive either, and two of them meant the same thing... that was quite a mess to work with.

A lot of the autoscaling systems were wonky because the ODS metrics they were based upon didn’t represent what people thought they did.

Never in the world I would have expected my post to cause the discussion about ODS flaws :D