by opsunit 1962 days ago
Wavefront brings a number of things to the table that aren't core competencies we wish to maintain in-house.

I know it can scale to massive volumes without interaction from us.

I know it'll be available when our infrastructure isn't. Because it's run by a third party, we can be confident that any action on our part (such as rolling an SCP out to an AWS org, despite unit tests) won't impact the observability we rely on to tell us we've screwed that up.

I can plug 100s of AWS accounts and 10s of payers into it and I don't have to think about that in terms of making self-hosted infrastructure available via PrivateLinks or some other such complication.

I pay mid six-figure sums annually for these things to "just work". If you folks believe I can achieve this functionality on a per-seat basis I'd be interested in saving those six figures.

2 comments

We’re building Opstrace to be as simple as a provider like Wavefront -- we’ve failed if you need additional competencies to manage it. That being said, we’re early in our journey and still have a ways to go.

As mentioned in the original post here, at the core of Opstrace is Cortex (https://cortexproject.io). We know that Cortex scales well to hundreds of millions of unique active metrics, so depending on the exact characteristics of your workload, the fundamentals should be there.

However, Cortex is a serious service to run, and if you were to DIY it, that would require operations work that you currently don’t have with Wavefront. This is the problem we’re trying to solve: making these great OSS solutions easier to use for people like you.

Opstrace is made to be exposed on the internet (which is optional of course), so you can easily run it in an isolated account to keep it safe from all other operations. And in fact, this is the configuration we recommend for production use.

Regarding “100s of AWS accounts and 10s of payers”... does that include any form of multi-tenant isolation? We support multi-tenancy out of the box to enable controlling rate limits and authorization limits for different groups. We’d need to talk in more detail about that. If you’d like to do that privately, please shoot me an email at chris@opstrace.com. We’re of course happy to continue the discussion here with you as well.

As a heads up, I think you meant to link https://cortexmetrics.io/
Thanks for the correction! You linked to the right Cortex, not to be confused with https://github.com/TheHive-Project/Cortex, haha. https://github.com/cortexproject/cortex is the one we're talking about. Naming is hard.
:facepalm: Yes, indeed, I conflated the website and the GitHub org. Mea culpa.
JP from Opstrace here.

Thanks for sharing this perspective, stressing the relative value of predictability.

Of course, when things go pear-shaped the last thing you want to discover is that your monitoring pipeline doesn't work as expected. We feel you.

Your skepticism is justified and I'm super happy to see that here. We know that our future users are (and should be) quite demanding with respect to robustness of the platform.

We're not naively assuming that it's easy to build a platform that is highly available, auto-scaling, and generally worry-free.

In fact, based on our experience, we know that we'll have to invest an incredible amount of engineering effort to make things super reliable and predictable. On the other hand, by making some smart decisions we can get far with little effort. We have super strong building blocks that we can rely on (such as using a cloud-provided database for storing critical configuration state).

> If you folks believe I can achieve this functionality on a per-seat basis I'd be interested in saving those six figures.

The bet is on, but of course we need a bit of time :)