
by bbrazil 3615 days ago

> stream are the superior approach in my point of view as they allow for realtime approaches and stream (window based) analytics.

I'd see them as slightly different approaches to providing fundamentally the same solution. One builds up time series and then operates on them, the other operates on the time series as they come in.

Taking Prometheus as an example, we're a time series database, and you can do both realtime and window-based analysis. In fact, that's how it is usually used.
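To make the "same solution, different approach" point concrete, here is a minimal Python sketch of one windowed average computed both ways. The function and class names are made up for illustration and do not come from Prometheus or any other tool mentioned here.

```python
from collections import deque
from statistics import mean


def stored_window_means(samples, window):
    """Query style: the whole series is stored first, then scanned."""
    return [
        mean(samples[i - window + 1 : i + 1])
        for i in range(window - 1, len(samples))
    ]


class StreamingWindowMean:
    """Stream style: each sample updates a bounded window as it arrives."""

    def __init__(self, window):
        self.buf = deque(maxlen=window)  # oldest sample drops out automatically

    def push(self, value):
        self.buf.append(value)
        # Emit a result only once the window is full.
        if len(self.buf) == self.buf.maxlen:
            return mean(self.buf)
        return None


samples = [10, 12, 11, 15, 14, 18]  # hypothetical metric values
batch = stored_window_means(samples, 3)

stream = StreamingWindowMean(3)
streamed = [m for m in (stream.push(s) for s in samples) if m is not None]
```

Both paths produce the same sequence of window averages. What differs is when the work happens and how much history has to be kept around.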

> I would say that SignalFX is the most sophisticated

Do you have an example of something that you can do with your streaming approach that's not possible with other tools?

It's hard to get a proper understanding of the myriad of monitoring systems out there, so I'm always looking for insights.

> Our agent automatically discovers all the components and dependencies and adds them to the graph in realtime.

That sounds interesting, how do you do that for network dependencies? Do you have something like Zipkin?


de107549 3614 days ago

I agree that streaming and time series queries/scans are two different approaches that can solve the same problem. With instant vectors in Prometheus queries you can operate much as you would with windows, and if you write the right queries and make sure they run in memory, you should also get similar performance and throughput.

My point was more about the framework you get and how easy it is to apply analytics to streams/queries. SignalFx seems to have a nice workbench for this with direct visual feedback in the UI, so that you can work on existing data to get the right result.

As I said, we at Instana think that most people will not be able to build a sophisticated monitoring solution with these types of frameworks, as they don't have the time and may not have the analytical domain knowledge. You can see that SignalFx is adding specific knowledge for some technologies. Here are two simple examples to show that it is not easy:

- How would you predict if a file system is running out of disk space?

- How would you predict if you should add a node to a Cassandra cluster because it is running out of capacity (and it can take some serious time to add a node, so you should know in advance)?

Even the disk space problem is hard to solve - linear regression and basic algorithms will not work.
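For reference, the naive baseline being dismissed here looks roughly like the following Python sketch: fit a least-squares line to recent usage samples and extrapolate to capacity. The function name, units, and sample values are assumptions for illustration only, not anything from Instana or the other tools.

```python
def hours_until_full(timestamps_h, used_bytes, capacity_bytes):
    """Least-squares line through (time, usage), extrapolated to capacity.

    Returns the estimated hours remaining after the last sample,
    or None when there is no upward trend to extrapolate.
    """
    n = len(timestamps_h)
    if n < 2:
        return None
    mean_t = sum(timestamps_h) / n
    mean_u = sum(used_bytes) / n
    var_t = sum((t - mean_t) ** 2 for t in timestamps_h)
    if var_t == 0:
        return None
    cov = sum(
        (t - mean_t) * (u - mean_u)
        for t, u in zip(timestamps_h, used_bytes)
    )
    slope = cov / var_t  # bytes per hour
    if slope <= 0:
        return None  # flat or shrinking usage: the line never reaches capacity
    intercept = mean_u - slope * mean_t
    t_full = (capacity_bytes - intercept) / slope
    return max(0.0, t_full - timestamps_h[-1])


# Hypothetical samples: usage grows by a steady 2 units per hour.
steady = hours_until_full([0, 1, 2, 3], [40, 42, 44, 46], 100)

# Hypothetical sawtooth (say, logs that are rotated away every other hour).
sawtooth = hours_until_full([0, 1, 2, 3, 4, 5], [40, 60, 40, 60, 40, 60], 100)
```

On the steady series the estimate is sensible. On the sawtooth series the fit still reports an upward slope and a finite time to full, even though usage never exceeds 60. That is one small illustration of why a plain regression is not enough for this problem.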

Now think of hundreds (or thousands) of services running on a dynamic container platform and new services released on a daily or even minute basis - with lots of different technologies involved...

No question that you can build a good monitoring solution with Prometheus, SignalFx, Datadog etc. - but it will take a serious amount of time, consulting and dev teams to add the right instrumentation, metrics etc. And you need a lot of analytical knowledge. I can even imagine that there are situations where tools like Prometheus are a better choice - especially if you have a very strict set of technologies and communication frameworks and really good people to write a very specific set of "rules" for this environment.

We've added a domain model to our product: our Dynamic Graph. (All the products mentioned have a generic metric model, but no semantics that describe servers, containers, processes, services and their communication, which is the domain of system and application monitoring.)

And yes, we are using something very similar to Zipkin to get the dependencies between services. Here are two blog entries describing the approach:

- About distributed tracing: https://www.instana.com/blog/evolution-tracing-application-p...

- How we safely instrument code: https://www.instana.com/blog/how-instana-safely-instruments-...

Mirko


otterley 3613 days ago

> SignalFx seems to have a nice workbench for this with direct visual feedback in the UI, so that you can work on existing data to get the right result.

Wavefront does as well; I'd recommend you compare it for competitive analysis.

So would you say your product is in direct competition with these offerings, or do you see it more as a complement to them?


de107549 3613 days ago

Yes, I didn't compare to Wavefront as I have only basic insights into it and therefore cannot make a valid statement.

Competition depends on the use case - if you are using a tool like SignalFx for custom metric analytics, then we are no competition, as our focus is monitoring applications and their underlying infrastructure.

We are an Application Performance Management (APM) solution and therefore compete more with tools like New Relic or AppDynamics. These tools are sadly used only for troubleshooting in 90% of cases and not for management or monitoring. They also do not work in highly dynamic and scaled environments, as their "model" is too static (which they try to fix with their analytics offerings).

This is what we want to change, and where we add the whole stack to the game: to analyze all the dependencies, help find root causes quickly, and monitor and predict the KPIs of your applications, services, clusters and components.

We integrate with solutions like SignalFx if needed, but I have had really good experiences doing "dashboarding" with more business-related tools like Tableau or QlikView - this also offers application owners an easier way to aggregate the monitoring data and metrics on a higher (business) level, where tools like Instana provide the instrumentation data as an input.
