Show HN: Squzy – open-source monitoring, incident and alert system

	Show HN: Squzy – open-source monitoring, incident and alert system (github.com)
	46 points by PyxRu94 2157 days ago

2 comments

gravypod 2156 days ago

This looks pretty neat. One thing that would be really cool is support for event logging of protobuf messages: some way to give Squzy a protobuf bin file that describes all of my messages, and then a way to generically send you a bunch of protos and have your API serialize them and allow queries and notifications based on those messages.

At work we have something like this:

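    message Event {
        // Exactly one payload is set per event, so this is written as a oneof.
        // Field numbers are illustrative.
        oneof type {
            UserLoginFailed login_failed = 1;
            SpecificApiRequestMade api_request = 2;
            // ....
        }
    }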

Right now we're turning these protos into JSON and storing them in MongoDB for easy queries. This way we can do things like "COUNT(*) GROUPED BY login_failed.username" and find accounts that are being targeted by bots, for example.
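
As a rough illustration of that kind of query, here is a minimal Python sketch that counts failed logins per username over events already decoded into JSON-like dicts. The event shape, the sample usernames, and the function name are made up for illustration; a real setup would run the equivalent aggregation inside MongoDB rather than in application code.

```python
from collections import Counter

# Hypothetical events, shaped like the JSON produced from the Event proto.
events = [
    {"login_failed": {"username": "alice"}},
    {"login_failed": {"username": "bob"}},
    {"api_request": {"path": "/v1/items"}},
    {"login_failed": {"username": "alice"}},
]


def count_failed_logins(events):
    """Count login_failed events grouped by username."""
    counts = Counter()
    for event in events:
        payload = event.get("login_failed")
        if payload is not None:
            counts[payload["username"]] += 1
    return counts


failed = count_failed_logins(events)
# Usernames with the most failures come first.
for username, n in failed.most_common():
    print(username, n)
```

Accounts near the top of that list are the ones worth checking for bot activity.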

PyxRu94 2156 days ago

Thanks for the feedback. I think we already support that for Go; here is an example: https://github.com/squzy/test_tracing

That example on the dashboard: https://demo.squzy.app/transactions/Y_jLC4hlwirv0PYqvSVG5

PyxRu94 2156 days ago

This means you can create custom transactions for specific cases, and you can group by them on this page: https://demo.squzy.app/applications/5eef71dcaac3ab3dc67a4ef3...

hanfsi 2156 days ago

Phew, that looks very crude.

The dashboard gives you a very poor overview. It's not even clear at a glance what you are looking at.

And then, as a kicker, it's not built on a time-series database.

You should have a look at how Prometheus does it.

bglusman 2156 days ago

Seems a little harsh! Prometheus is a big project that's been around a long time, and I'm no expert in it, but I don't think it's aspiring to do incident notification or APM, is it? I think it's just metrics. Maybe it would be more constructive to point out specific things that were unclear or confusing to you in the overview, and/or to suggest that they integrate with Prometheus for the things it's already excellent at and avoid reinventing them? Dunno, just my $0.02, but they're both Go projects and it looks decent at a glance to me, probably just using Material I think?

Anyway, I think open source stuff sometimes needs constructive criticism but should always be appreciated first as a contribution to the ecosystem even if you're not personally planning to use it.

cheald 2156 days ago

Prometheus has Alertmanager, which provides a framework for incident notification (we route incidents to Mattermost and PagerDuty, for example; PD ends up being our big incident response tool, which lets us cascade into a variety of "wake the sysadmin up" methods). It doesn't do APM, but it wouldn't be difficult to expose a Prometheus agent for your APM (just like you'd expose metrics for anything else you want to monitor).
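
To give a sense of how little is involved, here is a minimal Python sketch that renders a counter in the Prometheus text exposition format, which is what a scrape endpoint returns. The metric name, labels, and helper functions are hypothetical; in practice you would more likely use an official Prometheus client library than format this by hand.

```python
def escape_label_value(value):
    """Escape backslash, newline and double quote, as the text format requires."""
    return value.replace("\\", "\\\\").replace("\n", "\\n").replace('"', '\\"')


def render_counter(name, help_text, samples):
    """Render one counter metric; samples is a list of (labels dict, value)."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in samples:
        label_str = ",".join(
            f'{key}="{escape_label_value(val)}"'
            for key, val in sorted(labels.items())
        )
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical APM counter: transactions seen, by name and status.
body = render_counter(
    "apm_transactions_total",
    "Transactions observed by the APM agent.",
    [
        ({"name": "checkout", "status": "ok"}, 3),
        ({"name": "checkout", "status": "error"}, 1),
    ],
)
print(body)
```

Serving that string over HTTP at a path such as `/metrics` is enough for Prometheus to scrape it.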

I appreciate new tools, but I do think it's fair to ask what it does better than the existing tools. Prometheus' biggest problem is its learning curve, IMO, so there might be some gains to be made there, but after using it, I think the learning curve is a function of its architecture, which is a large part of what makes it so resilient. If it can be improved while maintaining (or improving on) resilience, awesome, but I personally know that I won't sleep well at night if my monitoring service isn't rock-solid.

PyxRu94 2156 days ago

Prometheus does not work with transactions; it is just a tool for storing metrics. I think we will have an integration for it too, but right now our plan is to improve the current system.

About the dashboard: we fully agree with you, but for that we need more experience in UI/UX and design. Also, we are not as big as, for example, Datadog. But we will be glad to improve it and to hear suggestions.

hanfsi 2156 days ago

The problem is that transactions cost resources and latency, and they are actually not that relevant.

Why? Because you are not building a tracer but a monitoring tool.

Your dashboard is very hard to read because it is basically just tables, and your tables do not give you any visual cues. I would highly recommend getting some icons from something like https://fontawesome.com/ and making it visually clear which screen you are on.

If you have any status text, like open/closed etc., give it an appropriate color, like green and red.

Your rules look like code, so why not give them a code theme? Like how GitHub's markdown sets code apart from the rest of the text.

Give buttons appropriate colors.

In your live view, make sure that certain text is aligned vertically.

Give your Memory graphs proper units. 500000000 is not helpful.
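
As a small sketch of what I mean, assuming the graph values are raw byte counts, something like this Python helper (the function name and the choice of binary units are my own) turns them into readable labels:

```python
def format_bytes(n):
    """Render a byte count with a binary-prefixed unit, e.g. 1536 -> '1.5 KiB'."""
    units = ["B", "KiB", "MiB", "GiB", "TiB"]
    value = float(n)
    for unit in units:
        # Stop at the first unit that keeps the number below 1024,
        # or at the largest unit we know about.
        if abs(value) < 1024 or unit == units[-1]:
            if unit == "B":
                return f"{int(value)} B"
            return f"{value:.1f} {unit}"
        value /= 1024


print(format_bytes(500000000))  # prints "476.8 MiB"
```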

I would highly recommend taking a little time to look at existing, well-working solutions so you can see what makes sense and what doesn't.

I would argue that not using something like Grafana for your frontend is a big missing feature.

thrownaway954 2156 days ago

i totally agree with you... it's very hard to discern what is going on at a glance. i have to say that i think this is more of a demonstration of the creator's talent than an actual product.

PyxRu94 2156 days ago

We plan to migrate to ClickHouse for collecting metrics.

More information is in this post: https://www.reddit.com/r/devops/comments/hwej0w/squzy_new_op...
