Show HN: Alaz: Open-Source, Self-Hosted, eBPF-Based K8s Monitoring (github.com)
101 points by fatihbaltaci 1020 days ago
Hello Everyone,

I'm excited to introduce a new open-source observability platform and would love to hear your feedback.

We are aware that there are lots of open-source/commercial tools out there. However, we believe that monitoring the clusters and extracting actionable insights requires deep know-how about the tools/domain. We mainly focused on this problem.

- Alaz is an eBPF agent installed on your K8s cluster as a DaemonSet. Thanks to eBPF, Alaz collects traces directly from the Linux kernel. This means there's no need for sidecars, instrumentation, or service restarts.

- The UI not only visualizes data but also provides actionable insights. Using the Service Map, you can:

  - View latencies and RPS between services.
  - Detect zombie services and underperforming SQL queries.
  - Monitor golden signals, such as 5xx status codes.

In addition, Alaz can collect system resource metrics such as CPU, memory, disk, and network through the Prometheus Node Exporter, which is embedded in the agent.
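
To make the Service Map signals listed above (latency, RPS, 5xx codes) more concrete, here is a minimal Python sketch of how per-edge numbers could be derived from observed requests. This is not Alaz's actual code: the `Request` record, the `edge_stats` function, and the sample service names are all hypothetical and only illustrate the aggregation.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Request:
    src: str           # calling service (hypothetical name)
    dst: str           # called service (hypothetical name)
    latency_ms: float  # observed request latency
    status: int        # HTTP status code


def edge_stats(requests, window_s):
    """Aggregate requests into per-edge RPS, mean latency and 5xx ratio."""
    grouped = defaultdict(list)
    for r in requests:
        grouped[(r.src, r.dst)].append(r)

    stats = {}
    for edge, reqs in grouped.items():
        n = len(reqs)
        stats[edge] = {
            "rps": n / window_s,
            "avg_latency_ms": sum(r.latency_ms for r in reqs) / n,
            "error_5xx_ratio": sum(1 for r in reqs if 500 <= r.status <= 599) / n,
        }
    return stats


# Three requests observed over an assumed 10-second window.
sample = [
    Request("frontend", "cart", 12.0, 200),
    Request("frontend", "cart", 18.0, 500),
    Request("cart", "postgres", 4.0, 200),
]
stats = edge_stats(sample, window_s=10)
```

Each key of `stats` is one edge of the map, for example `("frontend", "cart")`, so a UI could draw the edge and label it with these three values.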

Setting up is straightforward: just install Alaz as a DaemonSet, and the platform will handle the rest.

Finally, the combination of Alaz and Ddosify Performance Testing makes it possible to do load testing and simultaneously monitor the system to find bottlenecks instantly.

For those interested, check out Alaz on GitHub: https://github.com/ddosify/alaz

Your feedback would be greatly appreciated!

10 comments

First and foremost, congratulations on launching such an innovative project! Your unique approach seems to significantly simplify the process of observing and gaining insights into Kubernetes clusters.

The fact that Alaz can collect traces directly from Linux kernels without the need for sidecars, instrumentation, or service restarts is truly impressive. Additionally, having a UI that not only visualizes data but also provides actionable insights is a great advantage.

I'm eager to follow your product closely and can't wait to try it out when the opportunity arises. I've starred the project on GitHub and am excited about its progress. Best wishes for your continued success!

Thank you for your kind words. We greatly appreciate your support, and we're excited for you to try out Alaz. If you have any feedback or suggestions, please don't hesitate to share them with us. Your input will help us improve our product. For support or to share your feedback, please join our community Discord channel at https://discord.com/invite/9KdnrSUZQg.
I'll be honest with y'all: whenever someone shows up with add-on software that loads into the kernel, I get an anxiety attack as a platform engineer. This is a very delicate matter for both performance and security. I see ddosify-alaz leverages the Prometheus Node Exporter for metrics collection, which is awesome, and using eBPF for the service map is promising. But as I said, I would prefer the eBPF part to be configurable, so it can be left undeployed if SRE/platform teams don't want eBPF and only need the metrics exporter to feed into the Ddosify observability platform. Overall: great open-source software launch, please make it flexible to configure. Cheers!
Thank you for providing us with your valuable feedback. At the moment, the metrics and eBPF sections are configurable but not yet documented. However, we are working on adding this information to our documentation as soon as possible. If you wish to disable the eBPF component and use only metrics instead, you can add the environment variable 'EBPF_ENABLED=false' to your alaz.yaml file.
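
As a rough illustration of how an environment flag like `EBPF_ENABLED` typically gates a component, here is a small Python sketch. It is not taken from the Alaz source: the helper names, the default of "enabled", the accepted spellings of "false", and the assumption that metrics always stay on are all hypothetical.

```python
import os


def env_flag(name, default=True, environ=os.environ):
    """Interpret an environment variable as a boolean switch."""
    raw = environ.get(name)
    if raw is None:
        return default  # variable not set: fall back to the default
    return raw.strip().lower() not in {"false", "0", "no", "off"}


def enabled_collectors(environ=os.environ):
    """Return the collectors an agent would start under this sketch."""
    collectors = ["metrics"]  # assumption: metrics collection is always on here
    if env_flag("EBPF_ENABLED", default=True, environ=environ):
        collectors.append("ebpf")
    return collectors


# With the flag from the reply above, only metrics would start.
metrics_only = enabled_collectors({"EBPF_ENABLED": "false"})
# Without the flag, this sketch starts both collectors.
everything = enabled_collectors({})
```

Passing the environment in as a dictionary keeps the sketch easy to try without touching the real process environment.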

Alaz uses minimal resources for both network tracing and metrics collection. Its CPU and memory usage is capped by the resource limits set in the K8s configuration file.

Alaz is still in its early stages of development and it's being improved based on developer feedback. Thanks again for providing us with your valuable input.

The one similar product I have come across is Kubeshark (https://github.com/kubeshark/kubeshark). But admittedly the eBPF way seems more performant in theory (given you can afford to have a modern-enough kernel). I'm really excited to see how this project develops.

The eBPF mode of innovation is pretty exciting, truly a fresh lens for building software. I'm also following Akita Software, a company building eBPF-based monitoring.

Absolutely! Both Akita and Kubeshark have made significant contributions with their innovative approaches. However, Alaz - Ddosify stands out from the rest due to its effortless integration with the Ddosify Observability Platform (Cloud and Self-Hosted). This integration not only enables users to monitor their applications within the cluster but also to generate load tests for their applications that run in the cluster to spot glitches instantly. Moreover, the platform offers a unified dashboard that displays both the cluster and application performance metrics, service maps, and actionable insights (such as detecting zombie services), providing a comprehensive view of the system. Alaz is incredibly easy to set up by installing a DaemonSet into the cluster. We have many exciting developments planned for Kubernetes monitoring! With the help of our community, we are prioritizing the features on our roadmap. Thank you for your valuable input.
How is it different from Pixie?
To accurately assess the performance of your system, you need to generate data that can help identify bottlenecks. Usually this means running a third-party performance testing tool and correlating its results with monitoring data yourself. Alaz uses eBPF, like Pixie, to collect system data, which it then sends to the Ddosify Platform (cloud or self-hosted). In Ddosify, observability and performance testing are natively integrated, so you can see performance bottlenecks in real time without correlating the two data sets manually. It's all in one platform, and more features are coming soon based on community feedback.
> Detect zombie services and underperforming SQL queries

I know eBPF is the latest coolest thing on the block, but sometimes I just don't get it.

There have been database and system tools to detect zombie services and underperforming queries since time immemorial.

If people do not have the technical competence (or are, frankly, just too lazy) to make use of the tools already available to them, why is some random eBPF tool suddenly going to fix it?

> There have been database and system tools to detect zombie services and underperforming queries since time immemorial.

As you mentioned, "there have been ... tools". There is always a tool best suited for a specific job, but that doesn't mean you need all the capabilities of that tool all the time. Too many tools lead to huge fragmentation inside the organisation, which then has to support a multitude of tools.

So why not have something one layer below your stack (kernel level) which can do a good enough job of monitoring almost everything you have? It's not only about whether eBPF is cool; it just reduces the fragmentation by going a level below and extracting more info with a single suite of tools.

It would be good if you offered support or documentation for running on plain Linux, without K8s.
Thank you for your valuable feedback. We understand the importance of providing support for non-Kubernetes environments. We have already planned to make Alaz available for direct installation on Linux or Docker without a Kubernetes dependency. This feature is currently in development and will be released soon. Stay tuned for the updates!
Does the AGPL prevent anyone from running this on their own server for themselves (or not, since the source is provided by ddosify)?
No, it's a true FOSS license and not one of the scammy ones in the news of late, but some organizations straight up ban AGPL[1] because it is network viral in a way that hasn't been tested in the courts, so best to just avoid the hassle

I didn't mean my comment in a disparaging way, just trying to answer a FAQ on any such "Show HN" that uses the term "open source" but doesn't otherwise specify. Often that means it's a scam license, and I find that to be doubly true when the submitter craftily omits any mention of the license, but it is not the case here and I applaud the use of an actual FOSS license, even if that means I personally don't get to enjoy using this software

1: e.g. https://opensource.google/documentation/reference/using/agpl... (discussed: https://news.ycombinator.com/item?id=31176071 and likely a ton of others, too)

No, AGPL essentially just prevents someone from SaaS-ing your GPL project as a way around the license. It makes it so that if you can reach the code over the network, its source has to be available to you.
Does it generate the Service Map from eBPF traces too? Can you use the Service Map without k8s?
> Does it generate the Service Map from eBPF traces too?

Yes, Alaz generates the K8s service map from eBPF kernel tracepoints. If you're curious about the details and internals of the system, check the Alaz Architecture: https://github.com/ddosify/alaz/blob/master/Alaz-Architectur...

> Can you use the Service Map without k8s?

We have already planned to make Alaz available for direct installation on Linux or Docker without the need for K8s. We are developing this feature now and it will be released soon. Stay tuned for the updates!

Does it collect traces from compiled as well as scripted languages?
Currently, we don't collect traces from application code, but we plan to make this feature available soon. Stay tuned for the updates! Today, Alaz uses eBPF to collect network information from the kernel and correlates it with K8s services to generate a Service Map. In addition, Alaz uses the Prometheus Node Exporter to collect system resource metrics such as CPU, memory, disk, and network from K8s nodes. This data is then sent to the Ddosify Platform for visualization. Our next step is to collect code traces using eBPF. We understand that collecting code traces for scripted languages can be more challenging than for compiled languages, so we will use a hybrid approach combining eBPF and OTel (OpenTelemetry).
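
The correlation step described above (kernel-level network data matched against K8s services) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Alaz implements it: the function names, the IP addresses, the `"unknown"` placeholder, and the idea of flagging services that appear in no connection (one possible reading of the "zombie services" mentioned in the post) are all hypothetical.

```python
from collections import Counter


def build_service_map(connections, ip_to_service):
    """Count observed connections per (source service, destination service)."""
    edges = Counter()
    for src_ip, dst_ip in connections:
        # IPs that do not belong to a known service are grouped as "unknown".
        src = ip_to_service.get(src_ip, "unknown")
        dst = ip_to_service.get(dst_ip, "unknown")
        edges[(src, dst)] += 1
    return edges


def idle_services(edges, all_services):
    """Services that appear in no observed connection, as caller or callee."""
    seen = {name for edge in edges for name in edge}
    return sorted(set(all_services) - seen)


# Hypothetical pod/service IPs, as a cluster API might report them.
ip_to_service = {
    "10.0.0.1": "frontend",
    "10.0.0.2": "cart",
    "10.0.0.3": "legacy-batch",
}
# Hypothetical (source IP, destination IP) pairs seen at the kernel level.
connections = [
    ("10.0.0.1", "10.0.0.2"),
    ("10.0.0.1", "10.0.0.2"),
    ("10.0.0.2", "203.0.113.7"),
]

edges = build_service_map(connections, ip_to_service)
idle = idle_services(edges, ip_to_service.values())
```

Here `edges` holds the weighted graph a UI could render, and `idle` lists services that saw no traffic during the observation window.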
I am really curious: why build another project with features similar to Pixie (https://px.dev/), an existing open-source tool?
Pixie is great, but Alaz stands out due to its effortless integration with the Ddosify Observability Platform (Cloud and Self-Hosted), allowing native integration of K8s Observability and Performance Testing.