by sdesol 1882 days ago
I was analyzing the activity in the netdata project, and what I found interesting is that the project is less active than I would have thought. See the following for insights into the project:

https://public-001.gitsense.com/insights/github/repos?q=wind...

In the last 30 days, there were 2 frequent and 3 occasional contributors. I honestly thought the number of frequent contributors would be much higher, which leads me to believe the project is quite mature and they don't need a lot of people to work on netdata.

Based on Crunchbase, they've raised about 33 million so far, and if the number of people required to maintain netdata is low (relatively speaking that is), I can see them not really needing to worry about making money. I'm guessing they are finding value in gathering data for ML.


> they've raised about 33 million

yes, this is right

> if the number of people required to maintain netdata is low (relatively speaking that is)

The Netdata agent is a robust and mature product. We maintain it and we constantly improve it, but:

- most of our efforts go to Netdata.Cloud

- most of the action in the agent is in internal forks we have. For example, we are currently testing ML at the edge. This will eventually go into the agent, but is not there yet. Same with eBPF. We do a lot of work to streamline the process of providing the best eBPF experience out there.

> I can see them not really needing to worry about making money

We are going to make money on top of the free tier of Netdata.Cloud. We are currently building the free tier. In about a year from now we will start introducing new paid features to Netdata.Cloud. Whatever we have released by then will always be free.

> I'm guessing they are finding value in gathering data for ML

No, we are not gathering any data for any purpose. Our database is distributed. Your data are your data. We don't need them.

Hey, thanks for the insights. I figured effort was being spent elsewhere and/or was not visible in the public repo.

oh cool that's a nice tool.

p.s. i am the only person working on ML at Netdata and i can confirm we don't gather any data for ML purposes, which is actually my biggest challenge right now :) - convincing people the ML can be useful without having lots of nice labeled data from real netdata users to quantify that with typical metrics like accuracy etc. I'm hoping to introduce mainly unsupervised ML features into the product that don't rely on lots of labeled data. These would have thumbs up/down type feedback, which we can then use to figure out if new ML based features are working or being useful for users. So any models would be trained on the host and live on the host, as opposed to in Netdata Cloud somewhere.
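
To make the idea concrete, here is a minimal sketch of an unsupervised detector that learns from one host's own metric values and only tallies thumbs up/down feedback. This is not Netdata's implementation: the `LocalAnomalyDetector` class, the rolling z-score technique, and the `window` and `threshold` parameters are all assumptions picked for illustration.

```python
from collections import deque
from statistics import mean, pstdev


class LocalAnomalyDetector:
    """Hypothetical on-host detector: rolling z-score over a single metric."""

    def __init__(self, window=60, threshold=3.0, min_samples=10):
        self.window = deque(maxlen=window)  # recent samples, kept on the host only
        self.threshold = threshold          # z-score above which a sample is flagged
        self.min_samples = min_samples      # do not flag until enough history exists
        self.feedback = {"up": 0, "down": 0}

    def observe(self, value):
        """Return True if value looks anomalous relative to the recent window."""
        anomalous = False
        if len(self.window) >= self.min_samples:
            mu = mean(self.window)
            sigma = pstdev(self.window)
            if sigma > 0:
                anomalous = abs(value - mu) / sigma > self.threshold
            else:
                # Flat history: any change at all is unusual.
                anomalous = value != mu
        self.window.append(value)
        return anomalous

    def rate(self, thumbs_up):
        """Record a user's thumbs up/down on a flagged anomaly."""
        self.feedback["up" if thumbs_up else "down"] += 1

    def approval(self):
        """Share of thumbs-up ratings, or None if nobody has rated yet."""
        total = self.feedback["up"] + self.feedback["down"]
        return self.feedback["up"] / total if total else None


detector = LocalAnomalyDetector(window=30)
samples = [10.0, 10.5, 9.8, 10.2] * 5 + [55.0]  # made-up metric values
flags = [detector.observe(s) for s in samples]
detector.rate(thumbs_up=True)
print(flags[-1], detector.approval())
```

No labels are needed to flag the spike, and the approval ratio is the only signal about whether the feature is useful. Nothing in the sketch leaves the host.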

> i am the only person working on ML at Netdata and i can confirm we don't gather any data for ML purposes, which is actually my biggest challenge right now :)

Yeah, I would imagine that would be an issue. This is just my personal opinion, but I think there should be a way to provide anonymized data for building anomaly detection models. Maybe as an opt-in feature, since it would benefit everybody using netdata.