by wisebumblebee 1052 days ago
There's been a growing amount of research on data-centric AI, and there is now software dedicated to it. This one is fresh out in Neurocomputing, which is a Q1 journal.

In short, ydata-profiling is a Python tool that generates a detailed report about a dataset, including missing values, distributions, correlations, and data quality alerts.
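
For anyone who hasn't seen this kind of report, here is a rough, stdlib-only sketch of what a "data quality alert" can look like. To be clear, this is not how ydata-profiling implements it: the function name `profile_column`, the alert labels, and both thresholds are made up for illustration, and `None` stands in for a missing value.

```python
from collections import Counter


def profile_column(values, missing_threshold=0.2, imbalance_threshold=0.9):
    """Summarize one column and flag simple quality issues.

    Hypothetical helper for illustration only; thresholds are arbitrary.
    None is treated as a missing value.
    """
    total = len(values)
    present = [v for v in values if v is not None]
    missing_fraction = (total - len(present)) / total if total else 0.0

    alerts = []
    # Flag columns where too many entries are missing.
    if missing_fraction > missing_threshold:
        alerts.append("MISSING")

    # Flag columns dominated by a single value (e.g. an imbalanced label).
    top_share = 0.0
    if present:
        top_count = Counter(present).most_common(1)[0][1]
        top_share = top_count / len(present)
        if top_share > imbalance_threshold:
            alerts.append("IMBALANCE")

    return {
        "missing_fraction": missing_fraction,
        "top_share": top_share,
        "alerts": alerts,
    }


# Toy columns: one with missing entries, one with a heavily skewed label.
ages = [30, None, 41, None, 25]
labels = ["a"] * 19 + ["b"]

ages_report = profile_column(ages)
labels_report = profile_column(labels)
```

In this toy run, `ages` trips only the missing-data alert and `labels` trips only the imbalance alert. A real profiler covers far more (types, correlations, duplicates), but the idea of thresholded checks per column is the same.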

I work specifically in data quality (imbalanced and missing data), so I've been following the project for a while. I'm curious whether you make a point of really exploring your data's characteristics beforehand, and how seriously you take these alerts.

Do you think this shift towards a "data-centric" approach is really set to be the "next big paradigm" in AI? It's cool to see it valued, but idk...