by impresburger
404 days ago
Hey, one of the authors here! I completely agree with your comment. Training ML models on a clean dataset is the "easy" and fun part of an ML engineer's job. While we do think our approach might have some advantages compared to "2018-style" AutoML (more flexibility, easier to use, potentially more intelligent solution space exploration), we know it suffers from the issue you highlighted. For the time being, this is aimed primarily at engineers who don't have ML expertise: someone who understands the business context and knows how to build data processing pipelines and web services, but might not know how to build the models. Our next focus area is applying the same agentic approach to the "data exploration" and "feature ETL engineering" parts of the ML project lifecycle. Think a "data analyst agent" or "data engineering agent", with the ability to run and deploy feature processing jobs. I know it's a grand vision, and it won't happen overnight, but it's what we'd like to accomplish! Would love to hear your thoughts :)
I respect software engineers a lot. However, ANYONE who "doesn't know how to build models" also doesn't know what data leakage is or how to evaluate a model more deeply than simple metrics/loss, and can easily trick themselves into building a "great" model that ends up falling on its face in prod. So apologies if I'm highly skeptical of the admittedly very very cool thing you have built. I'd love to hear your thoughts.