by dweinus
407 days ago
I don't want to hate, what you built is really cool and should save time in a data scientist's workflow, but... we did this. It won't "automate most of the ML lifecycle." Back in ~2018 "autoML" was all the rage. It failed because creating boilerplate and training models are not the hard parts of ML. The hard parts are evaluating data quality, seeking out new data, designing features, making appropriate choices to prevent leakage, designing evaluation appropriate to the business problem, and knowing how this will all interact with the model design choices.
While we do think our approach might have some advantages compared to "2018-style" AutoML (more flexibility, easier to use, potentially more intelligent solution-space exploration), we know it suffers from the issue you highlighted. For the time being, this is aimed primarily at engineers who don't have ML expertise: someone who understands the business context, knows how to build data processing pipelines and web services, but might not know how to build the models.
Our next focus area is applying the same agentic approach to the "data exploration" and "feature ETL engineering" parts of the ML project lifecycle. Think a "data analyst agent" or "data engineering agent", with the ability to run and deploy feature processing jobs. I know it's a grand vision, and it won't happen overnight, but it's what we'd like to accomplish!
Would love to hear your thoughts :)