by disgruntledphd2
1193 days ago
To be fair, it's not as funny as automating data cleaning, on the principle that data scientists don't want to do it. And yeah, lots of people dislike it, but you can't build models without understanding the data, so even if automated data cleaning became possible (unlikely), you'd still need to spend a load of time working with the dataset before building anything useful.
Data cleaning requires a lot of judgement and domain knowledge. Imagine if an AI did clean your dataset. Are you just going to trust it? (Hell no!) Or are you going to spend ages trying to work out what it did? That doesn't seem like much of an improvement.
I write data cleaning/ETL software, and I'm confident that the need for my product is going to go up between now and when I retire.