|
|
|
|
|
by paddy_m
588 days ago
|
|
Nice work! Do you have any plans for data cleaning? I am working on a somewhat similar open source project. I intend to add heuristic data cleaning. With the UI I want to be able to toggle between different strategies quickly - strip characters from a column to treat it as numeric, if less than 2% or 5% of values have a character, fill na with mean, interpret dates in different formats - drop if the date doesn't parse. The idea bing that if it's really quick to change between different strategies, you can create more opinionated strategies to get to the right answer faster. Happy to collaborate and talk tables with anyone who's interested. |
|
> With the UI I want to be able to toggle between different strategies quickly - strip characters from a column to treat it as numeric, if less than 2% or 5% of values have a character, fill na with mean, interpret dates in different formats - drop if the date doesn't parse
Those are really good examples and I can make those predefined preproccesing techniques available as toggles in the dataset tab. Thanks for the feedback!