Hacker News
by dynamicwebpaige 3188 days ago
Getting data into a reasonable format is (unfortunately!) a huge part of the machine learning process; that was the motivation for tools like pandas (in Python) and dplyr (for R). Joseph is describing a machine-learning-enabled way to automate some of that data cleaning, which is pretty cool.

Check out the PROSE SDK (including an interactive playground) here. I particularly like its ability to extract JSON to something resembling a dataframe: https://microsoft.github.io/prose/documentation/extraction-j...
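
For anyone who hasn't done the JSON-to-table step by hand, here is a rough sketch of what it usually looks like in plain Python. To be clear, this is not the PROSE API and not how PROSE works (it learns the extraction from examples). The sample data and the `flatten` and `to_table` helpers are hypothetical names I made up for illustration; they only show the kind of hand-written flattening that such a tool is meant to save you from.

```python
import json

# Hypothetical sample input, invented for this sketch (not from the PROSE docs).
RAW = """
[
  {"id": 1, "name": "Ada", "address": {"city": "London", "zip": "N1"}},
  {"id": 2, "name": "Grace", "address": {"city": "Arlington"}}
]
"""


def flatten(record, prefix=""):
    """Flatten nested dicts into a single level, joining keys with dots."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested objects, e.g. address -> address.city
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def to_table(records):
    """Return (columns, rows); cells missing from a record become None."""
    flat_records = [flatten(r) for r in records]
    columns = []
    for rec in flat_records:
        for key in rec:
            if key not in columns:  # keep first-seen column order
                columns.append(key)
    rows = [[rec.get(col) for col in columns] for rec in flat_records]
    return columns, rows


columns, rows = to_table(json.loads(RAW))
print(columns)
for row in rows:
    print(row)
```

Even on this toy input you have to decide how to name nested keys and what to do with missing fields, and real data adds arrays, inconsistent types, and so on. That per-dataset fiddling is the part worth automating.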