by aarondia 1214 days ago
It's not an AI-based approach, but it is a step up from writing code by hand: you could try the open source Mito (https://www.trymito.io; full disclosure, I built it) to do some of this messy data wrangling. Mito lets you view and manipulate your data in a spreadsheet in Jupyter, and it generates the equivalent Python code for each edit. For things like spotting that the data uses both '&' and 'and', viewing your data in a spreadsheet is much easier than just writing code.
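To make the '&' vs 'and' case concrete, here is a rough hand-written sketch of the kind of cleanup step you would end up with once you have spotted the inconsistency. It is not Mito's actual output; the function name and the sample artist names are made up for illustration, and it would also rewrite names like "AT&T", so treat it as a starting point only.

```python
import re


def normalize_ampersand(name: str) -> str:
    """Replace '&' with 'and' and collapse any extra whitespace."""
    # Hypothetical helper: swaps every '&' (with or without spaces around it).
    replaced = re.sub(r"\s*&\s*", " and ", name)
    return re.sub(r"\s+", " ", replaced).strip()


# Made-up sample values showing the two spellings side by side.
artists = ["Simon & Garfunkel", "Simon and Garfunkel", "Hall&Oates"]

# After normalizing, the two spellings of the same artist collapse into one.
cleaned = sorted({normalize_ampersand(a) for a in artists})
print(cleaned)
```

The point is that the duplicates only become obvious once you look at the values, which is where the spreadsheet view helps.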

Once you generate the code, you could copy it into your pipeline, so that you pull the data from the last.fm API, preprocess it with the Python code that Mito generated, and then dump it into LiteDB.
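Roughly, the pipeline shape could look like the sketch below. Everything here is an assumption for illustration: `fake_fetch` and `fake_store` are hypothetical stand-ins for the real last.fm request and the LiteDB write, and `clean_records` is a placeholder for wherever you paste the generated preprocessing code.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]


def clean_records(records: List[Record]) -> List[Record]:
    # Placeholder for the pasted preprocessing step (hypothetical logic).
    return [{**r, "artist": r["artist"].replace(" & ", " and ")} for r in records]


def run_pipeline(
    fetch: Callable[[], List[Record]],
    clean: Callable[[List[Record]], List[Record]],
    store: Callable[[List[Record]], None],
) -> int:
    """Fetch raw records, clean them, store them, and return how many were stored."""
    raw = fetch()
    cleaned = clean(raw)
    store(cleaned)
    return len(cleaned)


def fake_fetch() -> List[Record]:
    # Stand-in for the API call; returns made-up sample data.
    return [{"artist": "Simon & Garfunkel", "track": "America"}]


stored: List[Record] = []


def fake_store(records: List[Record]) -> None:
    # Stand-in for the database write; just keeps records in memory.
    stored.extend(records)


count = run_pipeline(fake_fetch, clean_records, fake_store)
print(count, stored)
```

Passing the fetch and store steps in as functions means you can test the cleanup code on sample data before wiring up the real API and database.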