by zenincognito 980 days ago
Or you can use OpenRefine (formerly Google Refine).

I'm currently mangling a 4 GB file and working with APIs that use existing data columns to provide output. It's a great tool.

1 comments

How does it work for much larger files?
I have found it to be the most successful among a horde of other tools I tried. I have had no problems with files as big as 8-10 GB, since I can allocate more memory to the program as I see fit.

Honestly, being able to use GREL, Clojure, or Python inside it to clean up and mangle data makes it the Swiss Army knife of data segmentation and cleanup.
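
For anyone wondering what that kind of cleanup looks like, here is a minimal sketch in plain Python. The function name clean_cell and the sample input are made up for illustration. Inside OpenRefine's Python (Jython) expression editor you would not define a function; the cell is exposed as `value` and the expression body returns the result, so treat this as a standalone approximation rather than something to paste in as-is.

```python
import re


def clean_cell(value):
    """Normalize one cell: collapse runs of whitespace, trim, title-case.

    `clean_cell` is a hypothetical helper used only for this sketch.
    """
    if value is None:
        # Leave blank cells untouched.
        return None
    # Replace any run of whitespace (tabs, newlines, multiple spaces) with one space.
    collapsed = re.sub(r"\s+", " ", value)
    return collapsed.strip().title()


if __name__ == "__main__":
    print(clean_cell("  new \t york  "))
```

The same idea carries over to a GREL or Clojure expression; which one to pick is mostly a matter of what you are comfortable writing.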

Nice, thanks for sharing!