by hhs19832 2870 days ago
Many posts here are focused on classic ETL. I'm working on a small project for handling data just after ETL.

It's for dealing with annoyingly large data: bigger than RAM, but still sitting on a personal PC. It performs sampling and munging on that data. As far as I can tell there's no good solution for this right now (I've been looking for more than a year).
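
To give a rough idea of what sampling larger-than-RAM data can look like (this is only an illustrative sketch, not the project's actual code), reservoir sampling draws a uniform random sample of `k` rows in a single pass while holding only `k` rows in memory. The function name `reservoir_sample` and its parameters are made up for this example.

```python
import random
from typing import Iterable, List, Optional


def reservoir_sample(lines: Iterable[str], k: int, seed: Optional[int] = None) -> List[str]:
    """Return a uniform random sample of up to k items from a stream.

    Hypothetical helper for illustration: only k items are ever kept in
    memory, so the input can be far larger than RAM.
    """
    rng = random.Random(seed)
    sample: List[str] = []
    for i, line in enumerate(lines):
        if i < k:
            # Fill the reservoir with the first k items.
            sample.append(line)
        else:
            # Pick a slot in [0, i]; the new item replaces an old one
            # only if the slot falls inside the reservoir.
            j = rng.randint(0, i)
            if j < k:
                sample[j] = line
    return sample
```

Since an open file object is an iterator over its lines, something like `reservoir_sample(open_file, 1000)` would sample 1000 lines from a text file without loading it whole.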

What might interest you is that the project has little abstraction, yet it's non-trivial to execute. To me, that makes it fun. Despite its simplicity, it has high utility and could be used by others, which would be a great outcome for an initial project.

I've got a working version of it, but it would benefit from the eye of a seasoned Python dev.

If this sounds interesting, feel free to get in touch. My email is: mccain.alex@yandex.com

Cheers,