Hacker News
by jarpineh 4242 days ago
Wow, thank you, again. I definitely have to take a look at this. My use cases tend to vary so much that creating a Hadoop-like system would require too much custom coding.

I wonder if it is possible to have compression and de-duplication, so that there could be one big base dataset and lots of containers that each add only the new data they generate.

Anyhow, looking at this it feels really approachable. What I have in mind are quick-and-dirty data-sciency scripts for ad hoc use cases, like diffing structured files and combing over matrix data.
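
To make the "diffing structured files" case concrete, here is a rough sketch of the kind of throwaway script I mean. It assumes the files have already been parsed into nested dicts and lists (for example with json.load); the names flatten and diff_structured are made up for illustration and do not come from any particular tool.

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts/lists into a flat {path: leaf_value} mapping.

    Note: empty dicts and empty lists produce no entries in this sketch.
    """
    items = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            items.update(flatten(value, f"{prefix}/{key}"))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            items.update(flatten(value, f"{prefix}/{index}"))
    else:
        items[prefix] = obj
    return items


def diff_structured(old, new):
    """Return (added, removed, changed) between two parsed documents."""
    a, b = flatten(old), flatten(new)
    added = {path: b[path] for path in b.keys() - a.keys()}
    removed = {path: a[path] for path in a.keys() - b.keys()}
    changed = {
        path: (a[path], b[path])
        for path in a.keys() & b.keys()
        if a[path] != b[path]
    }
    return added, removed, changed


if __name__ == "__main__":
    # Hypothetical sample data, just to show the shape of the result.
    old = {"name": "run1", "params": {"lr": 0.1, "layers": [1, 2]}}
    new = {"name": "run1", "params": {"lr": 0.2, "layers": [1, 2, 3]}, "tag": "x"}
    added, removed, changed = diff_structured(old, new)
    print("added:", added)
    print("removed:", removed)
    print("changed:", changed)
```

Flattening to path keys keeps the comparison to plain set operations on dict keys, which is about as much structure as a quick ad hoc script needs.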