by jarpineh
4242 days ago
Wow, thank you, again. I definitely have to take a look at this. My use cases vary so much that creating a Hadoop-like system would require too much custom coding. I wonder if it is possible to have compression and de-duplication, so that there could be one big base dataset and lots of containers that each add only the new data they generate. Anyhow, looking at this, it feels really approachable. What I have in mind are quick-and-dirty data-sciency scripts for ad hoc use cases, like diffing structured files and combing over matrix data.