by brynb 2643 days ago
We’re building something along these lines at Axon (http://axon.science). Sign up for our beta if you’re interested in checking it out, and we should be able to get you set up in the next few days (we’re just starting to roll things out to the public this week).

The basic idea is distributed version control, like git, but over p2p swarms rather than clusters around “central” repositories. We have special handling for large datasets (but still using git) to improve transfer efficiency and diffing.

There’s a UI layer for collaboration (discussion, PRs, review) that supports deep linking to and embedding of files at specific commits, which sounds a bit like what you’re looking for.

Feedback is very much appreciated!

1 comment

That looks very interesting, particularly the UI layer for collaboration. Your website says it supports “massive data sets”, but I would spell out what you mean, since dataset sizes in different fields vary by several orders of magnitude. (Massive for me starts at TBs and goes to petabytes.)

One of the issues for me is file-based versioning, which then requires the means to parse the format. A number of ventures and organizations (e.g., Neurodata Without Borders) address versioning of the entire ecosystem necessary to correctly use the underlying data files, so I’m not sure whether that’s an explicit part of your ecosystem. Most importantly, is your stack going to be open source?