Hacker News
by enigmo 4324 days ago
Have you considered using a pre-existing distributed filesystem like Ceph or even HDFS instead?

I did consider both. HDFS, from what I have read, is designed for storing large files; my files are only about 10 KB each.

I am aware of Ceph but have not tried installing it to see how easy or hard it is to set up. Also, although this is not a hard requirement, I'd like to be able to support Windows; from what I have read so far, Ceph does not support Windows.

Ceph's filesystem interface is built on a lower-level object store (RADOS), which is available through the librados [1] API. That might be a possible solution if librados builds for (or can be ported to) Cygwin / Windows and all you need is a client on Windows.

This API is more S3-like, operating directly on "objects" in the Ceph cluster. I wrote a system for storing many small (10-20 kB) files using librados and Ceph, but the performance wasn't as good as I had hoped. Possibly I did not configure the Ceph cluster optimally; the cluster setup is quite complex.
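
For anyone curious what "operating directly on objects" looks like, here is a rough sketch using Ceph's Python binding for librados (the `rados` module). It is not the system I described above, just an illustration. The pool name `smallfiles`, the config path, and the helper names `store_small_files` / `load_small_file` are all made up for the example, and it assumes a reachable cluster with that pool already created.

```python
def store_small_files(ioctx, files):
    """Write each (name, bytes) pair as one whole RADOS object."""
    for name, data in files.items():
        # write_full replaces the object's entire contents in one call.
        ioctx.write_full(name, data)


def load_small_file(ioctx, name, max_size=64 * 1024):
    """Read an object back. Ioctx.read defaults to 8192 bytes, which
    would truncate a 10-20 kB file, so pass an explicit upper bound."""
    return ioctx.read(name, max_size)


def main():
    # Imported here so the helpers above can be exercised without Ceph installed.
    import rados

    # Assumed config location; adjust for your installation.
    cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
    cluster.connect()
    try:
        # "smallfiles" is a hypothetical pool name.
        ioctx = cluster.open_ioctx("smallfiles")
        try:
            store_small_files(ioctx, {"doc-0001": b"x" * 15000})
            print(len(load_small_file(ioctx, "doc-0001")))
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()
```

Calling `main()` on a machine with a working Ceph client config should round-trip one object. Each file maps to exactly one object keyed by name, so there is no directory or metadata layer involved, which is the main difference from going through the filesystem interface.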

[1] http://ceph.com/docs/master/rados/api/librados-intro/