Nice. But it seems some easy improvements are possible: Do you really need to call get_blob(), downloading the entire blob, when a file is opened?
Similarly, in read() you always download the entire blob (again). It seems that get_blob() comes with an x_ms_range parameter that would allow you to specify the byte range you actually want.
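To make the suggestion concrete, here is a rough sketch of a ranged read. It assumes get_blob() accepts an x_ms_range keyword taking a "bytes=start-end" string with an inclusive end offset. FakeBlobService, byte_range() and ranged_read() are made-up names for illustration only, not part of AzureFS or the Azure SDK.

```python
def byte_range(offset, size):
    """Build a 'bytes=start-end' range string (end is inclusive)."""
    if offset < 0 or size <= 0:
        raise ValueError("offset must be >= 0 and size must be > 0")
    return "bytes=%d-%d" % (offset, offset + size - 1)


class FakeBlobService(object):
    """Hypothetical in-memory stand-in for the real blob client."""

    def __init__(self, data):
        self._data = data

    def get_blob(self, container, name, x_ms_range=None):
        # Assumption: without a range the whole blob comes back,
        # with a range only the requested bytes do.
        if x_ms_range is None:
            return self._data
        start, end = x_ms_range[len("bytes="):].split("-")
        return self._data[int(start):int(end) + 1]


def ranged_read(service, container, name, size, offset):
    """What a FUSE read(size, offset) could do instead of a full download."""
    return service.get_blob(container, name,
                            x_ms_range=byte_range(offset, size))


if __name__ == "__main__":
    service = FakeBlobService(b"hello world")
    print(ranged_read(service, "logs", "today.log", 5, 6))
```

With something like this, read() only transfers the bytes the kernel asked for, and open() would not need to fetch the blob at all.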
The problem with such remote storage APIs and FUSE is that you have to sacrifice either consistency or speed. I once tried out an S3 wrapper that cached some filesystem metadata, and it ended up being inconsistent.
I didn't really understand the use case for this until the last paragraph. Might help others:
...imagine there’s a case where you would like to upload your daily log files to the cloud with a cronjob, then you can mount AzureFS upon startup and let your cronjob just copy the files to the cloud very easily.
One of my recent weekend projects was implementing a simple command-line interface to the Azure blob storage in Haskell. It's not quite finished, but may still be useful to some. Check it out at https://github.com/ArnoVanLumig/azurify
I think I'm right in suggesting it's free with a BizSpark licence, which is a kind of free licence available to startups.
Whilst the platform (Azure) is clearly proprietary, they're making real efforts to offer cross-platform IaaS and cross-platform (open source) client capabilities for their PaaS offerings.
It may be a crock of shite, it may be the best thing since sliced bread, but I for one feel that that particular team deserves credit for trying (hard) to adopt open source and make themselves an attractive proposition outside Microsoft's traditional core strengths. Time will tell if they make a success of this...
There have been solutions that can mount S3 as a file system for years. I know it is more widespread, but I work for Microsoft Azure and I provided a solution that didn't exist for Azure.
Could you point to the outages you're referring to regarding S3 reliability? I've been using it for the last 4 years and haven't had a problem.
My experience is also backed up by Nasuni's Cloud Storage Benchmark from December, which reports that AWS has 1.4 outages a month (compared to Azure's 11.1 a month) and concludes that AWS was faster than Azure (AWS and Azure being the top performers).
I coded it and tested it a little bit. Not all features work and I don't recommend using it. OS X issues somewhat different filesystem calls that I should have handled but didn't.