by ragulpr
583 days ago
Love this idea! The biggest hurdle, though, has been getting predictable auth and I/O across multiple Python/Scala versions and everything else (Spark, orchestrators, CLIs, teams on various OSes, etc.), plus access logs. With S3Fs/boto/botocore versions x Scala/Spark x Parquet x Iceberg x k8s etc., each reader's own assumptions make reading from S3 alone a maintenance and compatibility nightmare. Will the mounted system _really_ be accessible as local fs and seen as such to all running processes? No surprises? No need for python specific filesystem like S3Fs? If so then you will win 100%. I wouldn't even care about speed/cost if it's up to par with S3.
> Will the mounted system _really_ be accessible as local fs and seen as such to all running processes? No surprises? No need for python specific filesystem like S3Fs?
Ha, well, it depends on what you mean by surprises. We won't have a Python-specific file system. Our client is going to come in two flavors. Today, you can mount Regatta over NFSv3 (which we wrap in TLS to make it secure). This works for some workloads, but doesn't provide like-for-like performance with EBS. Over the next month, we plan to release the "custom protocol" that I wrote about above, which we expect to ship to customers in the form of a FUSE file system.
Either way, it should be one package, you shouldn't need to worry about versioning, and it will appear as a real, local file system. :D
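
To make "appears as a real, local file system" concrete, here is a minimal sketch of what it would mean for Python code: only the standard library is used, with no S3Fs or boto. The mount path `/mnt/regatta` and the helper `roundtrip` are hypothetical names chosen for illustration; the demo runs against a temporary directory as a stand-in, since the real mount point depends on your setup.

```python
import tempfile
from pathlib import Path


def roundtrip(mount_point: str, payload: str = "hello") -> str:
    """Write, read back, and delete a file using only ordinary file APIs.

    `mount_point` is assumed to be any directory the OS exposes as a
    local path (for example a hypothetical NFS or FUSE mount such as
    /mnt/regatta). Nothing here is specific to a storage backend.
    """
    path = Path(mount_point) / "roundtrip-check.txt"
    path.write_text(payload)   # plain write through the kernel VFS
    data = path.read_text()    # plain read
    path.unlink()              # clean up the probe file
    return data


if __name__ == "__main__":
    # Stand-in for the real mount point; swap in your own path.
    with tempfile.TemporaryDirectory() as stand_in:
        print(roundtrip(stand_in))
```

If a mount behaves like a local file system, the same function should work unchanged when given the mount path, and the same holds for any other process or language on the machine that goes through ordinary file calls.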