by BrandonY
1010 days ago
Hi, Brandon from GCS here. If you need all of the guarantees of a real POSIX filesystem, you want fast top-level directory listing for 100MM+ nested files, and POSIX permissions/owner/group and other file metadata are important to you, Gcsfuse is probably not what you're after. You might want something more like Filestore: https://cloud.google.com/filestore

We've got some additional documentation on the differences between Gcsfuse and a proper POSIX filesystem, and on Gcsfuse's limitations: https://cloud.google.com/storage/docs/gcs-fuse#expandable-1

Gcsfuse is a great way to mount Cloud Storage buckets and view them as if they were in a filesystem. It scales quite well for all sorts of uses. However, Cloud Storage itself is a flat namespace with no built-in directory support. Listing the few top-level directories of a bucket with 100MM files more or less requires scanning over your entire list of objects, which means it's not going to be very fast. Listing objects in a leaf directory will be much faster, though.
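To make the flat-namespace point concrete, here is a minimal toy model in Python. It is only a sketch under stated assumptions, not how Cloud Storage or Gcsfuse is actually implemented: the "bucket" is a sorted in-memory list of object names, `list_dir` is a hypothetical helper, and "directories" are inferred by splitting names on a `/` delimiter. It counts how many names each listing has to look at.

```python
from bisect import bisect_left


def list_dir(sorted_names, prefix="", delimiter="/"):
    """Emulate one directory level over a flat, sorted list of object names.

    Returns (objects, subdirs, scanned), where `scanned` is the number of
    names this naive scan had to examine. Toy model only.
    """
    objects, subdirs, scanned = [], set(), 0
    # Names sharing a prefix are contiguous in sorted order, so jump to the
    # first candidate and stop at the first name that no longer matches.
    i = bisect_left(sorted_names, prefix)
    while i < len(sorted_names) and sorted_names[i].startswith(prefix):
        scanned += 1
        rest = sorted_names[i][len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # Something deeper in the tree: report only the "subdirectory".
            subdirs.add(prefix + head + delimiter)
        else:
            objects.append(sorted_names[i])
        i += 1
    return objects, sorted(subdirs), scanned


# Hypothetical bucket: 3 top-level "directories", 3000 objects in total.
names = sorted(
    f"d{i % 3}/sub{j}/file{i}_{j}.bin" for i in range(300) for j in range(10)
)

top_objects, top_dirs, top_scanned = list_dir(names)
leaf_objects, leaf_dirs, leaf_scanned = list_dir(names, prefix="d0/sub0/")

print(top_dirs, top_scanned)          # 3 directories, but every name visited
print(len(leaf_objects), leaf_scanned)  # only the leaf's own objects visited
```

In this toy model the top-level listing walks all 3000 names to report three directories, while the leaf listing touches only the 100 names under its prefix. A real service may be able to skip ahead in ways this sketch does not, so treat the counts as an illustration of the shape of the problem rather than a performance claim.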
Our theoretical use case is 10+ PB and we need multiple TB/s of read throughput (maybe a fraction of that for writing). So I don’t think Filestore fits this scale, right?
As for the directory traversals, I guess caching might help here? Top level changes aren’t as frequent as leaf additions.
That being said, I don’t see any (caching) proxy support anywhere other than the Google CDN.
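For what it's worth, the caching idea could look roughly like the sketch below: a small time-to-live (TTL) cache in front of whatever performs the slow listing. Everything here is hypothetical (`ListingCache`, `slow_list`, the 60-second TTL, the injected clock); it is not a Gcsfuse or Cloud Storage feature, just an illustration of why infrequent top-level changes make a stale-tolerant cache attractive.

```python
import time


class ListingCache:
    """Cache directory listings per prefix for a fixed TTL (toy sketch)."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch        # callable: prefix -> listing (the slow path)
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so the demo can fake time
        self._entries = {}         # prefix -> (stored_at, listing)

    def list(self, prefix):
        now = self._clock()
        hit = self._entries.get(prefix)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]          # still fresh: skip the slow listing
        listing = self._fetch(prefix)
        self._entries[prefix] = (now, listing)
        return listing


# Demo with a fake backend and a fake clock (both are assumptions).
calls = []


def slow_list(prefix):
    calls.append(prefix)           # record every trip to the "backend"
    return ["d0/", "d1/", "d2/"]


fake_now = [0.0]
cache = ListingCache(slow_list, ttl_seconds=60, clock=lambda: fake_now[0])

first = cache.list("")             # miss: goes to the backend
second = cache.list("")            # hit: served from the cache
fake_now[0] = 61.0
third = cache.list("")             # entry expired: goes to the backend again

print(len(calls))
```

The trade-off is staleness: a new top-level directory stays invisible until its cached entry expires, which is tolerable only if, as described above, top-level changes are rare compared to leaf additions.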