| I've never really been able to figure out a good strategy for organizing object storage. Do you create a bucket per application instance? Per user? Per organization? Right now I'm playing with a new service and came up with this, which is probably over-engineered. Here is an actual object key, including the bucket:
7dcdb229600e4467a2714866e0d406f6/85/26c/c0271374067b5db832adb7909a7/bbda55db15266f7ce2284d8f5f66fc85e495e2b12265ef87537237ad5e2658b24c081970332417f60e5fc352ae9b8c1031398c02ecde03eb29af2d3c8eda8a4b/y18.gif
Given the file's uuid is aabbbcccccccccccccccccccccccccccccc, for original images:
{{organizations_uuid as bucket}}/aa/bbb/cccccccccccccccccccccccccccccc/{{sha512sum}}/{{originalfilename}}
And for all derivatives of it:
{{organizations_uuid as bucket}}/aa/bbb/cccccccccccccccccccccccccccccc/derived/{{this files uuid}}_{{filename}}
My thinking was that:
- Using the organization's uuid (an organization can have multiple users) as the bucket makes per-organization backups and on-prem deployments easier.
- Encoding the file's uuid in the object name makes it easy to identify, and splitting that uuid up as 2/3/rest would help spread the objects.
- Encoding the file's sha512sum in the key name would enable checking the file's integrity even without a database.
- Putting all derived files under derived, but with the original file's uuid prefix, makes the link between them clear.
I know this results in long object names, as the actual example above shows, but they do carry quite a lot of information.
Which parts of this are considered bad practice? Do you have any real-world examples of other strategies? They seem hard to come by. |
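For reference, the key scheme from the question could be built roughly like this. It is only a sketch: the function names are made up for illustration, the bucket (the organization's uuid) is left out of the key, and it assumes uuids are given as 32 hex characters (dashes are tolerated).

```python
import hashlib
import uuid


def split_uuid(file_uuid: str) -> str:
    """Turn a uuid into the 2/3/rest prefix, e.g. aa/bbb/ccc..."""
    hex_id = uuid.UUID(file_uuid).hex  # validates and normalizes to 32 hex chars
    return f"{hex_id[:2]}/{hex_id[2:5]}/{hex_id[5:]}"


def original_key(file_uuid: str, content: bytes, filename: str) -> str:
    """Key for an original: aa/bbb/rest/{sha512sum}/{originalfilename}."""
    digest = hashlib.sha512(content).hexdigest()
    return f"{split_uuid(file_uuid)}/{digest}/{filename}"


def derived_key(original_uuid: str, derived_uuid: str, filename: str) -> str:
    """Key for a derivative: aa/bbb/rest/derived/{derived uuid}_{filename}."""
    derived_hex = uuid.UUID(derived_uuid).hex
    return f"{split_uuid(original_uuid)}/derived/{derived_hex}_{filename}"


def verify_original(key: str, content: bytes) -> bool:
    """Check content against the sha512 embedded in an original's key."""
    # Limit the split so a filename containing "/" stays in the last part.
    parts = key.split("/", 4)
    if len(parts) != 5:
        return False
    return hashlib.sha512(content).hexdigest() == parts[3]
```

The integrity check needs nothing but the key and the downloaded bytes, which is the "even without a database" property described above.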
Trivial example following your org/user pattern:
my_app/profile_images/org_id/user_id/aabbccccccc.jpeg
I then obviously have a reference to that file in my db.
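A minimal sketch of that pattern, assuming Python and SQLite purely for illustration. The table, column, and function names are hypothetical, not from any real project:

```python
import sqlite3


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database with a (hypothetical) table mapping users to object keys."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS profile_images ("
        "user_id TEXT PRIMARY KEY, object_key TEXT NOT NULL)"
    )
    return conn


def profile_image_key(org_id: str, user_id: str, file_id: str, ext: str = "jpeg") -> str:
    """Build a key of the form my_app/profile_images/org_id/user_id/file_id.ext."""
    return f"my_app/profile_images/{org_id}/{user_id}/{file_id}.{ext}"


def save_reference(conn: sqlite3.Connection, user_id: str, key: str) -> None:
    """Store (or overwrite) the object key for a user's profile image."""
    conn.execute(
        "INSERT OR REPLACE INTO profile_images (user_id, object_key) VALUES (?, ?)",
        (user_id, key),
    )
    conn.commit()


def lookup_key(conn: sqlite3.Connection, user_id: str):
    """Return the stored object key for a user, or None if there is none."""
    row = conn.execute(
        "SELECT object_key FROM profile_images WHERE user_id = ?", (user_id,)
    ).fetchone()
    return row[0] if row else None
```

The key stays short and readable because everything else about the file (checksum, original filename, derivatives) lives in the database row rather than in the object name.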