by sagiba
45 days ago
Curious what your use cases look like. If you're storing data where you always know what's there, who created it, and whether it's still in use without needing to query for it, that's actually a great place to be. The post is about the much messier middle ground most teams I've talked to are in.

1. Invoice PDFs. Individually small, but there are a hundred million of them. Deleted after 10 years or when the tenant deletes their account.
2. Reports and exports. Few files, but potentially big ones. If an export logically consists of multiple files, it's stored as a zip file. They live for 30 days or until the tenant deletes their account.
3. Streaming database exports using AWS Database Migration Service, for replication into Snowflake.
Every file has an entry in the database tracking its storage location and status.
Grouping them by tenant, (sub)type, or time interval makes sense for these. But "dataset" isn't an applicable concept.
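
To make the "entry in the database" idea concrete, here is a rough sketch of what such a record and its retention check could look like. The retention periods are the ones listed above; everything else (the `FileRecord` fields, the kind names, the `is_due_for_deletion` helper, the example bucket path) is made up for illustration and is not our actual schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional, Set

# Hypothetical retention table. The durations come from the list above;
# the kind names are invented. Ten years is approximated as 3653 days.
RETENTION: dict[str, Optional[timedelta]] = {
    "invoice_pdf": timedelta(days=3653),
    "export": timedelta(days=30),
    "dms_stream": None,  # no age-based retention was stated for these
}


@dataclass
class FileRecord:
    """One tracking row per stored file (assumed shape)."""
    tenant_id: str
    kind: str          # key into RETENTION
    location: str      # e.g. an object-store URI
    status: str        # e.g. "stored" or "deleted"
    created_at: datetime


def is_due_for_deletion(rec: FileRecord, now: datetime,
                        deleted_tenants: Set[str]) -> bool:
    """True if the tenant is gone or the file has outlived its retention."""
    if rec.tenant_id in deleted_tenants:
        return True
    ttl = RETENTION[rec.kind]
    return ttl is not None and now - rec.created_at >= ttl


if __name__ == "__main__":
    rec = FileRecord("tenant-1", "export",
                     "s3://example-bucket/exports/a.zip",
                     "stored", datetime(2023, 11, 1))
    print(is_due_for_deletion(rec, datetime(2024, 1, 1), set()))
```

With rows like this, grouping by tenant, kind, or creation interval is a plain query over the table, which is why a separate "dataset" abstraction doesn't add much for these cases.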