but now transactional guarantees only extend to the id stored in the DB, and not on the external storage.
Therefore, it's possible that the id is invalid (for the external storage) when referenced in the future. I think doing so only adds complexity as the system grows.
It would be better to chunk your blob data to fit the DB, imho. It beats introducing external blob storage in the long run.
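For illustration, here is a minimal sketch of what "chunk your blob data to fit the DB" could look like, so that the whole blob is written or replaced inside one transaction. It assumes SQLite via Python's `sqlite3`; the `blob_chunks` table, the 1 MiB chunk size, and the function names are all made up for this example.

```python
import sqlite3

CHUNK_SIZE = 1024 * 1024  # assumed row size (1 MiB); tune to what the DB handles well


def init_schema(conn: sqlite3.Connection) -> None:
    """Create the hypothetical chunk table used by this sketch."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS blob_chunks ("
        " name TEXT NOT NULL, seq INTEGER NOT NULL, data BLOB NOT NULL,"
        " PRIMARY KEY (name, seq))"
    )


def write_blob(conn: sqlite3.Connection, name: str, data: bytes,
               chunk_size: int = CHUNK_SIZE) -> None:
    """Replace the blob stored under `name`; all chunks commit or none do."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute("DELETE FROM blob_chunks WHERE name = ?", (name,))
        conn.executemany(
            "INSERT INTO blob_chunks (name, seq, data) VALUES (?, ?, ?)",
            (
                (name, seq, data[off:off + chunk_size])
                for seq, off in enumerate(range(0, len(data), chunk_size))
            ),
        )


def read_blob(conn: sqlite3.Connection, name: str) -> bytes:
    """Reassemble the blob by concatenating its chunks in order."""
    rows = conn.execute(
        "SELECT data FROM blob_chunks WHERE name = ? ORDER BY seq", (name,)
    )
    return b"".join(row[0] for row in rows)
```

The point of the sketch is only that the blob and any rows referring to it live under the same transactional guarantees; whether the DB copes well with that much data is a separate question.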
> but now transactional guarantees only extend to the id stored in the DB, and not on the external storage.
Depends! If the ID is a cryptographic hash of the blob's contents and the blob is uploaded before the DB transaction commits, then the DB can't end up referencing a blob that doesn't exist or has different contents[1].
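A rough sketch of the upload-first, hash-as-id ordering, under stated assumptions: `blob_store` is an in-memory dict standing in for the external blob store, the `documents` table is hypothetical, and SHA-256 is just one possible choice of cryptographic hash.

```python
import hashlib
import sqlite3

# Hypothetical stand-in for an external blob store (an object store, a file server, ...).
blob_store: dict[str, bytes] = {}


def put_blob(data: bytes) -> str:
    """Store `data` under its SHA-256 hex digest and return that digest."""
    blob_id = hashlib.sha256(data).hexdigest()
    blob_store[blob_id] = data  # idempotent: same content always maps to the same key
    return blob_id


def save_document(conn: sqlite3.Connection, name: str, data: bytes) -> str:
    """Upload the blob first, then commit the reference to it in a transaction."""
    blob_id = put_blob(data)  # step 1: the blob exists before the DB mentions it
    with conn:  # step 2: commits on success, rolls back on error
        conn.execute(
            "INSERT INTO documents (name, blob_id) VALUES (?, ?)", (name, blob_id)
        )
    return blob_id


def load_document(conn: sqlite3.Connection, name: str) -> bytes:
    """Fetch the blob and verify it still hashes to the id stored in the DB."""
    row = conn.execute(
        "SELECT blob_id FROM documents WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    data = blob_store[row[0]]
    if hashlib.sha256(data).hexdigest() != row[0]:
        raise ValueError("blob contents do not match the stored id")
    return data


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (name TEXT PRIMARY KEY, blob_id TEXT NOT NULL)")
```

If the insert fails, the DB is still consistent; the only leftover is an unreferenced blob in the store. The check in `load_document` detects a tampered or corrupted blob at read time, but it cannot bring back a blob that has gone missing.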
A Merkle Tree also allows "updates" by chunking the data into some convenient size, say 64 MB, and then building a new tree for each update and sticking that into the database.
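To make the chunk-and-rebuild idea concrete, here is a simplified sketch. Assumptions: chunks are addressed by SHA-256, the store is a plain dict, the chunk size is tiny so the example is easy to follow (the comment above suggests something like 64 MB), and an unpaired node is promoted to the next level unchanged. A real design would also need to separate leaf hashes from internal-node hashes.

```python
import hashlib

CHUNK_SIZE = 4  # tiny, for demonstration only


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def store_chunks(store: dict[str, bytes], data: bytes,
                 chunk_size: int = CHUNK_SIZE) -> list[str]:
    """Split `data` into chunks, store each under its hash, return the leaf ids."""
    leaf_ids = []
    for off in range(0, len(data), chunk_size):
        chunk = data[off:off + chunk_size]
        chunk_id = sha256_hex(chunk)
        store.setdefault(chunk_id, chunk)  # unchanged chunks are not stored twice
        leaf_ids.append(chunk_id)
    return leaf_ids


def merkle_root(leaf_ids: list[str]) -> str:
    """Hash pairs of nodes level by level until a single root id remains."""
    if not leaf_ids:
        return sha256_hex(b"")
    level = list(leaf_ids)
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level) - 1, 2):
            nxt.append(sha256_hex((level[i] + level[i + 1]).encode()))
        if len(level) % 2:  # odd node out: promote it unchanged
            nxt.append(level[-1])
        level = nxt
    return level[0]
```

An "update" then means storing only the chunks that changed, computing a new root, and writing that root (plus the leaf list, or the internal nodes) into the database in the transaction; the old version stays readable as long as its chunks are kept.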
[1] With the usual caveats that nobody is manually mucking about with the blob store, that it hasn't "lost" any blobs due to corruption, etc, etc...
> With the usual caveats that nobody is manually mucking about with the blob store, that it hasn't "lost" any blobs due to corruption, etc, etc...
Yeah, with those caveats. But how do you make sure they apply? If someone does manually muck about with the blob store, or it does lose blobs due to corruption, then your transaction is retroactively "un-atomicized" with no trace thereof in the actual DB.
If corruption happens then any guarantees by the hardware are voided, and the software guarantees (of durability) which are built on the hardware guarantees are equally voided. So corruption -> dead in the water.
But if you upload to the blob store first, then reference the id (hash or not) in your DB insert, what happens if the DB transaction fails? You now have to work out a way to delete the blob from the external store. Or change your application so that it doesn't matter if it's left on the blob store ('cept for the money it costs to keep it there).
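One common way to handle the leftover blobs is a periodic sweep that deletes anything the DB no longer references. The sketch below is only an illustration: the `documents` table, the dict-based `blob_store`, the `uploaded_at` timestamps, and the one-hour grace period are all assumptions made for the example. The grace period exists so the sweep doesn't delete a blob whose transaction is still in flight.

```python
import sqlite3


def sweep_orphans(conn: sqlite3.Connection, blob_store: dict[str, bytes],
                  uploaded_at: dict[str, float], now: float,
                  grace_seconds: float = 3600.0) -> list[str]:
    """Delete blobs that no DB row references and that are older than the grace period.

    Returns the ids that were removed.
    """
    referenced = {row[0] for row in conn.execute("SELECT blob_id FROM documents")}
    removed = []
    for blob_id in list(blob_store):  # copy the keys: we mutate the store while iterating
        # A blob with no recorded upload time is treated as brand new and kept.
        age = now - uploaded_at.get(blob_id, now)
        if blob_id not in referenced and age >= grace_seconds:
            del blob_store[blob_id]
            removed.append(blob_id)
    return removed
```

This trades the cleanup-on-failure problem for a background job plus some wasted storage between sweeps. It does nothing for the opposite failure, where a referenced blob disappears from the store.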