Hacker News new | ask | show | jobs
by nextaccountic 787 days ago
A question about implementation, is the data really stored in a Postgres database? Do you support transactional updates like atomically updating two files at once?

Is there a Postgres storage backend optimized for storing large files?

1 comments

We do not store the files in Postgres, the files are stored in a managed S3 bucket.

We store the metadata of the objects and buckets in Postgres so that you can easily query it with SQL. You can also implement access control with RLS to allow access to certain resources.

It is not currently possible to guarantee atomicity on 2 different file uploads since each file is uploaded on a single request, this seems a more high-level functionality that could be implemented at the application level

Oh.

So this is like, S3 on top of S3? That's interesting.

Yes indeed! I would call it S3 on steroids!

Currently, it happens to be S3 to S3, but you could write an adapter, let's say GoogleCloudStorage and it will become S3 -> GoogleCloudStorage, or any other type of underline Storage.

Additionally, we provide a special way of authenticating to Supabase S3 using the SessionToken, which allows you to scope S3 operations to your users specific access control

https://supabase.com/docs/guides/storage/s3/authentication#s...

What about for second tier cloud providers like Linode, Vultr or UpCloud, they all offer S3 compatible object storage, will I need to write an adaptor for these or will it work just fine in lieu of their S3 compatibility?
Our S3 Driver is compatible with any S3 Compatible object Storage so you don’t have to write one :)
Gentle reminder here that S3 compatability is a sliding window and without further couching of the term it’s more of a marketing term than anything for vendors. What do I mean by this statement? I mean that you can go to cloud vendor Foo and they can tell you they offer s3 compatible api’s or clients but then you find out they only support the most basic of operations, like 30% of the API. Vendor Bar might support 50% of the api and Baz 80%.

In a lot of cases, if your use case is simple, 30% is enough if you’re doing the most common GET and PUT operations etc. But all it takes is one unsupported call in your desired workflow to rule out that vendor as an option until such time that said API is supported. My main beef with this is that there’s no easy way to tell usually unless the vendor provides a support matrix that you have to map to the operations you need, like this: https://docs.storj.io/dcs/api/s3/s3-compatibility. If no such matrix is provided on both the client side and server side you have no easy way to tell if it will even work without wiring things in and attempting to actually execute the code.

One thing to note is that it’s quite unrealistic for vendors to strive for 100% compat - there’s some AWS specific stuff in the API that will basically never be relevant for anyone other than AWS. But the current situation of Wild West could stand for some significant improvement

I’m confused about what directions this goes.

The announcement is that Supabase now supports (user) —s3 protocol—> (Supabase)

Above you say that (Supabase) —Supabase S3 Driver—> (AWS S3)

Are you further saying that that (Supabase) —Supabase S3 Driver—> (any S3 compatible storage provider) ? If so, how does the user configure that?

It seems more likely that you mean that for any application with the architecture (user) —s3 protocol—> (any S3 compatible storage provider), Supabase can now be swapped in as that storage target.

in case it's not clear why this is required, some of the things the storage engine handles are:

image transformations, caching, automatic cache-busting, multiple protocols, metadata management, postgres compatibility, multipart uploads, compatibility across storage backends, etc