by Dave3of5 951 days ago
A bit too waffling for me to read it all, but I would like to make a small comment.

Why are more and more devs trying to use s3 as a general purpose DB?

I'm working on a system right now where the architects made this mistake: it has insanely poor performance (high latency) and lacks any proper ACID compliance. I've now been asked to "make it faster", and the answer is to switch back to an actual DBMS.

> Top tier SaaS services like S3 are able to deliver amazing simplicity, reliability, durability, scalability, and low price because their technologies are structurally oriented to deliver those things. Serving customers over large resource pools provides unparalleled efficiency and reliability at scale

In terms of simplicity, using s3 is anything but simple. Sure, the CRUD API is simple, but there are a bunch of gotchas. What about transactionality, partial updates, running multi document queries, consistency of the whole set of documents. You have to rewrite a whole DBMS on top of s3 itself or use redshift to get these things.

In terms of scalability there are limits: 3500rps per key prefix.

It's actually not cheaper than a DBMS when you have a lot of data.

The serverless pitch is extremely appealing to many companies. Also, many serverless offerings seem like a great choice at the beginning. But at some point, you will want SQL features like joins, groups, secondary indexes, foreign keys etc.

The missing link is really a serverless postgres (which many are working on, but nothing has impressed me so far).

Sorry if this is ignorant, but I never understand what serverless Postgres means. What's different from a hosted Postgres instance? Some scaling characteristics or the fact you interact with it via an API instead of some library, ORM, or plain SQL?

Serverless in that context essentially means “somebody else's server(farm)”. It frees you from some of the infrastructure/admin involved in server sizing & upgrades, backups, availability management, and so on.

It can be very attractive to teams who don't want to have an internal expert for all that, or to buy huge hardware to deal with spikes in activity that only happen occasionally⁰. Just being able to spin up a large DB for some tests without worrying about available space, how much it will compete for things like CPU/IO with your other DBs¹, etc, can be very convenient.

It can work out quite expensive in terms of price/performance ratio, if those factors are not a benefit to you.

----

[0] or happen regularly but usually for only a short time

[1] usually these things are capped, or have a burstable cap, so “noisy neighbours”² are not the huge problem they can often be on cheap shared hosting

[2] unless you have explicitly pooled resources (like Elastic Pools for Azure SQL) without per-object limits, in which case your own activity could be harmful noise

A lot of "serverless" things are really on-demand timesharing.

In my opinion, one example of "serverless postgres" would be: the data sits on a blob store, and you only pay when running queries and for the static storage.

Basically snowflake's pricing model.

Serverless = not having to think of server size for running the database, it scales as needed.

> The missing link is really a serverless postgres

I used AWS Aurora Serverless v2 (MySQL) and it worked pretty well, actually. I never used the postgres version, but it's now available.

We went this way with our Synapse/Azure Data Lake solution, and it has been nothing but pain since. I'd estimate that in the past year more than 60% of the dev time was spent fighting random edge cases that this kind of approach brings.

Sure, there is the benefit of being able to dump your cold data in cheaply and read flexibly, but... the dev UX is just a PITA.

What do you think of Microsoft's latest offering, Fabric? Is Fabric real software that makes things better? Or is it just lipstick on Synapse?

> You have to rewrite a whole DBMS on top of s3 itself or use redshift to get these things.

No, you get a DBMS and only change the storage underneath. You can't use S3 for appending to WAL though.

All those can be fixed besides the latency for a cold GET from S3 and appending WAL to S3.

> No, you get a DBMS and only change the storage underneath

I think what you mean is what we have implemented: a side-channel DBMS which holds a copy that you use for the transactionality. It's a terrible approach. I would not do this at all; you don't get any benefit from using s3 here.

This is not to say you can't use s3 to pull large blob storage off the DB and reference it in the DB. I'm talking about the entire DB as s3.

> I'm talking about the entire DB as s3.

Yes, it can be done, except for WAL append & COLD GET. You "just" have to re-architect everything in the storage layer.

Do you have anything specific in mind besides the 2 things I mentioned?

You haven't given specifics on this. To be blunt, I have no idea what you mean.

I already made my points about the problems; your follow-up comments have not addressed those.

Also, there is no such thing in s3 as a cold GET; there is a cold startup for a lambda. The latency from s3 on a get is orders of magnitude poorer than EBS or LSSD and should be used "infrequently".

Imagine using RocksDB backed by S3. You store only the SSTables in S3. You append the WAL to EBS and archive it in S3. You cache blocks on local SSD. If you want a faster WAL, you append it to local SSD, use multi-AZ replication in your DB, and fsync the WAL to EBS less frequently than locally.
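
To make the layering concrete, here is a toy sketch of that shape. It is not RocksDB's API: `ObjectStore` and `TieredKV` are hypothetical names, the "S3" is an in-memory dict, the WAL is a plain list standing in for EBS/local SSD, and there is no compaction, replication, or crash recovery. It only shows which tier each operation touches.

```python
from typing import Dict, List, Optional, Tuple


class ObjectStore:
    """In-memory stand-in for S3 (assumption): whole-object put/get only."""

    def __init__(self) -> None:
        self.objects: Dict[str, Dict[str, str]] = {}
        self.gets = 0  # counts the slow remote reads

    def put(self, key: str, table: Dict[str, str]) -> None:
        self.objects[key] = dict(table)

    def get(self, key: str) -> Dict[str, str]:
        self.gets += 1
        return self.objects[key]


class TieredKV:
    """Toy LSM-style store: WAL + memtable locally, immutable tables remotely."""

    def __init__(self, store: ObjectStore, memtable_limit: int = 2) -> None:
        self.store = store
        self.wal: List[Tuple[str, str]] = []       # stands in for the WAL on EBS / local SSD
        self.memtable: Dict[str, str] = {}
        self.sstables: List[str] = []              # object names, newest last
        self.cache: Dict[str, Dict[str, str]] = {}  # stands in for the local SSD block cache
        self.memtable_limit = memtable_limit

    def put(self, key: str, value: str) -> None:
        self.wal.append((key, value))  # small, low-latency append; never goes to the object store
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self) -> None:
        name = f"sst-{len(self.sstables):06d}"
        self.store.put(name, self.memtable)  # one large immutable object write
        self.sstables.append(name)
        self.memtable = {}
        self.wal.clear()  # a real system would archive the WAL segment instead

    def get(self, key: str) -> Optional[str]:
        if key in self.memtable:
            return self.memtable[key]
        for name in reversed(self.sstables):  # newest table wins
            table = self.cache.get(name)
            if table is None:                 # cache miss: pay the remote GET once
                table = self.store.get(name)
                self.cache[name] = table
            if key in table:
                return table[key]
        return None
```

The point of the split is that only big, immutable writes and cache misses ever reach the object store; every small append stays on low-latency storage.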

> What about transactionality, partial updates, running multi document queries, consistency of the whole set of documents.

You do that in another layer on top. The filesystem doesn't provide transactions, yet you do them on a layer on top.

> In terms of scalability there are limits: 3500rps per key prefix.

The S3 metadata is (was?) sharded on key-prefix. You fix this by using more prefixes. By hashing the filenames or something.
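
A minimal sketch of the "hash the filenames" idea, assuming you are free to choose your own key layout. The function name `sharded_key` and the default of 16 prefixes are made up for illustration; the right prefix count depends on your request rate.

```python
import hashlib


def sharded_key(filename: str, num_prefixes: int = 16) -> str:
    """Prepend a hash-derived prefix so keys spread across `num_prefixes` prefixes."""
    digest = hashlib.sha256(filename.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:4], "big") % num_prefixes
    # The same filename always maps to the same prefix, so reads can recompute the key.
    return f"{shard:02x}/{filename}"
```

The trade-off is that you can no longer list objects by their natural path with a single prefix scan; you either list every shard prefix or keep an index elsewhere.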

> The latency from s3 on a get is orders of magnitude poorer than EBS or LSSD and should be used "infrequently".

Yes, it is. You should use a local SSD for most things.

This is basically how Fireproof works: files to S3, reads from any cache, encryption metadata in your session store. All of this becomes “easy” if you write a storage engine from scratch for immutable content addressed data.
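
For anyone unfamiliar with the term, a rough sketch of what "immutable content addressed data" buys you. This is a generic illustration, not Fireproof's actual code; `ContentAddressedStore` is a hypothetical name and the backing dict stands in for S3 or any cache.

```python
import hashlib
from typing import Dict


class ContentAddressedStore:
    """Blobs are keyed by the SHA-256 of their bytes, so a key never changes meaning."""

    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}  # stand-in for S3 or a cache tier (assumption)
        self.writes = 0

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        if key not in self.blobs:  # identical content is stored only once
            self.blobs[key] = data
            self.writes += 1
        return key

    def get(self, key: str) -> bytes:
        data = self.blobs[key]
        # Any reader can verify the bytes against the key, whichever cache served them.
        if hashlib.sha256(data).hexdigest() != key:
            raise ValueError("blob does not match its content address")
        return data
```

Because a key can only ever refer to one exact byte string, there is no cache invalidation problem: a copy from any cache is as good as the one in S3.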

You definitely can use S3 for appending to a WAL (I've done it); they have read-after-write consistency.

You're doing it for large OLAP writes. But as soon as you do OLTP or small writes it will become very slow & expensive.

See warpstream as example https://news.ycombinator.com/item?id=37036291
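
Back-of-the-envelope version of the "expensive" part, as a sketch. The PUT price below is an assumption (roughly the published S3 Standard rate at the time of writing; check current pricing for your region), and `daily_put_cost` is a made-up helper. It ignores storage, GETs, and transfer.

```python
PUT_PRICE_PER_1000 = 0.005  # USD per 1,000 PUT requests; assumed, verify against current pricing
SECONDS_PER_DAY = 86_400


def daily_put_cost(writes_per_sec: int, records_per_put: int = 1) -> float:
    """Request cost per day if every `records_per_put` records share one PUT."""
    puts_per_day = writes_per_sec * SECONDS_PER_DAY / records_per_put
    return puts_per_day / 1000 * PUT_PRICE_PER_1000


# 1,000 small writes/sec, one PUT each: about 432 USD/day in requests alone.
unbatched = daily_put_cost(1000)
# Same load, 100 records batched per PUT: about 4.32 USD/day.
batched = daily_put_cost(1000, records_per_put=100)
```

That is why systems that use S3 as a log batch aggressively: the cost (and latency) scales with the number of requests, not with the bytes written.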

How about using Kafka for the WAL? Anybody tried that?

LogDevice or Apache BookKeeper is better for a distributed log.