| HN Mirror

	by throwawaaarrgh 925 days ago
	> Postgres sits at the heart of everything our systems do.

	Did the people making these decisions never take Computer Science classes? Even a student taking a data structures module would realize this is a bad idea. There's actually more like two dozen different reasons it's a bad idea.

3 comments

thestepafter 925 days ago

Would be interested to hear more about your opinion on why using a database is a mistake.

lmm 925 days ago

Using a datastore for which true master-master HA is at best a bolted-on afterthought when you explicitly want a zero-downtime system is a mistake in a pretty obvious way.

Using a datastore with a black box query planner that explicitly doesn't allow you to force particular indices (using hints or similar) is a more subtle mistake but will inevitably bite you eventually. Likewise a datastore that uses black-box MVCC and doesn't let you separate e.g. writing data from updating indices.

throwawaaarrgh 925 days ago

I meant using a database for more than relational read-heavy data queries. I would need to write a small book. Tl;dr the data model, communication model, locking model, and operational model all have specific limitations designed around a specific use case and straying from that case invites problems that need workarounds that create more problems.

brentjanderson 925 days ago

I hear you on that, and can say that Postgres is incredibly capable at going beyond typical relational database workloads. One example is durable queues that are transactionally consistent with the rest of the database; they play a unique role in our architecture that would otherwise require more ceremony. More details here: https://getoban.pro

We are also working on shifting some workloads, like logging, off of Postgres and onto more appropriate systems as we scale. But we intentionally chose to minimize dependencies by pushing Postgres further to move faster, with migration plans ready as we continue to reach new levels of scale (e.g. using a dedicated log storage solution like Elasticsearch or ClickHouse).

pphysch 925 days ago

Is this a bit? The median CS undergrad has zero experience with large & successful software systems in the real world. Of course they wouldn't understand!

yjftsjthsd-h 925 days ago

Yeah - in fact, this is probably a great example of stuff you don't learn in class that gets really clear in the real world :) Operational concerns trump a lot of other things, and shoving everything you can into one database technology is so much easier to manage that it covers a lot of suboptimal fit.

peter_l_downs 925 days ago

What do you mean? I don’t understand, how is using a database an architectural mistake?

throwawaaarrgh 925 days ago

It's a mistake to use one specific computer science concept (RDBMS) to solve 50 different problems. They mentioned logging and scheduling, two things RDBMSs are not designed for and have specific limitations around. From just a general architecture perspective it's literally a single point of failure and limitation for every single aspect of the system. And it's vendor-specific; it's not like you can just plug plsql code into any other RDBMS and expect it to work. It's so obviously a bad idea it's hard to comprehend taking it seriously.

toast0 925 days ago

It might not be good computer science to use one tool to solve 50 different problems; but it's not bad computer engineering to use one tool to solve 50 different problems that fit within its capabilities rather than using 50 different tools, all with their own operational expertise.

There's no need to have the best tool for every job. It's still important, though, to be able to see when a multipurpose tool is insufficient for a specific job as it exists in your system, and then figure out what would be more appropriate.

camgunz 925 days ago

You'd probably be surprised by how many systems are just Postgres/MySQL + Redis.

crooked-v 925 days ago

For example, it's dead easy to make a high-capacity message queue by just using SELECT ... FOR UPDATE SKIP LOCKED with Postgres transactions, and I would argue it's more reliable than a lot of microservice-everything setups by way of having very few moving parts.
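
A rough sketch of that pattern, under stated assumptions: a hypothetical `jobs` table with `id` and `payload` columns, and a DB-API connection to Postgres from a driver such as psycopg (the `%s` placeholder is that driver's parameter style). `process_one` and `handler` are illustrative names, not part of any library. Each worker locks one row, skips rows other workers already hold, and deletes the job in the same transaction that processed it.

```python
# Hypothetical table: jobs(id, payload). SKIP LOCKED makes concurrent workers
# pass over rows that another transaction has already locked.
CLAIM_SQL = """
SELECT id, payload
FROM jobs
ORDER BY id
LIMIT 1
FOR UPDATE SKIP LOCKED
"""

DELETE_SQL = "DELETE FROM jobs WHERE id = %s"


def process_one(conn, handler):
    """Process at most one job. Returns True if a job was handled."""
    cur = conn.cursor()
    try:
        cur.execute(CLAIM_SQL)
        row = cur.fetchone()
        if row is None:
            # Queue is empty, or every remaining row is locked by another worker.
            conn.rollback()
            return False
        job_id, payload = row
        handler(payload)
        cur.execute(DELETE_SQL, (job_id,))
        # The row lock is held until this commit, so the job is removed
        # only if the handler finished without raising.
        conn.commit()
        return True
    except Exception:
        # On failure the transaction rolls back and the job becomes visible again.
        conn.rollback()
        raise
    finally:
        cur.close()
```

This keeps a transaction open for as long as the handler runs, which is fine for short jobs; long-running work usually wants a status column and a separate claim step instead.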

throwawaaarrgh 925 days ago

Classic NIH syndrome. "I made it myself so it must be better", when it's clear that a single SQL query doesn't remotely approach a complete solution for scheduling. But the ignorant use it because they don't know better, until they too fall into the trap and realize they spent 10x as much engineering work to get something they could have just installed from the web and been done with. Every generation seems to fall into this trap with another tech stack.

vore 925 days ago

It's all trade-offs, right? Introducing a new component to your stack isn't free; it's paid for with more operational complexity. Maybe it's worth it, maybe it's not, but there is a calculation that needs to be made that's not just "NIH syndrome".

sgarland 925 days ago

If you already have a DB (and essentially every app does), it can be far less effort with the same or greater reliability to create a queue table than to set up RabbitMQ, NATS, etc. As long as you tune the vacuuming appropriately, it will last for quite a lot of scale.

Source: am a DBRE, and have run self-hosted RabbitMQ and NATS clusters.

justinclift 925 days ago

Sure, so install RabbitMQ as well.

As the saying goes... "now you have two problems". :)

lgkk 925 days ago

You could honestly just use in-memory SQLite, lol. Idk, that’s what I did because I wanted to quickly be able to handle thousands of simultaneous scheduling tasks.

Took like two hours and it works fine. Customers are happy. Event logs persist to S3 in case I need to replay (hasn’t happened once yet).
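
For readers wondering what that might look like, here is a minimal sketch using Python's built-in `sqlite3` with an in-memory database. It is a guess at the general shape, not the commenter's actual code: the `tasks` table and the `open_scheduler`, `schedule`, and `pop_due` helpers are hypothetical, and the S3 event log is left out entirely.

```python
import sqlite3


def open_scheduler():
    # ":memory:" keeps the whole database in RAM; it vanishes when closed.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tasks ("
        "id INTEGER PRIMARY KEY, run_at REAL NOT NULL, payload TEXT NOT NULL)"
    )
    # Index so that finding due tasks does not scan the whole table.
    conn.execute("CREATE INDEX tasks_run_at ON tasks (run_at)")
    return conn


def schedule(conn, run_at, payload):
    # run_at is a numeric timestamp, e.g. from time.time().
    with conn:
        conn.execute(
            "INSERT INTO tasks (run_at, payload) VALUES (?, ?)", (run_at, payload)
        )


def pop_due(conn, now):
    """Remove and return payloads of all tasks due at or before `now`."""
    with conn:
        rows = conn.execute(
            "SELECT id, payload FROM tasks WHERE run_at <= ? ORDER BY run_at, id",
            (now,),
        ).fetchall()
        conn.executemany("DELETE FROM tasks WHERE id = ?", [(r[0],) for r in rows])
    return [r[1] for r in rows]
```

Because nothing here survives a restart, some external record of events (such as the log the comment mentions) is what makes replay possible.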