Hacker News new | ask | show | jobs
by bgdam 2799 days ago
I'm sure that there are tasks for which non-relational stores might be better suited, but pretty much anything MongoDB does, Postgres does it better and in a more performant way (esp after the optimizations to the JSONB data type in the last few major releases).
3 comments

Performance is definitely an important aspect, but coming from a background working at smaller companies, bootstrapping time and development time are a factor as well -- and while Postgres is easy to set up if you're familiar with it (and I dare say it makes DB design fairly straightforward), Mongo and other NoSQLs are pretty objectively easier to just get working if you aren't fluent in relational databases.
No offense, but if you don't know how to design/develop a database schema, making it easier to shoot yourself in the foot by using NoSQL database is not the answer.
None taken. And I agree -- but database design is inherently different under SQL and non-/noSQL. Constraints and relationships may be modelled completely differently in either case.
So, I've been trying to figure this out a bit lately.

How would you go about writing a customer facing query builder that is analogous to the MongoDB aggregation pipeline with SQL?

With MongoDB I could conceivably generate/store a JSON object for such a query. In SQL it seems a lot more obtuse to do.

Keep it simple: stick with MongoDB-like JSON objects for queries but write some code to map those queries to SQL immediately before execution.
This is a very bold comment, explain how you scale write, do HA in pg?
This is not what I called HA or horizontal scaling. Have a look at what MongoDB or Cassandra do.
Unless you're operating at Google/FB/Twitter/etc scale, this is not a problem most of us face IRL. We need to stop pretending we have these problems at 99% of the places we work at. Most of us don't. We don't need 1,000 microservices, Kubernetes, Docker, and a terribly unreliable NoSQL database like MongoDb (there are better, more optimal solutions for those companies than MongoDb (Cassandra being one of them.))
The fact that someone has to manually promote a slave as master is a big operational cost. The people that actually use PG at scale have to write a lot of tools / plumbing to overcome those issues, the same "issues" that were solved by most NoSQL solution, horizontal scaling, HA, shards re-balancing ect ...
> The fact that someone has to manually promote a slave as master is a big operational cost.

It doesn't need to be manual, only in a layer above/outside PostgreSQL. There are solutions that automate leader elections for any secondary-promotable-to-primary system (not just PostgreSQL), just not built into PostgreSQL.

We looked for about two years, on and off, to setup smoother, less-manual HA for our PostgreSQL setup. We ended up writing some, as you say, "tools / plumbing to overcome those issues". By far, though, the most time we spent on it was been spent in reading, testing, and getting to understand how things work.

After everything, unfortunately, no other system fit our requirements. Many came close in their marketing copy, but on closer look, every system had caveats. Ultimately, PostgreSQL is the only system I could still trust to do the right thing. In fact, the solid foundation that Postgres provides is why I can trust third-party layers (similar to Stolon and Patron) to focus on the job of switchover correctly without losing or otherwise messing with my data. In fact, I can even switch which tool I use, without it affecting my data.

The little plumbing we've written over Pg's built-in replication allows us to horizontally scale using logical replication and assuredly have HA using physical replication. At the end of the day, any shop that does anything important with data at scale needs to know how their data is actually being treated, irrespective of what the marketing promises. Hell, even when the data-store does not blatantly lie to you about keeping your data safe or uncorrupted, you can lose data due to bad use of tooling or explicitly-set unsafe configuration. "I'll just use this; it'll solve all my problems!" is rarely a thing to blindly believe when dealing with important data.

You're technically correct, but this case study was really eye-opening to me: http://blog.sagemath.com/2017/02/09/rethinkdb-vs-postgres.ht... They migrated from a RethinkDB cluster (similar to MongoDB) to a single Postgres server and the result was faster. And a single server is very large today, like 48 cores and 384 GB RAM.

And speaking of Postgres HA management, master promotion, etc. there are several tools like https://github.com/sorintlab/stolon available.

Citus, but it takes a while. Until you need thatw