by erik14th 3162 days ago
Everyone here seems to love Postgres and hate on Mongo. I have no technical knowledge to compare the two, so IMO a lot of that love and hate is directed more at the "project attitude": MongoDB is a suit, a sellout, with bullshit marketing and all, while Postgres is like some roots hippie that cares a lot more about technical values and neglects marketing.
2 comments

I think most of it was a reaction to the massive marketing push Mongo made. There were so many years of people hyping it as the answer to all of your data storage needs, followed by accounts of users hitting bugs or reimplementing most of a SQL database in code, and the promised performance benefits either didn't materialize or hadn't been necessary in the first place (“web scale” turned out to mean only dozens to hundreds of requests per second for many apps).

Meanwhile, Postgres was quietly plugging away adding new features and continuing to deliver solid performance for a wider range of workloads, including better performance on JSON document storage.

MongoDB accrued a lot of ill will due to some extremely questionable defaults, which remain defaults to this day. There's no question that you can write a fast database when there's no guarantee that data ever hits the disk, but developers tend not to like it when a database accepts their write and then silently loses data. It's also great for toy problems and 15-minute demos... but then you inevitably run into its limitations and end up re-implementing a database in your app.

Even at its best, there is essentially no reason to choose MongoDB over Postgres with JSONB-type columns. They are essentially the same data model but Postgres gives you better guarantees of data consistency, plus a forward migration path to relational data when the day inevitably arrives when you need to model relationships between entities.

At this point Postgres is where most open-source RDBMS development work is concentrated. It's not only a solid codebase, it's piling up features pretty quickly, and there are relatively few niches it doesn't fill at least adequately. Those remaining niches are covered by commercial products built on top of Postgres (e.g. EnterpriseDB or CitusDB). It's pretty much a one-stop shop for application development. You can use it for everything from GIS to machine learning [0] pretty efficiently, and it pretty much will just do the right thing without you watching.

NoSQL really fits best around the margins, such as an auxiliary system for analytics. There is really almost no use-case where "user inputs data and we lose it" is an acceptable application behavior, so consistency is a business requirement for your master database whether you realize it or not. And consistency across a distributed system is hard, so it almost always makes sense to sidestep clustering until the last possible moment. Buying a bigger machine is cheap, replication/failover is a lot easier than consistency between distributed masters, and if you are really up against the wall there are those commercial products that can do this with Postgres.

If you want to make an analogy... Oracle is the suit, Postgres is the hardworking small business that is slowly but surely eating up Oracle's lunch, and MongoDB is a trustafarian with a hot-dog detector app. And that's why there's a lot of resentment towards MongoDB.

[0]: The 9.x series and 10.0 release have been absolutely jam-packed with new features; it's absurd how fast development is moving at the moment. One of my favorites: indexed cube queries. The cube type represents an N-dimensional cube of data. One thing it supports is distance queries, which have obvious applications in pattern recognition tasks (e.g. k-nearest-neighbor). One of the features in 9.6 is index support for these, so you can now do indexed KNN searches on your data...

https://www.depesz.com/2016/01/10/waiting-for-9-6-cube-exten...
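
For anyone unfamiliar with what a KNN query actually computes, here is a brute-force sketch in Python. It is only an illustration and not how Postgres does it: the whole point of the 9.6 feature is that an index avoids this kind of full scan. I'm assuming plain Euclidean distance, and the function name and sample points are made up.

```python
import heapq
import math

def knn(points, query, k):
    """Return the k points closest to `query` by Euclidean distance.

    Brute force: every point is examined, which is the scan an
    index-backed KNN search is meant to avoid.
    """
    return heapq.nsmallest(k, points, key=lambda p: math.dist(p, query))

# Hypothetical 3-dimensional sample data.
points = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (5.0, 5.0, 5.0), (0.5, 0.0, 0.0)]
nearest = knn(points, (0.0, 0.0, 0.1), k=2)
print(nearest)
```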

> NoSQL really fits best around the margins, like as an auxiliary system for analytics.

I'd say it also fits well in two niches: document datastores (so long as there's some JOIN support, via referencing nested documents vs direct nesting) and graph stores.

I remember working on storing nested sets in an RDBMS 10+ years ago, and it wasn't pretty. Then there's the RDBMS schema for Magento 1, with key-value tables all over the place, which NoSQL would have removed the need for.

Can you define what "nested sets" means more specifically?

Postgres supports hierarchical/nested structures using the "ltree" column type. There is nothing stopping you from defining a primary key of (e.g.) "set1.set10.set100". There is also support for recursive views, etc. to operate on these kinds of sets.
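
To show the idea behind dotted label paths like "set1.set10.set100", here is a rough Python sketch of a "find everything below this node" lookup. It is not the ltree implementation or its API, just the concept; the helper name and sample paths are invented. It compares whole labels rather than raw string prefixes, so "set1.set1" is not mistaken for an ancestor of "set1.set10".

```python
def descendants(paths, ancestor):
    """Return the paths strictly below `ancestor`, comparing whole labels."""
    prefix = ancestor.split(".")
    result = []
    for path in paths:
        labels = path.split(".")
        # A descendant is longer than the ancestor and starts with its labels.
        if len(labels) > len(prefix) and labels[:len(prefix)] == prefix:
            result.append(path)
    return result

# Hypothetical hierarchy stored as one path per row.
paths = ["set1", "set1.set10", "set1.set10.set100", "set1.set11", "set2"]
below_set10 = descendants(paths, "set1.set10")
print(below_set10)
```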

Again, if you have some kind of "sparse" column, it can make sense to put that into a JSONB column. This is effectively the same thing as attaching an unstructured document to a record for this use-case.

Sure: http://mikehillyer.com/articles/managing-hierarchical-data-i...

Joe Celko popularised them. I've been unable to find when they were first introduced, but a search of my source code archive points to having written one ~14 years ago.
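
For readers who don't want to click through, a minimal Python sketch of the nested set idea: each node gets a left and right number from a depth-first walk, and a node's descendants are exactly the nodes whose numbers fall inside its own. The tree and the function names are made up for illustration; in an RDBMS the two numbers would be columns and the containment check a WHERE clause.

```python
def number_tree(tree, root):
    """Assign (lft, rgt) bounds to every node with a depth-first walk."""
    bounds = {}
    counter = 0

    def visit(node):
        nonlocal counter
        counter += 1
        lft = counter
        for child in tree.get(node, []):
            visit(child)
        counter += 1
        bounds[node] = (lft, counter)

    visit(root)
    return bounds

def descendants(bounds, node):
    """Nodes whose bounds lie strictly inside the bounds of `node`."""
    lft, rgt = bounds[node]
    return sorted(n for n, (l, r) in bounds.items() if lft < l and r < rgt)

# Hypothetical category tree: parent -> list of children.
tree = {"electronics": ["tvs", "audio"], "tvs": ["lcd", "plasma"], "audio": ["mp3"]}
bounds = number_tree(tree, "electronics")
print(bounds["tvs"], descendants(bounds, "tvs"))
```

Reads are cheap with this layout, but inserting a node means renumbering everything to its right, which is a large part of why it "wasn't pretty" to maintain.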

> so long as there's some JOIN support, via referencing nested documents vs direct nesting

Which has its own problems. PG does this just fine, with a full battle-tested relational system to back it (and you) up.

> with key-value tables all over the place which NoSQL would have removed the need for.

Product X having a stupid schema is not a good basis for an argument for or against a particular product.

> Product X having a stupid schema is not a good basis for an argument for or against a particular product.

What other way could Magento have implemented user-defined columns at the time, using an RDBMS? In 2009 when MongoDB was released, JSON columnstores were something to dream of and the alternative was storing serialised data in a BLOB field. That "stupid schema" did not have an alternative I can think of, except NoSQL.
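
To make the comparison concrete, here is a rough Python sketch of the key-value (entity-attribute-value) layout next to the document shape it is emulating. The rows and attribute names are invented and are not Magento's actual schema; the point is only that every user-defined attribute becomes its own row, and the application has to fold the rows back into one record.

```python
from collections import defaultdict

# Hypothetical EAV rows: (entity_id, attribute, value).
eav_rows = [
    (1, "name", "T-shirt"),
    (1, "color", "red"),
    (2, "name", "Mug"),
    (2, "capacity_ml", "350"),
]

def to_documents(rows):
    """Fold EAV rows into one dict per entity, i.e. a document per record."""
    docs = defaultdict(dict)
    for entity_id, attribute, value in rows:
        docs[entity_id][attribute] = value
    return dict(docs)

documents = to_documents(eav_rows)
print(documents[1])
```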

I'm not sure if those were questionable defaults, or questionable design decisions which were the only option at the time, and now persist as questionable defaults.

I'm pretty sure that mmap was the only storage engine available for MongoDB for most of the hype period.