Hacker News new | ask | show | jobs
by skywhopper 2108 days ago
Lots of interesting info, and I really like working with MongoDB. But I am baffled by the claim that "MongoDB is the king." In all the circles I work in, I only hear Mongo dismissed as a joke. Unfortunately the company's dismissiveness of RDBMSes, their hubris in pushing NoSQL, and their blunders over what are extremely poor default settings all combine to make MongoDB something I don't see anyone taking seriously. I use it in a project where it's basically just serving as a big cache, so the reliability and durability of the data is not critical. It's certainly way easier to query than any other document-oriented database, but getting a foothold to use for anything requiring long-term storage would be a major challenge.
2 comments

I interviewed recently at a payment provider that is rewriting its PHP/Mysql monolith in Java & go microservices with MongoDb.

The architect would praise static typing but would prefer MongoDb "because it's easier to add a column". It felt weird but I've never used MongoDb so I could not really argue about it.

Sounds like a bad architect or someone who only knows surface features and doesn't have experience with the actual databases.

Adding a column is not hard in any relational database, and pretty much all modern ones support no-downtime transactional schema updates with backfills, concurrent index builds, etc.

Also the schema always exists somewhere, and it's usually to put it in the database so it's next to (and validates) your data rather than keeping spread out in your application code.

Adding columns can be hard at scale. MongoDB allows you to add columns incrementally without stopping the world. You just add the value and it's materialized in the DBMS. The point is you don't have to manually define schema but instead the database works with the structure of the data.

I'm a dyed-in-the-wool RDBMS user but I can see the value of this feature. The trade-off, of course, is that your application has to handle varying schema levels. MongoDB also won't protect you against typos, inconsistent types, and other foolishness. I would not judge people who choose to make this trade--it's a sensible one for many use cases.

Many analytic databases are headed in this direction due to the amount of data that arrives in the form of nested JSON structures. I can't speak for other DBMS types but it's something we're very interested in for ClickHouse.

Isn't adding a column even easier in column-oriented OLAP databases?

I agree that more complex row formats are needed though. Bigquery has done well with nested/repeating structures and Snowflake uses the PAX data format for JSON which has been very useful (however their JSON/VARIANT column doesn't support structured types).

It's definitely easier than conventional RDBMS. Adding a column in ClickHouse is just a metadata operation. BiqQuery has great nested structure support.

Still, it's hard to beat MongoDB in this respect. In my first app I was amazed that I could just insert a BSON object and MongoDB created a queryable table automatically. You pay for it of course in other ways but the ease of use is quite extraordinary.

you can use JSON schema to lockdown you schemas like SQL but few developers choose do that as they prefer the flexibility that MongoDB offers. Changes in the schema are almost always due to upstream changes so the schema is a follower rather than a leader.

Encoding your business logic in the database schema is rarely a good idea.

> Encoding your business logic in the database schema is rarely a good idea.

If by this you mean not defining tables with explicit columns in general I would disagree. It's one of the most effective ways to get speed in analytic applications because it optimizes compression. And in OLTP applications it's the best way to ensure consistency of data.

But perhaps you were referring to a narrower scope like just JSON & upstream changes?

No typing is great whether its BSON types in MongoDB or column types in RDBMS. Anything past that generally gets in the way of flexibility to respond to new business requirements. Business changes rarely originate at the schema level so its a bad place to encode business semantics.
The comparison here is DB schema + DB schema migration + shared data access layer vs shared data access layer With that I am not clear on what you think the problem is without domain knowledge.

I can understand broadly that RDBMS is better when you have joins and foreign key constraints and nosql is cheaper “web” scale plus there are ACID considerations yet without domain knowledge it’s plausible nosql is better

"Web scale" isn't a real thing and I don't know what "shared data access layer" refers to but relational databases have decades of features and tooling that prove to be very useful in most cases. Even if you don't use ACID, transactions, key relations and other features, maintaining the basic type information in the database still has numerous advantages in data consistency, integrity and performance.

Also everything can scale, and it's all using the same fundamental primitives (sharding, etc) to do so anyway. Some just make it easier with built-in functionality vs external layers.

This is not to say that non-relational systems aren't useful, but that they are rarely used correctly instead of chosen for marketing hype, and hearing things like "adding a column is easier" from an architect usually points to the later situation.

Database schemas only validate the data type and whether it's null or not.

It's next to useless for ensuring data integrity and why every app you see will have data validation inside the code itself. Whether it's checking that an email has a valid structure or that a payment is not negative.

Majority of the logic will be the code so it makes no sense to me other than if you have multiple clients accessing that database why it must be enforced at the database level as well.

The examples you give are perfectly possible in any decent SQL server (mssql, postgres). If anything, database servers are made to ensure data consistency, checks like "payment not negative" are no brainers. If you don't define these constraints at the database level you are leaving its power on the table and are re-inventing the wheel by putting it in your application layer somewhere.
RDBMS can enforce data integrity through normalization (avoiding update anomalies), are you familiar with these? https://www.postgresql.org/docs/13/ddl-constraints.html

That doesn't mean that treating your database as a dumb datastore and having the "smarts" in the application layer doesn't work better for some applications. Both ways have trade-offs.

The shape of a record, property types, foreign keys and more are all critical part of a schema.

Business logic validation is a separate layer but still requires proper types and data integrity underneath, and that's where strong schemas in the database help. More so when you have multiple apps interacting with the same database.

In companies with DBAs the constrains are baked into the DB, in places without DBAs (or weak DBAs dev relation) that is absorbed into the code.

To be fair depending on the culture of the company it maybe easier to put in code and update it there than to wait to have it approved, discussed and scheduled by the DBA

Oh man. Payment provider data in mongo... even if by now MongoDB doesnt lose data... you REALLY want your ledger or OLTP to be in a relational, transactional database.
Coinbase uses MongoDB, Barclays uses MongoDB, BBVA uses MongoDB, Capital One uses MongoDB. Charles Schwab uses MongoDB. FICO uses MongoDB. Goldman Sachs uses MongoDB. HSBC uses MongoDB. Intuit uses MongoDB. Uk Inland Revenue uses MongoDB. UK Dept of Work and Pensions uses MongoDB.

This is the tip of the iceberg when it comes to MongoDB's use cases in Financial services. These organisations REALLY use MongoDB for financial data and they are all public references. Relational databases are great for financial data and so is MongoDB.

Stripe started with mongodb https://www.mongodb.com/blog/post/mongodb-powering-the-magic...

Don't know what they use now or if they still use it.

IIRC Stripe relies heavily on Mongo
I am biased but you are definitely missing out by just listening to your friends on this one. These days MongoDB is pretty mature and the “MongoDB sucks” meme is getting pretty long in the tooth.

It turns out it takes a decade to build a new database that’s half decent and has all the features people want. It’s really hard! Ask anyone that’s tried. The path is littered with skeletons.

Of course, there are those databases that are “perfect” from the start and never made any mistakes but is anyone talking about them today? Even Postgres gets it wrong sometimes.

> Even Postgres gets it wrong sometimes

Imagining a timeline where people would say that about MongoDB is a good way to assess the project's quality...

MongoDB Inc. is still in the business of lying to its users through its teeth: https://news.ycombinator.com/item?id=23499658

And that's just a publicly available example. I have a client who paid for a MongoDB Inc. to send an "expert" down to assess the viability of a project, who flat out said the project can't be done and left it there; a week later the official MongoDB Inc. report says "We can definitely get it done. Why don't you move to our managed MongoDB Atlas service? That'll be $10K." For the record, my professional assessment was also that it'd be impossible to do it on a data model like Mongo's.

----

> Of course, there are those databases that are “perfect” from the start and never made any mistakes but is anyone talking about them today? Even Postgres gets it wrong sometimes.

1. The snark isn't helping your cause.

2. Then there are databases that claim to be "perfect" from the start, having always put up a facade without every admitting any of their own flaws. Like MongoDB.

The thing MongoDB does best — though not something a DBMS can be judged by — is marketing. Not just the marketing they push themselves — "MongoDB is web scale!", "90% of RDBMS use can be replaced by MongoDB!" — but also the marketing it can get its fans to push.

I think the core issue here is not the quality of marketing , but simply the fact that it exists. People need some database, and nobody is spending money on advertising Postgres, so they don't know anything about it.

Also Mongo wins by a large margin on "how little you need to know to get started" with any relational db.