by adityapatadia 672 days ago
Almost all statements about MongoDB are wrong.

> You know exactly what your app needs to do, up-front

No one does. MongoDB still fits perfectly.

> You know exactly what your access patterns will be, up-front

No one knows this when they start, either. We successfully scaled MongoDB from a few users a day to millions of queries an hour.

> You have a known need to scale to really large sizes of data

This is exactly a great point. When data size goes to a billion rows, Postgres is tough. MongoDB just works without issue.

> You are okay giving up some level of consistency

This has been said about MongoDB for ages. Today, it provides very good consistency.

> This is because this sort of database is basically a giant distributed hash map.

Putting MongoDB in the same category as Dynamo is a big mistake. It's NOT a giant distributed hash map.

> Arbitrary questions like "How many users signed up in the last month" can be trivially answered by writing a SQL query, perhaps on a read-replica if you are worried about running an expensive query on the same machine that is dealing with customer traffic. It's just outside the scope of this kind of database. You need to be ETL-ing your data out to handle it.

This shows the author has no idea how MongoDB aggregation works.
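
For reference, the "users signed up in the last month" question maps onto a two-stage aggregation pipeline. A minimal sketch in Python, assuming a hypothetical `users` collection with a `created_at` date field (neither name comes from the thread) and approximating "last month" as the last 30 days:

```python
from datetime import datetime, timedelta, timezone


def signups_since_pipeline(now, days=30):
    """Build an aggregation pipeline counting users created in the last `days` days.

    `created_at` is an assumed field name; adjust it to the real schema.
    """
    cutoff = now - timedelta(days=days)
    return [
        {"$match": {"created_at": {"$gte": cutoff}}},  # keep recent signups only
        {"$count": "signups"},                         # emit {"signups": <n>}
    ]


pipeline = signups_since_pipeline(datetime(2024, 3, 31, tzinfo=timezone.utc))
# With PyMongo this would be run as: list(db.users.aggregate(pipeline))
# where `db.users` is the assumed collection.
```

If no documents match, `$count` produces no output document rather than a zero, so the caller has to handle an empty result.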

I don't want fresh grads to use SQL just because they learned relations (and consistency and constraints and whatnot). It's perfectly fine to start on MongoDB and make it the primary DB.

7 comments

> This is exactly a great point. When data size goes to a billion rows, Postgres is tough.

You’ve been led astray. You can handle a billion rows on a developer laptop, let alone a production grade instance.

Depends on row width. 10KB JSON fields are all too common.
I have more MongoDB experience than Postgres, but my impression is that a lot of the JSON handling I ended up doing in Mongo would have been easier and more reliable in Postgres.
Yes, you obviously can’t fit 10 TB of data onto a developer laptop. However you’ll run out of disk before you run into Postgres issues.
With an SSD cache + a few external 20TB hard drives I think you can easily make it to 40TB with redundancy.
Yes, you obviously can fit 10 TB of data onto a developer laptop with 10 TB of external storage. However you’ll run out of disk before you run into Postgres issues.
I was kinda agreeing with you :-D
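
The 10 TB figure in this sub-thread follows from the row width: a billion rows at 10 KB each. A quick sketch of the arithmetic in Python, counting raw data only (indexes and storage overhead are ignored, and the 100-byte comparison row is an illustrative assumption, not a number from the thread):

```python
def table_size_tb(rows, bytes_per_row):
    """Raw data size in decimal terabytes, ignoring indexes and storage overhead."""
    return rows * bytes_per_row / 1e12


print(table_size_tb(1_000_000_000, 10_000))  # 10.0
print(table_size_tb(1_000_000_000, 100))     # 0.1
```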
> This is exactly a great point. When data size goes to a billion rows, Postgres is tough. MongoDB just works without issue.

Is it, though? Maybe 5-10 years ago it was.

It is still true that vanilla Postgres doesn’t scale well beyond a single machine. There are extensions that help, though.
My point is that you can handle a billion rows on a single PostgreSQL instance.
Looking at the sheer number of rows isn’t really helpful - you’d need to know the query profile. Any database can simply store a billion rows.
> This is exactly a great point. When data size goes to a billion rows, Postgres is tough. MongoDB just works without issue.

Personally, I've not seen any application that seriously needs a billion rows in a single table. (except at truly massive scale, but then you're not using Mongo)

The real solution is archiving to a file store like S3 and/or shipping the data off to a data warehouse. You don't need billions of rows in a `record_history`/`user_audit` table going back 5 years in your production database. Nobody queries the data.

Maybe we are the odd ones here, but we need that data at millisecond latency (no, those are not logs; we use ClickHouse for that).

Just wanted to put here that it's possible to scale Mongo to this level.

Disclaimer: I have a lot more experience with postgres than mongo. I have worked with multi billion row databases in postgres. I have not on mongo.

> When data size goes to a billion rows, Postgres is tough. MongoDB just works without issue.

Joins are tough at a billion rows in Postgres. PK lookups and simple index queries, the type Mongo is good at, are generally something Postgres is good at too. The main thing Mongo has over Postgres is ease of sharding if one is looking to scale horizontally.

>> When data size goes to a billion rows, Postgres is tough. MongoDB just works without issue.

Our everyday problems... Tbh when you reach that size you will hopefully already have a dba department no matter what you use.

We don't have a single DBA or DevOps or SRE. MongoDB is really that simple.
What percentage of projects hit a billion rows?

I guess one could write a lot of extra rows to try and get there.

> We successfully scaled MongoDB from a few users a day to millions of queries an hour.

Uh, 1 query per second is 60x60x60=216000... Soo, 1 million queries per hour equals 4-5 queries per second.

Soo, that's not even at toy project level. That's extremely low scale, like the smallest possible instance small.

A consumer laptop does 20k+ queries/second on Postgres, MySQL, etc. A Raspberry Pi usually still gets 1-3k read queries/s, depending on the SD card used (Or 432 million queries per second).

You're not instilling any kind of confidence quoting numbers like that

60x60x60 is 60 hours, not 1 hour. 1 hour is 3600 seconds. Therefore 1 million queries per hour equals ~280 queries per second.
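
The correction can be checked in a couple of lines. A small sketch in Python; it gives an average rate only and assumes traffic is spread evenly over the hour, which real traffic rarely is:

```python
SECONDS_PER_HOUR = 60 * 60  # 3600; 60 * 60 * 60 = 216000 s is 60 hours


def per_second(queries_per_hour):
    """Average queries per second, assuming traffic is spread evenly."""
    return queries_per_hour / SECONDS_PER_HOUR


print(round(per_second(1_000_000)))  # 278
print(216_000 / SECONDS_PER_HOUR)    # 60.0 hours
```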
Oof, you're right. Still within the performance profile of a raspberry pi though, even if it's no longer off by an order of magnitude

So I think my point still stands: that number is as low as you can get for any RDBMS.

TBH we are at around 15M queries per hour. I am sure our customers don't want us to run on an RPi. Btw, it's not only the queries; there are also a billion+ rows.
I expected as much. Usually it's me pointing out that MongoDB is a decent DB depending on the data you're ingesting/storing, and its built-in clustering is significantly better than what Postgres offers at the moment.

But the number was so low I couldn't help but point out that this was more likely to convince me that Mongo is a joke than a usable database.

Your math is wrong… and it cannot be assumed that traffic is uniformly spread.

Finally, what you’re saying is orthogonal to MongoDB - you can self-host Mongo on a Raspberry Pi.