Wow, I had no idea it could store jsonb. Any obvious advantages to using mongodb for nosql that I should consider? I’m way more comfortable with Postgres and would rather stick to that
Keep in mind that JSONB is different from JSON in PG. The latter will not transform the JSON: it stores the text as-is and simply checks that it is valid, and it has limits and performance hits (the text has to be reparsed every time you query into it).
JSONB is stored in a decomposed binary format and will transform your JSON: it drops duplicate keys (the last value wins) and key order might not be preserved. But it's also faster to query and easier to use (IMO).
There is no real reason to use MongoDB tbh. PG JSONB can be handled like any other field, and you can even index into the JSON (partial indexes even: only index rows where a field is present in the JSONB).
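To make the duplicate-key and ordering point concrete, here is a rough Python emulation of what a jsonb column echoes back. It is only a sketch, not how Postgres implements it: `jsonb_like` is a made-up helper, the key order (shorter keys first, then alphabetical) is what jsonb is observed to do for plain ASCII keys rather than something to rely on, and number formatting is not covered.

```python
import json


def jsonb_like(text: str) -> str:
    """Roughly emulate how a jsonb column would echo back JSON text."""

    def normalize(value):
        if isinstance(value, dict):
            # Assumed ordering: shorter keys first, then alphabetical (ASCII keys).
            ordered = sorted(value, key=lambda k: (len(k), k))
            return {k: normalize(value[k]) for k in ordered}
        if isinstance(value, list):
            return [normalize(v) for v in value]
        return value

    # json.loads keeps the last value of a duplicated key, as jsonb does.
    return json.dumps(normalize(json.loads(text)))


raw = '{"bb": 1, "a": 2,   "a": 3}'
print(raw)              # a json column would hand back the text unchanged
print(jsonb_like(raw))  # {"a": 3, "bb": 1}
```

A plain json column keeps the whitespace, the order, and both copies of "a"; the jsonb-style result keeps neither.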
Warning: I haven't used MongoDB in several years, so this info may be outdated!
The main advantage of using JSONB on Postgres over MongoDB is that you can create tables that mix regular fields (varchar, numeric, date, etc.) with JSONB fields.
Then you can do joins of your table with other tables (no need to map/reduce or other insane processing for a simple join).
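A minimal sketch of what that looks like, with the SQL kept in Python strings. Everything named here is hypothetical (the `customers` and `events` tables, the `coupon` key, the `coupon_report` helper), and it assumes a DB-API driver such as psycopg for actually running it. It also includes the partial-index idea mentioned upthread: only rows whose JSONB has the key get indexed.

```python
# Hypothetical schema: regular typed columns next to a jsonb column.
SCHEMA = """
CREATE TABLE customers (
    id   serial PRIMARY KEY,
    name text NOT NULL
);
CREATE TABLE events (
    id          bigserial PRIMARY KEY,
    customer_id integer NOT NULL REFERENCES customers (id),
    created_at  timestamptz NOT NULL DEFAULT now(),
    payload     jsonb NOT NULL
);
"""

# Partial expression index: only rows whose payload has a 'coupon' key.
PARTIAL_INDEX = """
CREATE INDEX events_coupon_idx
    ON events ((payload->>'coupon'))
    WHERE payload ? 'coupon';
"""

# An ordinary join that also reaches into the JSONB field.
JOIN_QUERY = """
SELECT c.name, e.created_at, e.payload->>'coupon' AS coupon
FROM events e
JOIN customers c ON c.id = e.customer_id
WHERE e.payload ? 'coupon';
"""


def coupon_report(conn):
    """Run the join on an open DB-API connection (assumed: psycopg)."""
    with conn.cursor() as cur:
        cur.execute(JOIN_QUERY)
        return cur.fetchall()
```

The join is plain SQL; the JSONB column just shows up as another expression in the SELECT and WHERE clauses.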
That actually sounds really cool. Honestly, even if Mongodb has all those features, I would much rather do everything on postgres seeing as I'm a hundred times more familiar with it than mongo.
I'm a big fan of PostgreSQL, but don't you think MongoDB can be useful when you outgrow a single machine (vertical scaling is not possible anymore) and you need a sharded cluster (and you don't need joins and transactions...)? This is the only situation where MongoDB makes sense, maybe. But even then, I'd probably look at Citus instead.
If you have objects that are happily sharded arbitrarily across machines, don't need joins, transactions, or aggregates calculated on the server, then what are you getting over something like (for example) open-source Redis Cluster?
I ask because, if I spend a few hours designing in advance and write a bit of code, I can get Redis to do much of what I need in such scenarios (counters, aggregates, statistics, indexes, queues, ...). And given that I've written a book on Redis, plus task queues and an object mapper for it, well, I'm going to use that instead (and use some of the public domain / open-source code I've already written).
Also, my work on real Redis transactions (which I've made work across Redis Cluster) means that I don't even need to give up ACID transactions in Redis, regardless of scale.
Once I need more, in the form of post-hoc analysis, joins, group-by, aggregates, etc., at scale, I can either easily export from Redis into logfiles as csv/tsv/json for Spark, Python + Pandas, and/or Redshift if I've got the $, or at the same time just use pgloader into Postgres and live there.
I haven't mucked about with Postgres foreign data wrappers much, but there is a Redis one available, so maybe I can even drop that Redis -> S3/csv/tsv/json step, and get everything I want (direct data structure manipulation in Redis + everything Postgres has).
So yeah. I generally solve my problems with a bit more design in advance, and MongoDB doesn't really have anything to design for/against; you get objects and indexes. Which are usually not as good as Postgres equivalents (Postgres json objects are better than MongoDB, just by themselves, and I'm not the first/only person to say it). And what I get from Redis (raw data structures, 1 million ops/second/core) means that for cases where other folks may use MongoDB, I use Redis. Then I use Postgres for basically everything else.
So yeah, I don't use MongoDB. Postgres for almost everything, and Redis for the cases where Postgres doesn't feel like quite the right fit.
> If you have objects that are happily sharded arbitrarily across machines, don't need joins, transactions, or aggregates calculated on the server, then what are you getting over something like (for example) open-source Redis Cluster?
The only reason I can see to prefer MongoDB over Redis Cluster in this case (no joins, no transactions, no aggregations) is if the dataset doesn't fit in memory. Other than that, I think you're right to prefer Redis.
Your comment is a really interesting comparison of Redis and MongoDB. Never thought about that before. Thanks!
Don't get me wrong, I'm sure there are use-cases for MongoDB for folks that are not me.
Because background is important: my doctorate is in Algorithms and Data Structures, so Redis basically fits the problem-solving algorithms I've been building in my head since before I learned of Redis in 2010. And Postgres (or really any good relational database) is that conceptual next step, which took me the better part of 2 years of daily SQL (after 8+ years of occasional SQL) to really appreciate.
1. MongoDB is different from PostgreSQL in more ways than sharded clusters.
2. Pg has variants (e.g., Postgres-XL), extensions (e.g., CitusDB, as you mention), and methods (e.g., postgres_fdw, PgBouncer, etc.) that let you keep using quite a lot of Pg features with your data horizontally distributed across machines.
3. If you still want automatically managed sharding, there are quite a few databases (SQL: CockroachDB; NoSQL: Cassandra, FoundationDB) better than MongoDB.
jsonb functionality in PG is fantastic. It does everything you'd expect, plus you can do a lot of other nifty things with it that you normally can't do with relational dbs very easily.