by DaiPlusPlus 3218 days ago
> 2. Understood what NoSQL meant and forgot about joins altogether.

How would you represent a simple invoicing system in MongoDB (e.g. Customers + Products + Orders + OrderLineItems)? NoSQL-for-everything advocates posit two solutions: either denormalize the data by embedding Customer information within an Order document, which also contains an array of OrderLineItems, or use a UUID as a kind of foreign key and maintain separate relationships. Both approaches have serious problems: data duplication and inevitable inconsistency in the first, and lack of referential integrity in the second, besides ending up abusing a NoSQL database as an RDBMS. Is there a better way? Or would you agree that certain classes of problems are best left to the RDBMS domain?

3 comments

The example you've used (invoices) is actually quite instructive for demonstrating the benefits of a "document store." An invoice, historically, was a literal printed piece of paper. Invoices are actually really annoying to implement in an RDBMS because of so-called "referential integrity" -- an invoice should be a "snapshot in time" of everything that happened when the order was processed, so ideally, when a user views their invoices from the past 2 years, they look the same every time.

Except, oops, your user got married and moved, now your precious "referential integrity" means jack because the generated invoice is flat-out wrong. Product removed from the store? Too bad, needs to stay in the database forever for historical purposes. Prices need to change? Better design the database to handle snapshots of every product state.

If you were implementing this in MongoDB, you'd probably store a UUID and the flattened data at the time of invoice generation. That way you can still query on ids AND not deal with the headache of having a combinatorial explosion of data in your RDBMS.
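
To make the "UUID plus flattened data" idea concrete, here is a minimal sketch in plain Python of what such an invoice document could look like. The field names (`customer_id`, `items`, `unit_price_cents`, and so on) and the helper `build_invoice_snapshot` are hypothetical, not something from the thread, and the resulting dict is only the shape you might insert into a collection.

```python
import copy
import uuid
from datetime import datetime, timezone


def build_invoice_snapshot(customer, line_items):
    """Return an invoice document that copies customer and product data as of now.

    All field names here are illustrative assumptions.
    """
    items = [
        {
            "product_id": item["product"]["_id"],      # id kept so you can still query on it
            "description": item["product"]["name"],    # copied, never looked up again
            "unit_price_cents": item["product"]["price_cents"],
            "quantity": item["quantity"],
        }
        for item in line_items
    ]
    return {
        "_id": str(uuid.uuid4()),
        "created_at": datetime.now(timezone.utc),
        "customer_id": customer["_id"],
        # Deep copy so later edits to the customer record cannot leak into the invoice.
        "customer": copy.deepcopy(
            {"name": customer["name"], "address": customer["address"]}
        ),
        "items": items,
        "total_cents": sum(i["unit_price_cents"] * i["quantity"] for i in items),
    }
```

If the customer later moves or the product price changes, the stored invoice keeps the old address and price, while `customer_id` and `product_id` still let you find every invoice for a given customer or product.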

You would solve this in an RDBMS the same way: denormalize when you're saving the invoice (example: a line items table with a snapshot of the current item price, description, etc.).
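
A rough sketch of that line items table, using Python's built-in `sqlite3` as a stand-in for whatever RDBMS you actually run. The table and column names and the `add_invoice_line` helper are made up for illustration, and leaving `product_id` without a foreign key constraint is an assumption so that products can be deleted later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    price_cents INTEGER NOT NULL
);
-- Snapshot table: description and price are copied, not joined at read time.
CREATE TABLE invoice_lines (
    invoice_id INTEGER NOT NULL,
    product_id INTEGER,              -- kept for reporting, deliberately no FK
    description TEXT NOT NULL,
    unit_price_cents INTEGER NOT NULL,
    quantity INTEGER NOT NULL
);
""")
conn.execute("INSERT INTO products (id, name, price_cents) VALUES (1, 'Widget', 1000)")


def add_invoice_line(conn, invoice_id, product_id, quantity):
    """Copy the product's current name and price into the invoice line."""
    conn.execute(
        """INSERT INTO invoice_lines
               (invoice_id, product_id, description, unit_price_cents, quantity)
           SELECT ?, id, name, price_cents, ? FROM products WHERE id = ?""",
        (invoice_id, quantity, product_id),
    )


add_invoice_line(conn, 100, 1, 3)

# Later changes to the catalogue do not touch the stored invoice line.
conn.execute("UPDATE products SET price_cents = 1500 WHERE id = 1")
conn.execute("DELETE FROM products WHERE id = 1")

row = conn.execute(
    "SELECT description, unit_price_cents, quantity "
    "FROM invoice_lines WHERE invoice_id = 100"
).fetchone()
```

Here `row` still holds the description, price, and quantity as they were when the line was written, even though the product has since been repriced and removed.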
Yes, which suggests that the "serious problems" mentioned by the grandparent aren't serious (or problems) at all.
In Postgres, you'd simply have a table with a JSON column for the snapshot-in-time contract.

You can then select fields from that JSON for invoices, reports, etc with the arrow operator:

https://www.postgresql.org/docs/9.6/static/functions-json.ht...

With SQL you can denormalize all that (and should) to create that snapshot. But with NoSQL you can't normalize and get back a way to quickly query the number of products sold per month over the last 5 years.
Yes, this is possible with Aggregation and MapReduce: https://docs.mongodb.com/manual/aggregation/
For relative values of "quickly".
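
For reference, the "products sold per month" query from the comments above could look roughly like this as an aggregation pipeline. This is a sketch that assumes invoice documents with a `created_at` date and an `items` array carrying `product_id` and `quantity` (hypothetical field names). The pipeline is only defined as data here, in the form you might pass to PyMongo's `collection.aggregate(...)`. The pure-Python function next to it is there to show the intended result without a server and says nothing about how fast the real thing runs.

```python
from collections import Counter
from datetime import datetime

# Aggregation pipeline (not executed here): one output row per (month, product).
PRODUCTS_SOLD_PER_MONTH = [
    {"$unwind": "$items"},
    {"$group": {
        "_id": {
            "month": {"$dateToString": {"format": "%Y-%m", "date": "$created_at"}},
            "product_id": "$items.product_id",
        },
        "units_sold": {"$sum": "$items.quantity"},
    }},
    {"$sort": {"_id.month": 1, "_id.product_id": 1}},
]


def products_sold_per_month(invoices):
    """Pure-Python equivalent of the pipeline over a list of invoice dicts."""
    totals = Counter()
    for invoice in invoices:
        month = invoice["created_at"].strftime("%Y-%m")
        for item in invoice["items"]:
            totals[(month, item["product_id"])] += item["quantity"]
    return dict(sorted(totals.items()))
```

Restricting to the last 5 years would mean putting a `$match` on `created_at` in front of the `$unwind` stage.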
Instead of nebulous terms like NoSQL you should just look at the damn features, because these concepts are orthogonal. MongoDB has transaction isolation at the document level instead of the database level. If you can store everything in a single document then it doesn't matter. If you can't, then use a database that supports database-level transactions. It doesn't matter if it's a NoSQL or RDBMS database.

I feel a lot of people know that typical NoSQL databases (without database-level transactions) are not suitable for their problem, but they don't know why. They then conclude that NoSQL is always bad and RDBMSs are always better, when really those NoSQL databases are intended for different problems.

Not the original commenter, but there are some valid cases for NoSQL: some people use it for storing massive amounts of web crawling data. But the thing here is that it's throwaway-ish, and in that case it's often not worth it to add structure (even though there pretty much is structure in everything you look at long enough).

But I do think having any data consisting of, say, items, orders, users, payment in MongoDB is very much a bad idea. Been there.