
by scarface74 3367 days ago

> A. Real joins using normalized data

Then you end up denormalizing your data in the business layer anyway, because most real-world applications work with denormalized data and, being a good DDD person, you work with "aggregate roots". With Mongo you can store and load the whole aggregate root as is, without the whole "object-relational impedance mismatch".

> 1. generates a ton of network congestion from getting the data 2. ton of CPU because of JSON decoding/encoding 3. ton of CPU because scripting languages vs optimized C

I don't use scripting languages. I use C#.

> 4. the use of O(shit) algorithms to make the joins

If you're doing a lot of joins with a non-relational database, you're doing it wrong. You should be storing the whole object hierarchy as one document.

If I'm doing joins, I'm using LINQ, which should be doing joins based on hashes.
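
For readers unfamiliar with the idea, a hash-based join can be sketched as below. This is a minimal Python illustration of the general build-and-probe technique, not LINQ's actual implementation; the `hash_join` helper and the sample `users`/`orders` rows are made up for the example.

```python
from collections import defaultdict

def hash_join(left, right, left_key, right_key):
    """Inner-join two lists of rows in roughly O(len(left) + len(right)) time."""
    # Build phase: index the right-hand rows by their join key.
    index = defaultdict(list)
    for row in right:
        index[right_key(row)].append(row)
    # Probe phase: look up each left-hand row's key in the index.
    return [(l, r) for l in left for r in index.get(left_key(l), [])]

# Hypothetical sample data.
users = [{"id": 1, "name": "ann"}, {"id": 2, "name": "bob"}]
orders = [
    {"user_id": 1, "total": 30},
    {"user_id": 1, "total": 12},
    {"user_id": 3, "total": 5},
]

joined = hash_join(users, orders, lambda u: u["id"], lambda o: o["user_id"])
pairs = [(u["name"], o["total"]) for u, o in joined]
```

A naive nested-loop join would compare every user with every order instead, which is the quadratic behavior the quoted complaint is about.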

> 5. inconsistencies when things fail like network or disk or shutdown

Again you should be storing the whole document with the relationships.

> B. Redundant data 1. use a ton of memory/disk 2. create deferred overheads in order to keep things in sync 3. despite that, end up with inconsistent data all over the place

You're doing it wrong: you're using a non-relational database like a relational database.

> 4. won't actually be faster or scalable(!!!) In both cases they end up creating a bottleneck once you start caring about concurrent access not messing up your data.

If you use it like an RDBMS it won't be. But if you're storing all of the related data as a document and using good algorithms when you have to join, it is faster. How much time do OO programmers spend denormalizing data and using bloated ORMs?

1 comments

zepolen 3366 days ago

Sounds like you're married to the fact that data should be an [object].

> How much time do OO programmers spend denormalizing data and using bloated ORMs?

OO programmers that try to make data into objects, sure. Those are the same programmers that love document-based datastores because they don't have to think about their data - they just get a nice "object" which fits their "everything must be an object" programming mindset.

Thing is, data is relational, and data isn't an object.

It's much better to use views of data, e.g. "get me what I need to display a listing page in an eshop", which entails getting the price, photos, title, description, delivery cost etc. in a single list that you can then use. The database can do this for you, and all the underlying details are abstracted away. Better yet, your app server will never have to CARE that at some point product.photos changed from being a simple array of strings to a fully fledged table, because the view remains the same.
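
A small sketch of that idea, using Python's built-in `sqlite3` module. The table, column and view names (`product`, `product_photo`, `listing`) are assumptions made up for the illustration, and a comma-separated string stands in for the "simple array of strings".

```python
import sqlite3

def listing(conn):
    """The only query the app runs; it knows nothing about the tables behind the view."""
    return conn.execute(
        "SELECT title, price, photos FROM listing ORDER BY title"
    ).fetchall()

conn = sqlite3.connect(":memory:")

# Version 1: photos live on the product row as a comma-separated string.
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, title TEXT, price REAL, photos TEXT);
    INSERT INTO product VALUES (1, 'Lamp', 19.5, 'a.jpg,b.jpg');
    CREATE VIEW listing AS SELECT title, price, photos FROM product;
""")
before = listing(conn)

# Version 2: photos move to their own table; only the view definition changes.
conn.executescript("""
    CREATE TABLE product_photo (product_id INTEGER, url TEXT);
    INSERT INTO product_photo VALUES (1, 'a.jpg'), (1, 'b.jpg');
    UPDATE product SET photos = NULL;  -- the old column is no longer used
    DROP VIEW listing;
    CREATE VIEW listing AS
        SELECT p.title, p.price,
               (SELECT group_concat(ph.url, ',')
                  FROM product_photo ph
                 WHERE ph.product_id = p.id) AS photos
          FROM product p;
""")
after = listing(conn)  # same function, same columns, different storage underneath
```

The `listing()` function is untouched by the schema change, which is the point being made: the migration is absorbed by the view definition.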

> Again you should be storing the whole document with the relationships.

Okay, so you're saving the entire list of 'followers' of a user inside the user document.

What happens when you have a really popular user with a million followers...

What do you do when you update a user who happens to follow thousands of other users...

Oh you don't do it like that in that case? I guess... you're now relational.

Denormalization only helps with performance up to a certain degree, and in fact what you save on read access you pay for dearly in update cost or in stupidly huge extra storage/memory.
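
To put a rough number on the million-follower case, here is a back-of-the-envelope size estimate in Python. It assumes the followers are embedded as an array of 12-byte ObjectIds (the thread does not say how they would be stored) and compares the result with MongoDB's 16 MiB limit on a single BSON document; the function name is made up for the example.

```python
def bson_objectid_array_size(n: int) -> int:
    """Approximate bytes needed for a BSON array holding n ObjectIds."""
    # Each element: 1 type byte + key string + 1 NUL terminator + 12-byte ObjectId.
    per_element_fixed = 1 + 1 + 12
    # BSON arrays are documents whose keys are the indices "0", "1", "2", ...
    key_digits = sum(len(str(i)) for i in range(n))
    # 4-byte length prefix at the start, 1 NUL byte at the end of the array.
    return 4 + n * per_element_fixed + key_digits + 1

MAX_BSON_DOCUMENT_BYTES = 16 * 1024 * 1024

size = bson_objectid_array_size(1_000_000)
too_big = size > MAX_BSON_DOCUMENT_BYTES
```

Under those assumptions the follower array alone comes to just under 20 MB, so the document could not be stored at all, before even considering the cost of rewriting it on every new follow.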


scarface74 3366 days ago

> Thing is, data is relational, and data isn't an object

Only if you are more concerned with a data-first approach than with thinking about the domain first.

It's only "much better" until everyone wants their own one-off view that has to be kept in sync with the code, and reverting/branching/versioning has to stay in sync with the hundreds of brittle views, stored procedures, etc. that are being created, instead of treating the data store like a dumb data store. By putting as much logic in the business layer as possible, I can not only branch, version, etc.; I can also completely unit test my whole application and mock out the data store.
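
A minimal sketch of what "mock out the data store" can look like, written in Python rather than C# to keep it short. The `UserStore` interface, the `InMemoryUserStore` fake and the `change_email` rule are all hypothetical names invented for the illustration.

```python
from typing import Protocol

class UserStore(Protocol):
    """The only thing the business layer knows about persistence."""
    def load(self, user_id: str) -> dict: ...
    def save(self, user: dict) -> None: ...

class InMemoryUserStore:
    """Test double: a dict standing in for the real database."""
    def __init__(self) -> None:
        self._docs: dict[str, dict] = {}

    def load(self, user_id: str) -> dict:
        return dict(self._docs[user_id])

    def save(self, user: dict) -> None:
        self._docs[user["_id"]] = dict(user)

def change_email(store: UserStore, user_id: str, email: str) -> dict:
    """Business rule lives in code, so it can be versioned and unit tested."""
    if "@" not in email:
        raise ValueError("invalid email")
    user = store.load(user_id)
    user["email"] = email
    store.save(user)
    return user

# A unit test needs no database at all.
store = InMemoryUserStore()
store.save({"_id": "u1", "email": "old@example.com"})
updated = change_email(store, "u1", "new@example.com")
```

The production code would pass in a store backed by the real database; the business rule itself never changes.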

> Okay so you're saving the entire list of 'followers' of a user inside the user document. What happens when you have a really popular user with a million followers... What do you do when you update a user who happens to follow thousands of other users...

Mongo best practices are to think about the cardinality of your relationships - one to few, one to many, or "one to quintillions" - and choose whether to embed based on that.

But going back to thinking in terms of DDD and aggregate roots: a user would be an entity, and an address of the user would be a value type that belongs to the user, so embed it into the document. A user's followers would be entities, related objects that should be able to change independently. In that case use Mongo references.
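
The embed-versus-reference split described here might look roughly like this. Plain Python dicts and lists stand in for MongoDB collections, and every field and function name (`address`, `follows`, `follow`, `followers_of`) is an assumption made for the sketch, not a MongoDB API.

```python
# "users" collection: the address is a value that belongs to the user, so it is embedded.
users = {
    "u1": {"_id": "u1", "name": "alice", "address": {"city": "Atlanta", "zip": "30301"}},
    "u2": {"_id": "u2", "name": "bob", "address": {"city": "Austin", "zip": "73301"}},
    "u3": {"_id": "u3", "name": "carol", "address": {"city": "Boston", "zip": "02101"}},
}

# "follows" collection: followers are entities in their own right, so each
# relationship is a small document that references users by _id.
follows = []

def follow(follower_id: str, followee_id: str) -> None:
    """Adding a follower never rewrites either user document."""
    follows.append({"follower": follower_id, "followee": followee_id})

def followers_of(user_id: str) -> list:
    """Resolve the references; a real store would use an index, not a scan."""
    return [users[f["follower"]]["name"] for f in follows if f["followee"] == user_id]

follow("u2", "u1")
follow("u3", "u1")
alice_followers = followers_of("u1")
```

Note that resolving the references is an application-side join, which is where the two sides of this thread disagree about who should be doing that work.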

> and all the underlying details are abstracted away. Better yet, your app server will never have to CARE that at some point product.photos changed from being a simple array of strings to a fully fledged table, because the view remains the same.

If I'm changing the representation of the data, I still have to change it somewhere, whether in the database (where you don't have proper versioning, branching, source control) or in the app server. If you're changing the representation in the app server in more than one module/microservice, you're doing it wrong. With a microservice/module, I can still represent the data as old clients expect by versioning the API, and I have much better tooling than having five different versions of the view that never die.


marktangotango 3363 days ago

You are wrong, pure and simple. Please post back here in 5 or 10 years, let us know how that all worked out for you.
