by ivraatiems 1387 days ago

I'm going to assume that by "foreign keys", you mean "foreign key constraints" where the DB itself is insisting on particular relationships.

There are a few different schools of thought. I will list them, but the important thing to remember is not to be dogmatic. They are all right or wrong depending on your circumstance.

One school of thought says "I want all data in my DB to be normalized. I want it to be right when it goes in so it never breaks the application layer." That school would say foreign key constraints are critical.
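To make that first position concrete, here is a minimal sketch of the database itself refusing a row that points at a missing parent. It uses Python's built-in `sqlite3` purely for illustration; the `customers` and `orders` tables and the helper `try_insert_orphan` are made-up names, not anything from a real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite ships with foreign key enforcement off; it must be enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customers(id))"
)
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'alice')")
conn.execute("INSERT INTO orders (id, customer_id) VALUES (10, 1)")  # parent row exists


def try_insert_orphan(conn):
    """Hypothetical helper: attempt to insert an order for a customer that does not exist."""
    try:
        conn.execute("INSERT INTO orders (id, customer_id) VALUES (11, 999)")
        return True
    except sqlite3.IntegrityError:
        # The constraint fires inside the database, whatever the application intended.
        return False


orphan_accepted = try_insert_orphan(conn)
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

The bad row never lands, so nothing downstream has to cope with an order that has no customer.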

Another school of thought says "I want all the data in my DB to be retrieved and inserted quickly. I want the application layer to do any error-checking that is necessary, or, I want to be in a situation where I can always fail gracefully if there are errors in data validation."
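A rough sketch of what that second position can look like in practice: the schema carries no constraint, and the application checks the relationship and returns a failure instead of throwing. This is again Python with `sqlite3` as a stand-in; `place_order` and the table names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY)")
# Deliberately no REFERENCES clause: the database will accept any customer_id.
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER)")
conn.execute("INSERT INTO customers (id) VALUES (1)")


def place_order(conn, order_id, customer_id):
    """Hypothetical app-layer check: validate the parent row, then insert.

    Returns (ok, message) so the caller can fail gracefully.
    """
    parent = conn.execute(
        "SELECT 1 FROM customers WHERE id = ?", (customer_id,)
    ).fetchone()
    if parent is None:
        return (False, f"unknown customer {customer_id}")
    conn.execute(
        "INSERT INTO orders (id, customer_id) VALUES (?, ?)", (order_id, customer_id)
    )
    return (True, "ok")


good = place_order(conn, 10, 1)
bad = place_order(conn, 11, 999)
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

The catch is that a check followed by an insert is not atomic on its own: a concurrent delete of the customer between the two statements would still leave an orphan unless the application also handles locking or transaction isolation.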

Still another school of thought says "I don't trust those programmers to write good application code, so I will insist on normalized data for that reason," and yet another says, "I don't have control over the DB schema, that's some DBA's job, so I will just do all my validation in-app."

The point being, there are tradeoffs either way you go.

Personally, I typically would rather have the application layer do the validation and even the joining of data, a lot of the time, when the application is high-volume. At the volumes my organization works with, it is very difficult to write performant SQL queries that use JOINs and other relationships as a developer - even as a DBA! - and often much easier, for me, to write performant application code. The DB is also a pet with many owners, whereas the infra for my applications is owned by my team. So, it's better for me to do relationship validation in code myself. (We also do not use a heavy ORM, again for performance reasons. Just Dapper.)
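For readers who have not seen "joining in the application" before, here is a small sketch of the idea: fetch each table with a plain single-table query, then match rows in memory with a dictionary lookup. It is Python with `sqlite3` for illustration only (the setup described above uses Dapper), the tables and function names are invented, and it says nothing about which approach is faster at any given volume.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
INSERT INTO customers VALUES (1, 'alice'), (2, 'bob');
INSERT INTO orders VALUES (10, 1, 5.0), (11, 1, 7.5), (12, 2, 3.0), (13, 99, 1.0);
""")


def join_in_sql(conn):
    """Let the database do the inner join."""
    return conn.execute(
        "SELECT o.id, c.name, o.total"
        " FROM orders o JOIN customers c ON c.id = o.customer_id"
        " ORDER BY o.id"
    ).fetchall()


def join_in_app(conn):
    """Two single-table reads, then an in-memory hash lookup per order."""
    names = dict(conn.execute("SELECT id, name FROM customers"))
    joined = []
    for order_id, customer_id, total in conn.execute(
        "SELECT id, customer_id, total FROM orders ORDER BY id"
    ):
        if customer_id in names:  # inner-join semantics: orphaned orders are skipped
            joined.append((order_id, names[customer_id], total))
    return joined


sql_rows = join_in_sql(conn)
app_rows = join_in_app(conn)
```

Both functions return the same three rows here. The application version pulls every customer and every order over the wire to get them, and note that order 13 points at a customer that does not exist, which is exactly the kind of row an unconstrained schema allows and application code then has to decide how to treat.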

At my previous job, the situation was the opposite - we usually weren't under enough load for query performance to matter, we had Entity Framework building relationships, and EF will blow up if you ask it to build relationships where none exist. So, we needed more normalized data, and that was what we went for. But even then, not in every situation.

3 comments

jve 1387 days ago

> At the volumes my organization works with, it is very difficult to write performant SQL queries that use JOINs and other relationships as a developer - even as a DBA! - and often much easier, for me, to write performant application code.

How can this possibly be true? Won't that result in sending unnecessary data over the wire, stressing network and SQL buffer?

What are these queries and what are these volumes? I just can't wrap my head around the performance statement. Do you know better join algorithms than the ones SQL Server is capable of (Loop/Hash/Merge)? Given that you have appropriate indexes in place, perhaps query hints or plan guides are what you need, if you know a thing or two more about your data and its distribution than the SQL statistics do.