Hacker News new | ask | show | jobs
by 10x-dev 1375 days ago
Are joins in a 5NF database now as fast as querying a denormalized database?
5 comments

Maybe the joins are faster? It's really hard to tell without more context/detail.

I think many were burned by mysql back in the day - trying to use sql as a document database - or using php frameworks that happily did a hundred queries pr page view.

As a general rule of thumb, for a REST app - I'd say the db should be normalized, and the cache layer(s) can handle the denormalization.

Ie when you get /page=1 varnish can spit out a response from ram (which if you squint, is a denormalized projection of your data), or it can go talk to your app, that talks to the db. And the latter is most likely fast enough (tm).

Maybe I'm missing some context, but isn't that true by definition even if the db does nothing special? You either spend time sending N queries and waiting for responses, or join and use one query. Given actually matching scenarios for both, the one with less communication overhead wins.
In a normalized database that's true, but in a denormalized database, by definition, you get a third option, which is to have tables with redundant data that can be returned in a single query (as if it were pre-joined, I suppose).
Depends. Denormalized means the database contains redundant data. If a query have to scan 10x or 100x as many rows due to redundant data, it is obviously going to be slower. But it is hard to say anything general since denormalization will make some queries faster and other queries slower.
with good index you will not scan more rows.

But each query will use a different copy of the same data instead of joining with the same copy.

Storing both copy in memory take more space so you can’t cache as much in memory.

I’m not talking redis or memcached but the page cache inside the sql engine.

they always been faster! When you have 5NF, the database is smaller and all the row you join will be in memory in the SQL server PageCache.

While when using denormalized database, your read will have to go to the disk.

Seq scans will be faster in a normal form database, if you're seq scanning then joining other tables on an index it might be faster. Otherwise the denormalized table will probably be faster.