Hacker News new | ask | show | jobs
by abhinav22 1773 days ago
SQL is an important skill to have, in particular one should learn about database normalization and the normal forms - very important stuff!
3 comments

And even if you don't want to normalize, at least don't make the DB basically incomprehensible.

Speaking from experience, a "abnormalized" db sucks. https://news.ycombinator.com/item?id=27842820

Yeah one should know why it’s important to normalize, and then when they go a different path, it is done consciously and not because they are inexperienced and didn’t really know any better.

Folks out there who don’t want to learn the ins and outs of database theory - please don’t; read up on it, it’s a very important skill & knowledge to have and doesn’t take long to master. It does take a few days, perhaps weeks, for all the ideas to “click” in your head.

Whilst we having many fads in IT, there are certain sound principles of computer science developed over 40-50 years which one should learn :-) Know the rules before you break them!

For Databases, it takes me a concious effort to not go to at least 3rd NF.

Except for addresses/names/telephone numbers, I have been burned by ANY assumption I made. Those, I tend to just store as something like char(255) nowadays by default, unless business requirements force me to split it up, but even then, after voicing my concerns.

Edit: reading up on the addresses falsehoods again, 255 may not be enough.

https://www.kalzumeus.com/2010/06/17/falsehoods-programmers-...

https://github.com/google/libphonenumber/blob/master/FALSEHO...

https://www.mjt.me.uk/posts/falsehoods-programmers-believe-a...

If you use Postgres or SQLite, TEXT is preferable for strings.
I just wanted to give a easily understood example. Specifically to point out that even limiting the length can make problems.
There's reasons for one wide table.

The traditional SQL database uses row based data structures.

When you shift to columnar data formats for the table, parquet or any columnar DB, the normalization rules which were developed for row based become extremely different.

The brave new world now is in-memory DBs.

All those 1970s rules, which make sense for row based tables stored on disk, don't apply to data in RAM.

At all.

So it's going to get very interesting.

Shouldn't the normalization rules be exactly the same whether the data is stored row-oriented or column-oriented and on disk or memory?
This is a common misconception.

Columnar stores do indeed enable wide tables but they don’t replace the need for normalization or the benefits. Wide tables and normalization (or denormalization) solve different problems.

From the perspective of a database the difference between RAM and disk is latency. Normalization is still a factor in query performance.

> There's reasons for one wide table.

There are indeed. It should still be a conscious decision. For the DB I work with, it wasn't.

SQL is a query language. Data normalization is a relational design methodology. They aren’t actually related except that relational databases often use SQL and benefit from normalization. You can normalize data models without SQL and you can use SQL to interact with denormalized data.
All of it is completely irrelevant unnecessary complexity. Learn to use Tinkerpop.
Curious that 99.99% of applications use an SQL database, must just be a lot of really bad developers. Am I so out of touch? No, no, it’s the children who are wrong.
You can patronize me all you want, it does not rectify the fact that you are out of touch with modern database technologies and prefer what you know works.