Hacker News new | ask | show | jobs
by SigmundA 2594 days ago
It is a correctness vs performance trade-off. I typically use foreign keys and the type system as much as possible and back-off when needed for performance or certain cases of flexibility.

Unfortunately unlike compiled typed languages the database foreign keys are just straight up runtime overhead.

Using types correctly such as storing numbers and dates and UUID's in the actual database type rather than strings is an obvious one that typically improves space and performance, there is normally not much trade off here. Varchar with specific length I go back and forth on, it has saved me off on and catching something not validated for length properly in the app, but does get in the way when needed to expand length which happens quite often.

For coded values I like a code table per code type with foreign keys rather than say EAV, this has saved me many times in data quality and allows extra attributes to be added to specific types. It matches more closely to the type system in the client too which will have a specific type for a specific coded value. It also allows reflection like qualities where you can query the schema and follow the types. The downside is in a large app you can have many hundreds or more of code tables and its more work to cross cut features, but the benefits outweigh the downsides for me. this also goes for more full blown "entities" which are more obvious.

I don't normally use more complicated constraint or triggers, the tend to get in the way things like ETL loads and many complex rules end up working out much better in the app code. I am torn about it, but the procedural languages in the DB is usually less expressive with less reuse and can't do the other things needed such as alerting messaging etc.

Bottom line though don't be dogmatic, use the database as you can, test performance make the trade-offs. Sometimes constraints and triggers are better due to their locality to the data, sometimes leaving off foreign keys is better due to the overhead. Sometimes EAV is better for flexibility (or JSON).

In an ideal world we would run the same code in any tier(client, app server, database) for this stuff choosing the best location to run it based on the needs (or run it in multiple tiers, client for user experience, database for transactional correctness). You can almost get there with javascript an PL/V8 now days, and sort of with .Net and Sql Server, but its really not there yet.