Hacker News new | ask | show | jobs
by arp242 748 days ago
You don't know what the "right indexes" are, because sometimes "no index" is the "right index". Sometimes because a full table scan is faster. Sometimes because you're okay accepting the various performance trade-offs (e.g. insert speed vs. update speed, storage space on disk).

Many applications don't have tests for every single last trivial SQL query, and adding those just because the SQL server may decide to bail out because it might perhaps possibly could be 100ms slower is not a good way for most teams to spend their time.

In the end it's just trading one confusion for another confusion. But the current confusion has a lot less complexity overall, so that clearly the "better" one IMHO.

2 comments

> Just to turn off the possibility of table scans on that individual table in all environments

I do not remember the name but there was a sorta pg-compatible database started a few years ago that intentionally did not support non-indexed queries and if you tried a query for which a index was available it would create it before running the query (and keep it live for some time).

I thought it was an interesting tradeoff (even if likely not the one I would choose) a signigicant lag spike followed by better performance rather than a uniform degradation over time.

...yet people flock to NoSQL to avoid the complexity of having to learn SQL. Or stick JSON into a single table in SQL, because using more than one table is too complex. And so on. (This is my context where I work, if you don't have to deal with this and people around you happily embrace and learn the footguns and complexities of SQL then good for you..)

Perhaps you like this idea better, a table-specific hint:

    create table MyTable (

    ) with (assume_infinitely_large=on)
Just to turn off the possibility of table scans on that individual table in all environments. That way you don't have to do this in ALL situations, only the ones where you know that full table scans in production will be out of the question.

> any applications don't have tests for every single last trivial SQL query, and adding those just because

Another perspective here though:

If code does not have test coverage, what really happens is the code gets tested (or "hardened") in production. And isn't it then better to have things crash straight after deploy, than to have it gradually and silently degrade as you get more users?

> people flock to NoSQL to avoid the complexity of having to learn SQL

That is not my impression. It went through a hype cycle as many things do, before it settled down to "where it makes sense", which was quite a while ago. And sticking JSON in SQL can be perfectly fine.

"NoSQL" is not magic, and requires just as much tinkering if you have large amounts of data to get a decent performance. Or it has performance characteristics geared towards very specific operations.

And none of what you're proposing will fix anything about the difficulties of running a (SQL) database; it will only make things more complex, error-prone, and difficult.

> If code does not have test coverage, what really happens is the code gets tested (or "hardened") in production. And isn't it then better to have things crash straight after deploy, than to have it gradually and silently degrade as you get more users?

Of course not. What a silly thing to say. Deploy to production → run migrations that are not easily reversible → SQL refuses to run "because bruh huh" → customers angry because downtime → your day is well fucked → fix issue → look up the ugliest words in a dictionary for the fucking idiot cunts who made your application crash even when it could have worked → double-check dictionary again to make sure you haven't missed any words.

Have you even run a production service? With users? Who will should at you if it doesn't work because their business is on its arse? And having to scramble to fix it? Perhaps at 4am?

And things don't "silently degrade" if you monitor it, which you should do for serious services anyway as I mentioned. PostgreSQL has pretty good facilities for this built-in, but it's easy enough to collect metrics in the application.

Or users report "it's slow" and then you investigate. Or you get errors in your error log because things time out. Or your server's CPU is pinned to 100%. You can get by even without directly monitoring the DB.

> Have you even run a production service?

Yes, for many years I was in the core team responsible for a service important enough in my country that if it's down for 30 minutes it makes the national newspapers. Some million users.

And main lesson from that experience is: If you are going to fail, make sure you fail as fast as possible. Then failure happens during work hours and you can usually do a simple rollback (1) to the previous version of the service -- sometimes that rollback will even happen automatically if the failure happens quickly enough.

The worst cases and longest downtimes came from performance problems and/or suddenly changing query plans that only crept up on us slowly and perhaps hit during traffic spikes (which in our case would happen during holidays).

--

(1) Yes I know you said in your example you did something non-reversible in between. But our rollouts would often be through flags and % of traffic, not so much code version. Also, in practice with our traffic volumes, either failure would be soon enough that you didn't have time to do that other non-reversible thing in between before you went down, OR if it happens "seldom" it can just be down until you are able to roll forward; still less disruptive to get the problem right away than to suddenly get it after a year.

I guess YMMV. Again what I'm proposing is an optional hint, so if you don't do gradual rollout of traffic on new features, if you don't have high test coverage, etc etc one could simply not use it.

But I know for sure it would be useful in our specific context.