| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by wulczer 5515 days ago

That would be nearly useless in my opinion.

If you have a table with four rows in it, would you like the database to give you errors when you do a full scan on it?

What you want is an option to throw errors when a query takes too long to execute and most products already have this in the form of a statement timeout parameter.

On a separate note, the balance between "late" optimization and "premature" optimization is thin indeed... These test systems need to run with the same data and on the same hardware as your main database to give meaningful results, so it's generally unavoidable to do some kind of performance tweaking on the already deployed code. And of course as the data distribution and its amount changes, you need to keep on optimizing...

1 comments

bryanmig 5515 days ago

The point is that when your table has 4 rows in it, it doesn't matter if you do a full table scan. Wait until you deploy your app and that table now has 4,000,000 rows in it.. then you have a problem.

In MongoDB (and maybe others too) there exists an option to fail when doing a table scan so that you know immediately while developing that there exists a potential for things to go awry at some later point. You can fix that now by either re-thinking your data structure, re-thinking your query, or adding appropriate indexes.

MySQL's slow query log is good for reactive development instead of proactive development.

I think its an indispensable feature.

link

St-Clock 5515 days ago

Errr... With a typical RDBMS, doing a full table scan depends on the current stats (e.g., number of rows, are the stats current, state of the index, possibility to use an index, presence of cached data, previous query performance etc.). It does not depend only on the query and the data structure, so it is near impossible at development time to be certain that a table scan will occur during production. In fact, a table scan might occur today and not tomorrow for the same query, but with different data.

Many developers have been burned by trying to optimize their queries and indexes too early. The query planner, although sometimes unpredictable, is generally better than developers at predicting the performance of a particular execution plan. Table scans are also preferable in certain situations.

Relying on the fact that RDBMS have sophisticated query planner (as opposed to MongoDB) is not a sign of "reactive development". It just means that you should test your queries (and your indexes) with close-to-production data, and, as the data grows and evolves, continue to test, because the best execution plan might change.

link

silon 5515 days ago

Slower query using indexes is better than randomly using table scans depending on statistics, ...

At least for interactive apps, NoSQL wins.

link

wulczer 5515 days ago

There are cases where a full table scan is faster than an index scan, and there are cases where the index scan is faster. There are even cases where the only option is a full table scan!

I still don't see what a setting that gives errors when a particular access method has been chosen would be good for.

You don't care about the plan, you care about how fast the query runs.

You'll always have to do reactive development when trying to get good performance. There is only so much simulating and testing you can do and real data volume and distribution changes all the time.

link