Hacker News new | ask | show | jobs
by tomnipotent 1902 days ago
> Depends on estimated selectivity

This can't be determined with LIKE suffix wildcards and that's not how any of the commonly-used index data structures work (b-tree, hash, gist, or bitmap). Index metadata will not help in eliminating leaf pages, and every row is going to need to be scanned.

1 comments

Yes every row of the index needs to be scanned not every row of the table which is faster than scanning the table.

I am most familiar with MS SQL server and it will most certainly do an index scan for what it thinks is a highly selective predicate with "suffix wildcards" and it can return results faster than scanning the table.

If the index covers the result columns it will scan the index and never touch the table otherwise it will do a key lookup to the table.

B-tree's do not work that way. They are inherently ordered, and contain min/max that help to determine if you can skip the page for a given condition. The min/max cannot be used for suffix wildcards.

Unless the index contains all the columns you're dealing with, the optimizer will determine that just scanning the table will cost less than scanning an index AND then looking up the data in the table (bookmark lookups in MSSQL).

B-trees have little to do with it, if the table has many columns its cheaper to scan the index for the value because it occupies less pages, thats all, less I/O more cache hits etc, goes from top to bottom on the index scanning for the result. This is the distinction between scan and seek.

I just ran a common one I see and yep MS SQL is still doing a index scan then key lookup to get result with a select * from table where col LIKE '%abc' type query.