Hacker News new | ask | show | jobs
by Groxx 1404 days ago
While I generally wonder the same question... I've seen basically every database follow this pattern. There are fairly frequent scenarios where doing a couple separate indexed queries performs better than an equivalent join. Particularly when queries get kinda-large and up.

I can certainly see it when running into memory limits per table, because doing the full join might require more memory at once, but it frequently happens far below where I'd expect that to occur (in the low thousands of rows). Dunno why. Maybe it uses caches more efficiently? Many simple operations are often more cache-able than a few complex ones.

1 comments

Maybe disk spill because some joins require a superlinear amount of space and doesn’t fit in memory buffers?