by mixedbit
4858 days ago
But I suspect the join must have been over an indexed column, so it did not touch all 4bln rows; otherwise 2-3 seconds would be hard to believe. The group-by query in the article must access all 3bln rows, which makes a huge difference.
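To make the difference concrete, here is a minimal sketch, not the article's actual queries: a Python dict stands in for an index, and the table, column layout, and function names are all made up for illustration. The indexed join touches one row per requested key, while the group-by has to visit every row.

```python
from collections import defaultdict

# Hypothetical toy table: (id, category, value). Not the article's schema.
rows = [(i, i % 5, i * 10) for i in range(1000)]

# A dict keyed on id stands in for a database index on that column.
index_by_id = {row[0]: row for row in rows}


def indexed_join(keys):
    """Look up each key through the index; work is proportional to len(keys)."""
    touched = 0
    matched = []
    for key in keys:
        row = index_by_id.get(key)
        touched += 1
        if row is not None:
            matched.append(row)
    return matched, touched


def group_by_sum():
    """Sum value per category; every row in the table must be read."""
    totals = defaultdict(int)
    touched = 0
    for _id, category, value in rows:
        totals[category] += value
        touched += 1
    return dict(totals), touched


joined, join_touched = indexed_join([3, 7, 42])
totals, scan_touched = group_by_sum()
```

Here `join_touched` grows with the number of keys looked up, while `scan_touched` always equals the table size, no matter how few groups come out.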
I remember it well, because I was trying to explain why having tens of gigabytes of indexes wouldn't help them much if they only had 16 GB of RAM.
In terms of group-by performance, it depends a lot on the kind of data and how it's stored. For example, taking a sum on a columnar store is quite amenable to parallelization, and a lot of databases will do it that way.
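As a rough sketch of why a columnar sum parallelizes well (an illustration only, not how any particular database implements it): the column is one contiguous array, so it can be cut into independent slices, each slice summed separately, and the partial sums added at the end. The names `chunked_sum` and `amounts` are made up for this example, and CPython threads show the shape of the split rather than a real speedup.

```python
from array import array
from concurrent.futures import ThreadPoolExecutor


def chunked_sum(column, workers=4):
    """Sum one contiguous column by splitting it into per-worker slices."""
    n = len(column)
    step = max(1, -(-n // workers))  # ceiling division, at least 1
    bounds = [(start, min(start + step, n)) for start in range(0, n, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each slice is summed independently; no slice depends on another.
        partials = pool.map(lambda b: sum(column[b[0]:b[1]]), bounds)
        # Combining the partial sums is the only sequential step.
        return sum(partials)


# Hypothetical column of 64-bit integers, stored contiguously.
amounts = array("q", range(1_000_000))
total = chunked_sum(amounts)
```

A row-oriented layout would need to pull whole rows through memory just to read one field, which is part of why the storage format matters so much here.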