|
|
|
|
|
by Sirupsen
1292 days ago
|
|
I've added this to the article, thanks! Composite index (int64, int64): ~70 MiB in Postgres, ~350 MiB in MySQL Single index (int64): ~70 MiB in Postgres, ~240 MiB in MySQL If you assume the majority of an index are index entries of (int64, int64, int64) where the third number is some sort of identifier for the record on the heap, you'd expect this to be ~230 MiB. So Postgres does some compression very well here, and MySQL has a bit more overhead for its indexes it seems. |
|
https://www.postgresql.org/docs/current/btree-implementation...
> 67.4.3. Deduplication
> A duplicate is a leaf page tuple (a tuple that points to a table row) where all indexed key columns have values that match corresponding column values from at least one other leaf page tuple in the same index. Duplicate tuples are quite common in practice. B-Tree indexes can use a special, space-efficient representation for duplicates when an optional technique is enabled: deduplication.
> Deduplication works by periodically merging groups of duplicate tuples together, forming a single posting list tuple for each group. The column key value(s) only appear once in this representation. This is followed by a sorted array of TIDs that point to rows in the table. This significantly reduces the storage size of indexes where each value (or each distinct combination of column values) appears several times on average. The latency of queries can be reduced significantly. Overall query throughput may increase significantly. The overhead of routine index vacuuming may also be reduced significantly.