by mattashii 1965 days ago
Not necessarily _as effective_, but it will still help a lot. NULL tuples pre-pg13 take ~14 bytes each, and 18 bytes when aligned (= 2 (ItemId, location on page) + 6 (TID) + 2 (t_info) + 4 (NULL bitmap) + 4 bytes alignment). When deduplication is enabled for your index, your expected size per tuple drops to just a bit more than 6 bytes (~50 TIDs* in one tuple => (2 (ItemId) + 6 (alt TID) + 2 (t_info) + 4 (NULL bitmap) + 50 * 6 (heap TIDs)) / 50 => ~6.28 bytes/tuple).

So deduplication saves some 65% of the index size for NULL-only index tuples, and the remaining 35% can be saved by using a partial index (so, in this case, deduplication alone could have saved 13GB).

*note: last time I checked, REINDEX with deduplication enabled packs 50 duplicates in one compressed index tuple. This varies for naturally grown indexes, and changes with column types and update access patterns.
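
For anyone who wants to check the arithmetic, here is a rough sketch in Python. It uses the byte counts exactly as written in this comment (including the 2-byte ItemId) and the 50-TIDs-per-tuple figure from the note. The variable names are made up for illustration and are not PostgreSQL identifiers.

```python
# Byte counts as stated in the comment (illustrative names, not PostgreSQL's).
ITEM_ID = 2          # line pointer on the page, as assumed in the comment
TID = 6              # heap tuple identifier
T_INFO = 2           # index tuple info field
NULL_BITMAP = 4      # NULL bitmap
ALIGN_PADDING = 4    # alignment padding for a plain tuple
TIDS_PER_TUPLE = 50  # duplicates packed per deduplicated tuple; varies in practice

# One index tuple per heap row (pre-pg13, no deduplication).
plain_bytes = ITEM_ID + TID + T_INFO + NULL_BITMAP + ALIGN_PADDING

# One deduplicated tuple holding 50 heap TIDs, amortized per heap row.
dedup_tuple_bytes = ITEM_ID + TID + T_INFO + NULL_BITMAP + TIDS_PER_TUPLE * TID
dedup_bytes_per_row = dedup_tuple_bytes / TIDS_PER_TUPLE

# Fraction of index space saved by deduplication alone.
saved = 1 - dedup_bytes_per_row / plain_bytes

print(plain_bytes)              # 18
print(dedup_bytes_per_row)      # 6.28
print(round(saved * 100, 1))    # 65.1
```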

1 comment

heh, my calculation was incorrect: ItemID is 4 bytes in size, so the calculations are slightly off:

pre-13 was 16 bytes each (20 when 64-bit compiled), and post-13 it is 6.32 bytes/heap tuple when deduplication has kicked in.
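
A quick sketch of the corrected numbers, with the 4-byte ItemId. It follows the simplified model from this thread: a 12-byte tuple (TID + t_info + NULL bitmap), rounded up to the platform's maximum alignment, plus 6 bytes per extra heap TID in a deduplicated tuple. The alignment values of 4 and 8 bytes for 32-bit and 64-bit builds are an assumption, the function names are hypothetical, and the real on-disk layout may differ in detail.

```python
def maxalign(size: int, alignment: int) -> int:
    """Round size up to the next multiple of alignment."""
    return -(-size // alignment) * alignment


def bytes_per_heap_tid(n_tids: int, alignment: int, item_id: int = 4) -> float:
    """Amortized index bytes per heap row under the thread's simplified model."""
    header = 6 + 2 + 4                          # TID + t_info + NULL bitmap
    posting = 6 * n_tids if n_tids > 1 else 0   # heap TID list when deduplicated
    return (item_id + maxalign(header + posting, alignment)) / n_tids


print(bytes_per_heap_tid(1, alignment=4))    # 16.0  (pre-13, assumed 32-bit alignment)
print(bytes_per_heap_tid(1, alignment=8))    # 20.0  (pre-13, assumed 64-bit alignment)
print(bytes_per_heap_tid(50, alignment=8))   # 6.32  (50 TIDs per deduplicated tuple)
```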