by ngrilly
3575 days ago
I have an unrelated question :-) I read a presentation titled "Powering Heap" by Dan Robinson, Lead Engineer at Heap, which contains interesting info about how you use PostgreSQL. [1]

At Heap, do you try to keep rows belonging to the same customer_id contiguous on disk, in order to minimize disk seeks? If yes, how do you do it? Do you use something like pg_repack? If no, don't you suffer from reading heap pages that contain only one or a few rows belonging to the requested customer_id?

[1] http://info.citusdata.com/rs/235-CNE-301/images/Powering_Hea...
|
Currently, maintaining the clustering has only been best-effort. We sort our data whenever we copy it from one location to another and the data comes in sorted by time, so it's fairly easy to maintain a high row correlation with time.
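
To make the page-scatter concern from the question concrete, here is a rough Python sketch (a toy model, not Heap's actual pipeline). It assumes a hypothetical fixed page capacity of 100 rows and a sort key of (customer_id, time), neither of which comes from the thread, and counts how many pages hold at least one row for a given customer before and after sorting.

```python
import random

ROWS_PER_PAGE = 100  # hypothetical page capacity, chosen only for illustration


def pages_touched(rows, customer_id, rows_per_page=ROWS_PER_PAGE):
    """Count distinct pages that hold at least one row for customer_id.

    rows is a list of (customer_id, time) tuples in physical order;
    row i is assumed to live on page i // rows_per_page.
    """
    return len({
        i // rows_per_page
        for i, (cid, _time) in enumerate(rows)
        if cid == customer_id
    })


def make_rows(n_customers=50, rows_per_customer=200, seed=0):
    """Build rows for every customer and shuffle them to mimic an unclustered table."""
    rng = random.Random(seed)
    rows = [(cid, t) for cid in range(n_customers) for t in range(rows_per_customer)]
    rng.shuffle(rows)
    return rows


scattered = make_rows()
# Assumed sort key: (customer_id, time). Sorting on copy rewrites rows in this order.
clustered = sorted(scattered)

print("scattered:", pages_touched(scattered, customer_id=7))
print("clustered:", pages_touched(clustered, customer_id=7))
```

In this toy model the sorted copy packs a customer's 200 rows into 2 pages, while the shuffled layout spreads them over many more. New rows appended after the copy would erode that over time, which is why such clustering is only best-effort.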