Hacker News new | ask | show | jobs
by chris_wot 4073 days ago
No clustered index? What about CLUSTER?

http://www.postgresql.org/docs/9.4/static/sql-cluster.html

CLUSTER instructs PostgreSQL to cluster the table specified by table_name based on the index specified by index_name. The index must already have been defined on table_name.

When a table is clustered, it is physically reordered based on the index information. Clustering is a one-time operation: when the table is subsequently updated, the changes are not clustered. That is, no attempt is made to store new or updated rows according to their index order. (If one wishes, one can periodically recluster by issuing the command again. Also, setting the table's FILLFACTOR storage parameter to less than 100% can aid in preserving cluster ordering during updates, since updated rows are kept on the same page if enough space is available there.)

When a table is clustered, PostgreSQL remembers which index it was clustered by. The form CLUSTER table_name reclusters the table using the same index as before. You can also use the CLUSTER or SET WITHOUT CLUSTER forms of ALTER TABLE to set the index to be used for future cluster operations, or to clear any previous setting.

1 comments

Yes you can manually re-order a table, with locking, but that's really not the same thing as enforcing in on INSERT/UPDATE.
Clustered index is something it would be nice for Postgres to have, but it would be a lot of work to implement and it's certainly not always a performance win. Looking up by that index is very fast, but looking up by a secondary index may have to traverse a second btree (it cannot point directly at data on disk because its location is tied to the clustered index). And of course there's quite a bit of overhead for transactions and modification operations.

In practice, you can recover many of the advantages of clustered indexes in PostgreSQL (along with the disadvantages) with a covering index eligible for https://wiki.postgresql.org/wiki/Index-only_scans.