by dspillett 2950 days ago

This is not the same thing: it physically rearranges the existing table into the order of an index as a one-off process, and it needs to be repeated whenever the data has changed substantially.

With a true clustered index, the clustering property is maintained as far as possible during normal operation (it can get somewhat fragmented in the presence of random data), so you don't need a full rebuild every now and then to keep the benefits for new data.
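To make the distinction concrete, here is a toy Python sketch. It is not how any real database engine stores rows: `HeapTable`, `ClusteredTable`, and `is_sorted` are hypothetical names invented for illustration, and a plain list stands in for on-disk storage. The first class only gets ordered when `cluster()` is called explicitly, while the second keeps key order on every insert.

```python
import bisect


class HeapTable:
    """Toy model: rows are stored in arrival order (hypothetical, not a real engine)."""

    def __init__(self):
        self.rows = []

    def insert(self, key):
        # New rows simply go at the end, regardless of key order.
        self.rows.append(key)

    def cluster(self):
        # One-off rewrite of the whole table in key order.
        # The work depends on the total table size, not on what changed recently.
        self.rows.sort()


class ClusteredTable:
    """Toy model: rows are kept in key order on every insert."""

    def __init__(self):
        self.rows = []

    def insert(self, key):
        # Each insert places the row at its sorted position.
        bisect.insort(self.rows, key)


def is_sorted(rows):
    return all(a <= b for a, b in zip(rows, rows[1:]))


heap = HeapTable()
clustered = ClusteredTable()

for key in [5, 1, 4]:
    heap.insert(key)
    clustered.insert(key)

heap.cluster()
after_cluster = is_sorted(heap.rows)

# New data arrives after the one-off reorder.
for key in [3, 2]:
    heap.insert(key)
    clustered.insert(key)

heap_still_ordered = is_sorted(heap.rows)
clustered_still_ordered = is_sorted(clustered.rows)
```

In this model the heap table is ordered right after `cluster()` but loses that ordering as soon as new rows arrive, whereas the clustered table stays ordered throughout. Real engines work with pages and B-trees rather than a flat list, so this only illustrates the maintenance difference, not the actual costs.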

> When a table is being clustered, an ACCESS EXCLUSIVE lock is acquired on it. This prevents any other database operations (both reads and writes) from operating on the table until the CLUSTER is finished.

This makes that operation very nasty. For a large amount of data you are looking at locking your applications out of the table for some time, and the delay is proportional to the total size of the table being acted upon, not to the amount of data that has recently arrived or changed.