Hacker News new | ask | show | jobs
by phamilton 1028 days ago
We got hit with a good bit of the postgres iceberg this week.

VACUUM by default runs with the TRUNCATE option. This will, if possible and valuable, truncate the heap and return those pages to the operating system. This operation takes an AccessExclusive lock, which blocks reads and writes. It is written to be very deferential and only runs if nothing else is trying to take a competing lock.

AccessExclusive is a unique lock in that it is replicated to standbys. When the writer holds the lock, reads on the standbys will be blocked. However, the writer doesn't know about them, so its conservative approach of only acquiring the lock if there's no contention is completely thwarted.

Finally, if vacuum truncate is interrupted by lock contention it must repeat a scan of the heap to ensure the blocks are still empty before proceeding with the truncation.

All of these details combined meant our writer got stuck in a "grab lock, scan, be interrupted, release lock, grab lock, restart scan, be interrupted, release lock... etc" loop. The replication of this lock to the reader seemed to have batched together, so the readers never got to run in between the loop iterations. The vacuum never actually succeeded because it kept re-scanning. We had to intervene manually to disable autovacuum and cancel the in-progress jobs.

We've hit plenty of weird postgres issues, but this was a new one for us.

Edit: of note is the somewhat rare lack of clarity in the postgres docs. "VACUUM (without FULL)" is documented as only requiring ShareUpdateExclusive, which is why autovacuum is considered safe to run. Turns out that's not true.

2 comments

> so its conservative approach of only acquiring the lock if there's no contention is completely thwarted.

How is contention on the standbys communicated back to the writer? I'm confused about how the writer is interrupted mid-cycle if only the readers are experiencing lock contention.

The writer also has traffic that can attempt to acquire the locks. Reads and writes (though our highest volume read traffic is reader only).

Contention from the standbys do not get communicated to the writer.

I love PG, but things like this make me appreciate Mongo's compaction and replication which can run interleaved with reads and writes.