Hacker News new | ask | show | jobs
by _acco 998 days ago
Author here. Good point. For those that are curious, parent is referring to the following situation:

1. Transaction A starts, its before trigger fires, Row 1 has its updated_at timestamp set to 2023-09-22 12:00:01.

2. Transaction B starts a moment later, its before trigger fires, Row 2 has its updated_at timestamp set to 2023-09-22 12:00:02.

3. Transaction B commits successfully.

4. Polling query runs, sees Row 2 as the latest change, and updates its cursor to 2023-09-22 12:00:02.

5. Transaction A then commits successfully.

A simple way to avoid this issue is to not poll close to real-time, as the order is eventually consistent.

Perhaps a more robust suggestion would be to use a sequence? Imagine a new column, `updated_at_idx`, that incremented every time a row was changed.

1 comments

Sequences kind of have the same issue, because you don't know if a gap is because of a rollback or an uncommitted transaction. Though with some logic you can do a pretty good job at this with sequences. And then you're not in the realm of "simple" anymore, at all.
Any ideas for a simple polling implementation that's more robust?
It's not exactly simple as it involves some postgres specific knowledge, but you can make it reliable when working with transaction ids (see https://event-driven.io/en/ordering_in_postgres_outbox/).
I've had pretty much the exact same problem and what I went for in my low-volume case was to simply add advisory locks such that I can guarantee the transaction start times provide correct ordering.
Set the trigger to add the primary key + change time to a separate table, then scan/truncate that table to poll changes.