Sequences kind of have the same issue, because you don't know if a gap is because of a rollback or an uncommitted transaction. Though with some logic you can do a pretty good job at this with sequences. And then you're not in the realm of "simple" anymore, at all.
I've had pretty much the exact same problem and what I went for in my low-volume case was to simply add advisory locks such that I can guarantee the transaction start times provide correct ordering.
Woah, that's news to me. Is that true even if triggers are used to update a column?
CREATE OR REPLACE FUNCTION
update_updated_at_function()
RETURNS TRIGGER AS $$
BEGIN
NEW.updated_at = now();
RETURN NEW;
END;
$$ language 'plpgsql';
CREATE TRIGGER
update_updated_at_trigger
BEFORE INSERT OR UPDATE ON
"my_schema"."my_table"
FOR EACH ROW EXECUTE PROCEDURE
update_updated_at_function();
END $$;
Is it possible for two rows to have `updated_at` timestamps that are different from the transaction commit order even if the above function and trigger are used? It's alright if `updated_at` and the commit timestamp are not the same, but the `updated_at` must represent commit order accurate to the millisecond/microsecond.
To confirm your fear, you can't use the updated_at timestamp as a proxy for commit order. The commits happen in a different order, and can be arbitrarily far apart, like hours or days depending on how long your transactions can last.
now() is the timestamp the transaction began at. There is no function to return the commit timestamp because you have to write the value before you commit.
For polling, instead of updated_at, I use a _txid column that gets set by a trigger to the current transaction ID. Then, when polling, use txid_current() to see which transactions have committed and which haven’t. It’s a little dicey and super easy to hit fencepost errors, but it’s been running smoothly in production for a few years.
1. Transaction A starts, its before trigger fires, Row 1 has its updated_at timestamp set to 2023-09-22 12:00:01.
2. Transaction B starts a moment later, its before trigger fires, Row 2 has its updated_at timestamp set to 2023-09-22 12:00:02.
3. Transaction B commits successfully.
4. Polling query runs, sees Row 2 as the latest change, and updates its cursor to 2023-09-22 12:00:02.
5. Transaction A then commits successfully.
A simple way to avoid this issue is to not poll close to real-time, as the order is eventually consistent.
Perhaps a more robust suggestion would be to use a sequence? Imagine a new column, `updated_at_idx`, that incremented every time a row was changed.