Hacker News new | ask | show | jobs
by simonw 1398 days ago
This post is not about database cursors. It's about the style of pagination where you have a ?_next=xxx link to get to the next page, where the xxx bit encodes details about the last item on the current page such that the next page can show everything that comes after that record.

This is also sometimes known as keyset pagination. My favourite technical explanation of that is here: https://use-the-index-luke.com/no-offset

3 comments

Sounds great on paper but my experience is that key-based indexing is broken on SOLR and likely other lucene engine DBs. If you are evaluating conditions on child objects inside a parent document, the child objects get split and stored as a separate document that is joined… and if that child document is in a different block/file on disk, then it won’t necessarily be inside the range being scanned, and so you will be missing some results that meet your logical criteria.

Possibly just an implementation error in the BlockJoinParser but it did not occur with numeric pagination.

In order to work with that, you need to “flatten” your json, like with the @JsonUnwrapped annotation, and some structures (like arrays) may become problematic and/or require significant lexical mapping of queries to the dataset.

parent-child in lucene is a hack. It only works when the children immediately follow the parent, and that never happens when it's in a different file because you updated the child. It's a minor miracle you didn't notice the error with numeric pagination.
Most users don't care but I very much prefer to have URLs with `/?page=123` instead of `?_next=xxx`. I should get over it but meanwhile I'll stick to offset pagination with deferred join (see aarondf's comment).
This post is about "database cursors" and "keyset pagination". In practice, these terms refer to the same thing, one seen bottom-up the other seen top-down. Implementation-wise, one saves the state of the cursor in the pagination [parameters] and resumes reading from the DB with an equivalent cursor.
> these terms refer to the same thing

No, cursor is this https://en.wikipedia.org/wiki/Cursor_(databases) https://www.postgresql.org/docs/current/plpgsql-cursors.html

I once did pagination using database cursors, which is something different than keyset pagination: The server would keep a cursor open and keep fetching more data from the same query. This enabled the system to have an interface similar to offset pagination (you get the first page, then the second page, etc) but without doing a new query for each page discarding the first n-1 pages per query

The downside is that it makes the server stateful, and doesn't scale (you would need to keep hundreds of cursors open if you had hundreds of simultaneous users)

Words can have two meanings. Cursor pagination and key set pagination do indeed refer to the same thing.

database cursors are a different thing.

You are agreeing with the parent comment with a tone of disagreement.

They never said what "cursor pagination" is or used the term at all. In this chain of comments, yours is the first one to use the term.

The comment you're replying to, like the original comment in this chain, merely said that this post is not about using "database cursors" for pagination. Which is correct, and you agreed that this is a different thing.

Yes words can have two meanings, but given the similarity of context between these two meanings, calling this cursor pagination is not a great idea. It screams of someone that didn’t know about database cursors (which is only one implementation method) trying to describe a method for web site pagination. I’m not blaming the author here for this, as they likely know the difference. But for a new developer trying to Google this, it will be very confusing.