| HN Mirror


by ddek 1896 days ago

I've used cursor pagination on daemons doing large batch processing. It worked fine.

If you're using a cursor, you should be iterating over the entire result set (or at least most of it) over the course of the connection. I wouldn't think they're appropriate for APIs, because maintaining the cursor state between requests sounds painful.

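Offset is the only way to paginate arbitrarily sorted data sets, unless you plan on going hard on indexing. The performance falls off as the offset grows, because the database still has to walk past every skipped row. You'll also have issues with apparent 'duplicates' between pages when rows are inserted between requests and shift everything down.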
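To make the 'duplicates' problem concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `posts` table, its columns, and the helper names are hypothetical, chosen only for illustration; the feed is assumed to be sorted newest first.

```python
import sqlite3

def make_feed(n):
    """Build an in-memory table with ids 1..n (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany(
        "INSERT INTO posts (id, title) VALUES (?, ?)",
        [(i, f"post {i}") for i in range(1, n + 1)],
    )
    return conn

def offset_page(conn, page, size):
    """Return the ids on a zero-based page, newest first, via LIMIT/OFFSET."""
    rows = conn.execute(
        "SELECT id FROM posts ORDER BY id DESC LIMIT ? OFFSET ?",
        (size, page * size),
    ).fetchall()
    return [r[0] for r in rows]

conn = make_feed(6)
first = offset_page(conn, 0, 3)

# A new row arrives between the two requests and pushes everything down by one.
conn.execute("INSERT INTO posts (id, title) VALUES (7, 'post 7')")
second = offset_page(conn, 1, 3)

# Rows that show up on both pages.
repeated = sorted(set(first) & set(second))
print(first, second, repeated)
```

The row that was last on page one reappears at the top of page two, which is the apparent duplicate a user would see while scrolling.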

Keyset pagination is preferred if the data is usually sorted in one way. Most infinite scrolling falls into this category. Each row has a unique 'counter', which is (usually) monotonic. To fetch the next page, filter for rows whose counter comes after that of the last row on the current page. This is fast and has no duplicates. However, your sorting options are constrained, as you must have an index for each sort option.
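A minimal keyset sketch under the same kind of assumptions: a hypothetical `posts` table whose integer primary key `id` plays the role of the 'counter', shown newest first, so "after the last row" means a smaller `id`. The function and variable names are made up for the example.

```python
import sqlite3

feed = sqlite3.connect(":memory:")
feed.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
feed.executemany(
    "INSERT INTO posts (id, title) VALUES (?, ?)",
    [(i, f"post {i}") for i in range(1, 7)],
)

def keyset_page(db, before_id, size):
    """Return up to `size` ids, newest first, strictly older than `before_id`.

    Passing None for `before_id` starts from the newest row.
    """
    if before_id is None:
        rows = db.execute(
            "SELECT id FROM posts ORDER BY id DESC LIMIT ?", (size,)
        ).fetchall()
    else:
        # The primary-key index lets the database seek straight to the
        # boundary instead of counting past skipped rows.
        rows = db.execute(
            "SELECT id FROM posts WHERE id < ? ORDER BY id DESC LIMIT ?",
            (before_id, size),
        ).fetchall()
    return [r[0] for r in rows]

page_one = keyset_page(feed, None, 3)

# A new row arriving between requests does not disturb the next page.
feed.execute("INSERT INTO posts (id, title) VALUES (7, 'post 7')")
page_two = keyset_page(feed, page_one[-1], 3)
print(page_one, page_two)
```

The client only has to remember the last `id` it saw, so nothing is held open on the server between requests.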

Keyset and offset can be combined to eliminate the duplication defect of offset alone. The performance issue will still be there. This becomes an engineering decision - in most use cases, the large offset perf hits don't really happen (aside from abuse, which almost always happens).
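The comment does not spell out how the two are combined. One possible reading, sketched here as an assumption, is to pin a keyset-style anchor (the newest `id` at the time of the first request) and apply a plain offset only to rows at or below that anchor. The `posts` table and all names are hypothetical.

```python
import sqlite3

store = sqlite3.connect(":memory:")
store.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
store.executemany(
    "INSERT INTO posts (id, title) VALUES (?, ?)",
    [(i, f"post {i}") for i in range(1, 7)],
)

def anchored_page(db, anchor_id, page, size):
    """Offset pagination restricted to rows that existed at the anchor.

    Rows inserted later have id > anchor_id and are filtered out, so they
    cannot shift the pages. The OFFSET still has to walk past skipped rows.
    """
    rows = db.execute(
        "SELECT id FROM posts WHERE id <= ? ORDER BY id DESC LIMIT ? OFFSET ?",
        (anchor_id, size, page * size),
    ).fetchall()
    return [r[0] for r in rows]

# Taken once, on the first request, and sent back by the client afterwards.
anchor = store.execute("SELECT MAX(id) FROM posts").fetchone()[0]
p0 = anchored_page(store, anchor, 0, 3)

store.execute("INSERT INTO posts (id, title) VALUES (7, 'post 7')")
p1 = anchored_page(store, anchor, 1, 3)
print(anchor, p0, p1)
```

Under this reading, inserts no longer produce repeats, but deleted rows can still shift later pages, and the cost of a large offset is unchanged.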

1 comments

andrewingram 1896 days ago

Important to call out here: thanks to GraphQL’s unfortunate habit of using words that already have established meanings, when a lot of people talk about “cursor-based pagination”, they actually mean keyset pagination and not something that requires a server to maintain a stateful cursor.

I’m guessing this is the meaning the OP was referring to.


inshadows 1896 days ago

I think it's common to call it cursor regardless of where it's maintained. I agree that it may confuse some people since there are cursors in SQL APIs. But the underlying concept is the same. Keyset does not convey the specific meaning.
