by simonw 1998 days ago
Intersecting data is fine provided there's a unique ID for each result that can be used to de-duplicate them.

Ideally I'd want a system that guarantees at-least-once delivery of every item. I can handle duplicates just fine; what I want to avoid is an item being missed entirely because of the way I break up the data.


It's more than just de-duplicating, tho. Imagine you query a dataset and get something like a page count and a chunk size. That page count cannot be trusted if the dataset is mutable. If an item is inserted at the beginning of the set, you're going to miss the last item.
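To make that concrete, here is a rough sketch of the failure, assuming a plain in-memory list standing in for the dataset. The names `fetch_page` and `crawl_with_page_count` are made up for illustration and aren't any real API. The crawler computes the page count once, then an item gets inserted at the front while it's still paging.

```python
def fetch_page(dataset, page, chunk_size):
    """Offset-style paging: slice the dataset by position."""
    start = page * chunk_size
    return dataset[start:start + chunk_size]


def crawl_with_page_count(dataset, chunk_size, insert_after_first_page=None):
    """Read every page, trusting a page count computed up front.

    If insert_after_first_page is given, it is inserted at the beginning
    of the dataset right after page 0 has been read, simulating a
    concurrent write.
    """
    # Ceiling division: number of pages at the moment the crawl starts.
    page_count = -(-len(dataset) // chunk_size)
    seen = []
    for page in range(page_count):
        seen.extend(fetch_page(dataset, page, chunk_size))
        if page == 0 and insert_after_first_page is not None:
            dataset.insert(0, insert_after_first_page)
    return seen


data = ["a", "b", "c", "d"]
seen = crawl_with_page_count(data, 2, insert_after_first_page="new")
print(seen)  # ['a', 'b', 'b', 'c']
```

The insert shifts everything one position to the right, so the second page re-reads "b" (a duplicate, which de-duplication handles) and the crawl stops before ever reaching "d" (a miss, which it doesn't).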

Pagination is hard

For dynamic use cases, DynamoDB implements pagination with a cursor called LastEvaluatedKey, which you pass back as ExclusiveStartKey on the next request - https://docs.aws.amazon.com/amazondynamodb/latest/developerg...

This is different from LIMIT/OFFSET paging in an RDBMS.

Wouldn’t this pattern solve the complexity you are talking about?

That's one way, for sure. You can do this with IDs, dates, etc.
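A minimal sketch of the ID flavor of this, again over an in-memory list rather than a real database. `fetch_after` and `crawl_with_cursor` are hypothetical names, and this is only the general "resume after the last key you saw" idea, not DynamoDB's actual API. It assumes every row has a unique, sortable `id`.

```python
def fetch_after(rows, last_id, limit):
    """Return up to `limit` rows whose id is greater than `last_id`,
    in id order. last_id=None means start from the beginning."""
    ordered = sorted(rows, key=lambda r: r["id"])
    if last_id is not None:
        ordered = [r for r in ordered if r["id"] > last_id]
    return ordered[:limit]


def crawl_with_cursor(rows, limit, on_first_page=None):
    """Page through rows using the last seen id as the cursor.

    on_first_page, if given, is called with rows once after the first
    page has been read, to simulate a concurrent write.
    """
    seen = []
    last_id = None
    first = True
    while True:
        page = fetch_after(rows, last_id, limit)
        if not page:
            break  # no page count needed: an empty page means we're done
        seen.extend(r["id"] for r in page)
        last_id = page[-1]["id"]
        if first and on_first_page is not None:
            on_first_page(rows)
        first = False
    return seen


rows = [{"id": i} for i in (10, 20, 30, 40)]
seen = crawl_with_cursor(
    rows, 2, on_first_page=lambda rs: rs.insert(0, {"id": 5})
)
print(seen)  # [10, 20, 30, 40]
```

Because each request is anchored to a key instead of a position, the insert at the front doesn't shift anything: no duplicate, and the last item isn't lost. The caveat is that a row inserted behind the cursor (id 5 here) isn't picked up in this pass, so this works best when new rows sort after existing ones, e.g. increasing IDs or creation dates.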