by adontz 1998 days ago
I believe data export and/or backup should be a separate API, which is low priority and ensures consistency.

Here we just see regular APIs being abused for data export. I'm rather surprised the author did not run into rate limiting.

1 comments

Coming from a REST perspective, I wouldn’t implement a separate API. I would use HTTP semantics (e.g. headers or, if truly necessary, query params) on the resource listing to indicate the export/sync intention, likely with an Accept header. If pagination is still preferred or required, the service could return an ETag or some other continuation token which, when provided in subsequent requests, identifies the consistent snapshot being requested. Since this is entirely optional, clients could use this mechanism to opt into stable, parallelizable requests (as I described in less detail in another subthread).
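
To make the continuation-token idea concrete, here is a rough in-memory sketch rather than a real HTTP service. Everything in it is made up for illustration: `ListingService`, `list_page`, the `export` flag (standing in for an Accept header or query param) and the `snapshot` token (standing in for an ETag-like value). A real service would also need to expire snapshots.

```python
import uuid
from typing import Optional


class ListingService:
    """Toy resource listing with opt-in snapshot pagination (hypothetical API)."""

    def __init__(self, rows):
        self._rows = list(rows)   # live data, may change between requests
        self._snapshots = {}      # token -> frozen copy of the rows

    def add(self, row):
        self._rows.append(row)

    def list_page(self, offset: int, limit: int,
                  snapshot: Optional[str] = None, export: bool = False):
        if snapshot is not None:
            # Client passed back a token: read from that frozen view.
            rows = self._snapshots[snapshot]   # KeyError if unknown or expired
        elif export:
            # Client declared export intent: freeze the current state once.
            snapshot = uuid.uuid4().hex
            rows = self._snapshots[snapshot] = tuple(self._rows)
        else:
            # Plain listing: no consistency guarantee across pages.
            rows = self._rows
        return {"items": list(rows[offset:offset + limit]), "snapshot": snapshot}


svc = ListingService([1, 2, 3, 4])
first = svc.list_page(0, 2, export=True)       # opts in, receives a token
svc.add(5)                                     # concurrent write
second = svc.list_page(2, 10, snapshot=first["snapshot"])
live = svc.list_page(0, 10)                    # ordinary clients see live data
```

Because every page after the first is addressed by token plus offset, the remaining pages can be fetched in parallel and still agree with each other.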

At this point, if these requests are expensive, you have an opportunity to use a very simple (and optimistic) cache for good-faith API users, limit rate limiting to preventing abuse of cache creation (which should be even easier to detect than overzealous parallelism), and even use the same or similar semantics to implement deltas for subsequent exports/syncs.
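
A minimal sketch of how the cache, the creation limit and the deltas could fit together, assuming the service can name a data version and that records are keyed by id. `SnapshotCache`, `get_or_create`, `delta` and the per-client budget are all hypothetical names, not anything from the article.

```python
class SnapshotCache:
    """Optimistic snapshot cache: hits are free, only creation is limited."""

    def __init__(self, max_creations_per_client=2):
        self._by_version = {}   # data version -> snapshot (id -> record)
        self._creations = {}    # client -> number of snapshots it created
        self._max = max_creations_per_client

    def get_or_create(self, client, version, load):
        if version in self._by_version:
            # Cache hit: reuse the existing snapshot, charge nothing.
            return self._by_version[version]
        used = self._creations.get(client, 0)
        if used >= self._max:
            # Only the expensive path (building a snapshot) is rate limited.
            raise PermissionError("snapshot creation limit reached")
        self._creations[client] = used + 1
        snap = dict(load())
        self._by_version[version] = snap
        return snap


def delta(old, new):
    """Changes needed to turn snapshot `old` into snapshot `new`."""
    return {
        "upserted": {k: v for k, v in new.items()
                     if k not in old or old[k] != v},
        "deleted": sorted(k for k in old if k not in new),
    }


cache = SnapshotCache(max_creations_per_client=1)
s1 = cache.get_or_create("alice", 1, lambda: {1: "x", 2: "y"})
s1_again = cache.get_or_create("bob", 1, lambda: {})   # hit, loader not used
s2 = cache.get_or_create("bob", 2, lambda: {2: "z", 3: "w"})
changes = delta(s1, s2)
```

A client that already holds version 1 only needs `changes` to catch up to version 2, which is the same token-based semantics reused for sync.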

I can hardly imagine a consistent paginated view of the data without creating a snapshot. It would be a manual MVCC implementation or something. A separate API seems like a much simpler solution to me.