by pdonis 33 days ago
By "request" I didn't necessarily mean the HTTP request sent by the client, and I don't think the post I was responding to did either. But I agree my use of terminology was ambiguous. Let me restate more clearly, and in a way that shows the issue under your request/response process:

An HTTP request comes in with a certain idempotency key. The server returns 202, as you say, and begins to process the database transaction.

While the server is still processing the database transaction, a second HTTP request comes in with the same idempotency key. What response does this second HTTP request get? The original transaction that the first HTTP request triggered hasn't succeeded and hasn't failed, so it doesn't fall into either of the categories in the post I responded to.

Your answer is that the second HTTP request gets a 409, which makes sense to me, although others are objecting to it.


Best practice is to keep database transactions short. There is no value in returning a "hey this is in progress" error code to a client when transactions are short, and databases don't easily give you this primitive anyway.

You seem very focused on long-running orchestration type systems. You build these on top of basic transactional primitives, but it's a mistake to try to make the whole process a single transaction. You can have a quick, transactional "start process" operation which must be idempotent. Other operations like "check status" need not be so complicated.

You don't necessarily share the idempotency key between the "start process" request and the "check status" request. You could for convenience, but it isn't necessary, and on balance most APIs don't. This is the "client picks ID" vs "server picks ID" design choice.

> There is no value in returning a "hey this is in progress" error code to a client when transactions are short

Fair enough. So basically your approach is to wait until the first request completes to decide how to respond to the second request that came in with the same idempotency key.

However, that would seem to me to imply that when the second request comes in, you check its idempotency key, realize you've already received a request with that key and you're processing it, and don't do anything else with the second request until the first one is completed. In particular, you don't have the second request trigger the start of another transaction.

But elsewhere in this thread, you've said you would start a second transaction based on the second request, and let your database's transaction mechanism tell you that it's a duplicate when you try to commit it. Why would you do that if you've checked the second request's idempotency key and you know it's a duplicate?

> You seem very focused on long-running orchestration type systems.

I'm not focused on anything except getting what I thought would be a simple answer to a simple question. The above seems to provide that (though it still leaves a question open, as above). That's all I wanted.

> You don't necessarily share the idempotency key between the "start process" request and the "check status" request.

I'm not talking about a "check status" request. The scenario I've been asking about all along is when a second "start process" request comes in with the same idempotency key as a previous "start process" request, while the process is still in progress.

While I'd love to take credit for it, this isn't literally "my" approach; this is just the standard way that transaction processing systems on the web work. And it's so standard that you don't even have to write code for it. Databases handle all the isolation and concurrency issues for you.

Part of the problem here is that we're confusing how do you structure the API (replay? 409? something else?) with how we implement the API. The original article (and my original response) focused on API structure. We're wandering into the details of implementation, which is fine, but there are of course many ways to do the implementation. Some simpler than others.

Here's the simplest and most reliable way to implement idempotency for a trivial "create payment" operation, where the client submits an idempotency key. This pattern is incredibly common. Every request looks something like this:

* Start a transaction

* Look up "does this idempotency key already exist"

* If it doesn't, insert the payment record with the idempotency key

* Commit the transaction

* Return the result. A successful insert is always 200 OK. "Key already exists" results in either a replay of the original result (Stripe model) or an explicit error like 409 (my favored approach, still ubiquitous in ecommerce, and very common in financial APIs that predate Stripe).
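
A minimal sketch of the steps above, using Python's built-in sqlite3 as a stand-in for whatever database you actually run. The `payments` table, its columns, and the function names are made up for illustration, and this version returns 409 on a duplicate; a replay-style API would return the stored result instead.

```python
import sqlite3


def open_payments_db():
    # Hypothetical schema, for illustration only.
    # isolation_level=None lets us issue BEGIN/COMMIT ourselves.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE payments ("
        " id INTEGER PRIMARY KEY,"
        " idempotency_key TEXT NOT NULL UNIQUE,"
        " amount_cents INTEGER NOT NULL)"
    )
    return conn


def create_payment(conn, idempotency_key, amount_cents):
    """Return (http_status, payment_id) for a 'create payment' request."""
    cur = conn.cursor()
    # Start a transaction. The lookup and the insert both happen inside it.
    cur.execute("BEGIN IMMEDIATE")
    try:
        # Does this idempotency key already exist?
        row = cur.execute(
            "SELECT id FROM payments WHERE idempotency_key = ?",
            (idempotency_key,),
        ).fetchone()
        if row is None:
            # It doesn't: insert the payment record with the key.
            cur.execute(
                "INSERT INTO payments (idempotency_key, amount_cents)"
                " VALUES (?, ?)",
                (idempotency_key, amount_cents),
            )
            result = (200, cur.lastrowid)
        else:
            # It does: report a duplicate (a replay API would return
            # the original result here instead).
            result = (409, row[0])
        cur.execute("COMMIT")
    except Exception:
        cur.execute("ROLLBACK")
        raise
    return result
```

Calling `create_payment` twice with the same key on the same database gives a 200 the first time and a 409 pointing at the same row the second time.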

Does that help? If you're using your database to handle concurrency, you need every request to start inside the transaction. You can't check the idempotency key outside of the transaction or you can't guarantee once-and-only-once behavior.

[Before someone mentions it, yes you can use a unique constraint instead of an explicit transaction, and this is conceptually identical - the check-for-dup transaction is inside a single INSERT]
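
For what it's worth, the unique-constraint variant might look roughly like this. Again sqlite3 is only a stand-in, and the table and function names are hypothetical. The duplicate check is done by the database as part of the INSERT, and the code just reacts to the constraint violation.

```python
import sqlite3


def open_payments_db_unique():
    # Hypothetical schema; the UNIQUE constraint does the duplicate check.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE payments ("
        " id INTEGER PRIMARY KEY,"
        " idempotency_key TEXT NOT NULL UNIQUE,"
        " amount_cents INTEGER NOT NULL)"
    )
    conn.commit()
    return conn


def create_payment_unique(conn, idempotency_key, amount_cents):
    """Return (http_status, payment_id), relying on the UNIQUE constraint."""
    try:
        # "with conn" commits on success and rolls back on an exception.
        with conn:
            cur = conn.execute(
                "INSERT INTO payments (idempotency_key, amount_cents)"
                " VALUES (?, ?)",
                (idempotency_key, amount_cents),
            )
        return 200, cur.lastrowid
    except sqlite3.IntegrityError:
        # The key already exists: look up the original row and report a
        # duplicate (or replay its stored result, depending on the API).
        row = conn.execute(
            "SELECT id FROM payments WHERE idempotency_key = ?",
            (idempotency_key,),
        ).fetchone()
        return 409, row[0]
```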

> Does that help?

What you said up to that point didn't really. But then you said this:

> If you're using your database to handle concurrency, you need every request to start inside the transaction. You can't check the idempotency key outside of the transaction or you can't guarantee once-and-only-once behavior.

Which answers the question that what you said earlier in your post raised. If I'm understanding you right, "lookup the idempotency key" is also relying on the same database, so you need the whole operation to be inside a single transaction in that database.

> Part of the problem here is that we're confusing how do you structure the API (replay? 409? something else?) with how we implement the API.

It would seem to me that you would want "what happens if a second request comes in with the same idempotency key while the first is still in progress" to be part of the API, so clients would know what your server is going to do in that scenario.

Nobody cares. These are short lived transactions (generally milliseconds); collisions are a rare edge case; it's fine to block. One request succeeds, the other gets a dup error (or a replay).
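
To make the collision case concrete, here is a rough sketch of two simultaneous requests with the same key, each on its own connection to a temporary SQLite file. Everything here (names, schema, the use of threads and SQLite's busy timeout to stand in for a real database's locking) is an assumption for illustration. Whichever request gets the write lock first inserts the row; the other blocks until the commit and then sees the key.

```python
import os
import sqlite3
import tempfile
import threading


def handle_request(db_path, idempotency_key, statuses):
    # Each request gets its own connection; timeout=10 makes a blocked
    # BEGIN IMMEDIATE wait for the lock instead of failing right away.
    conn = sqlite3.connect(db_path, timeout=10, isolation_level=None)
    try:
        conn.execute("BEGIN IMMEDIATE")  # blocks while the other request writes
        row = conn.execute(
            "SELECT id FROM payments WHERE idempotency_key = ?",
            (idempotency_key,),
        ).fetchone()
        if row is None:
            conn.execute(
                "INSERT INTO payments (idempotency_key) VALUES (?)",
                (idempotency_key,),
            )
            status = 200
        else:
            status = 409
        conn.execute("COMMIT")
        statuses.append(status)
    finally:
        conn.close()


def run_collision():
    """Fire two requests with the same key; return their sorted statuses."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "payments.db")
        setup = sqlite3.connect(db_path)
        setup.execute(
            "CREATE TABLE payments ("
            " id INTEGER PRIMARY KEY,"
            " idempotency_key TEXT NOT NULL UNIQUE)"
        )
        setup.commit()
        setup.close()

        statuses = []
        threads = [
            threading.Thread(
                target=handle_request, args=(db_path, "same-key", statuses)
            )
            for _ in range(2)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        return sorted(statuses)
```

Under these assumptions, `run_collision()` ends with one 200 and one 409, regardless of which request wins the race.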

You could invent your own more sophisticated idempotency API but good luck finding someone that wants to implement it or use it. What real-world problem are you trying to solve?

> Nobody cares.

Meaning, clients don't care about the thing I asked about?

> What real-world problem are you trying to solve?

I'm trying to understand your answers to my questions. When there seems to me to be something missing, I ask about it.