by hn_throwaway_99 40 days ago
To be honest, I liked your original response about returning a 409 - it's not something I'd done before and I like how it keeps things simpler.

But your follow up responses here are making me rethink. Now you have to have all these special cases where the original request is still in process. I think your assertion of "99% are simple POST operations" is bullshit. For the times where idempotency is hard and really matters, oftentimes you're calling a third party API, like a payment processing API.

I would think a better approach would be to always return a 409 on a subsequent request, regardless of whether it passed or failed, and then have a separate standard API that lets you get the result of any request by its idempotency key.
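A rough sketch of that shape, to make the proposal concrete. Everything here is hypothetical: the `IdempotencyStore` name, the in-memory dict, and the choice of 202 for a request that is still running are assumptions for illustration, and the code is single-threaded, so it ignores the locking a real server would need.

```python
from dataclasses import dataclass, field


@dataclass
class IdempotencyStore:
    # key -> recorded (status, body); None means the first request is still running
    outcomes: dict = field(default_factory=dict)

    def submit(self, key, operation):
        """First request with a key runs the operation; any repeat gets a 409."""
        if key in self.outcomes:
            return 409, {"error": "duplicate idempotency key"}
        self.outcomes[key] = None  # claim the key before doing the work
        try:
            self.outcomes[key] = (200, operation())
        except Exception as exc:
            self.outcomes[key] = (500, {"error": str(exc)})
        return self.outcomes[key]

    def lookup(self, key):
        """Separate endpoint: report whatever outcome was recorded for a key."""
        if key not in self.outcomes:
            return 404, {"error": "unknown idempotency key"}
        outcome = self.outcomes[key]
        if outcome is None:
            return 202, {"status": "in progress"}  # assumed status code
        return outcome
```

A client that gets a 409 would then call `lookup` with the same key to find out whether the original request passed or failed.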


Idempotence already requires some client thinking if you want to conform to HTTP specs.

I.e. idempotent DELETE with proper protocol behavior requires that one request see the 200 OK or 204 No Content and the other sees 404 Not Found, because the delete has already happened. It would be misleading to say 200 OK to both, because that answer means the resource was there when the request arrived.
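As a toy illustration of that DELETE behavior (the `delete_resource` function and the dict-backed store are made up for the example, not taken from any framework):

```python
def delete_resource(resources, resource_id):
    """Return the status a DELETE would produce against a dict-backed store."""
    if resource_id in resources:
        del resources[resource_id]
        return 204  # the resource was there when this request arrived
    return 404  # already gone, e.g. removed by an earlier delivery of the same request
```

The end state is the same after one call or two, which is what makes the operation idempotent, but the two callers see different status codes.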

Honestly, the whole HTTP resource model has a different conceptual backing for state management than the independently developed "idempotence" concepts in distributed systems. Those non-HTTP concepts came from more message-based rather than resource-based architectural assumptions.

The cleanest mapping in the spirit of HTTP would be that you do multiple round trips. A POST creates a new idempotence context, a bit like "start a transaction". The new URI is the key for coordinating state change and allowing restart/recovery.
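A minimal sketch of what I mean, assuming an in-memory registry. The `OperationContexts` class, the `/operations/...` URI layout, and the method names are all hypothetical; they stand in for "POST to create a context", "PUT to the new URI", and "GET the new URI".

```python
import uuid


class OperationContexts:
    def __init__(self):
        self.contexts = {}

    def create(self):
        """POST /operations: allocate a fresh context and return its URI."""
        context_id = uuid.uuid4().hex
        self.contexts[context_id] = {"state": "open", "result": None}
        return f"/operations/{context_id}"

    def execute(self, uri, operation):
        """PUT <uri>: run the operation at most once for this context."""
        context = self.contexts[uri.rsplit("/", 1)[-1]]
        if context["state"] == "open":
            context["result"] = operation()
            context["state"] = "done"
        return context["result"]

    def status(self, uri):
        """GET <uri>: let a restarted client find out where things stand."""
        return self.contexts[uri.rsplit("/", 1)[-1]]["state"]
```

If the response to `create` is lost, the client simply creates another context; the orphaned one never had any work attached to it. Repeating `execute` against the same URI returns the stored result instead of doing the work again.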

As I remember it, the idea of idempotence keys in headers really came from the SOAP RPC mindset. It's kind of funny to see it persisting in some hybrid SOAP + REST mental model.

> The cleanest mapping in the spirit of HTTP would be that you do multiple round trips. A POST creates a new idempotence context, a bit like "start a transaction". The new URI is the key for coordinating state change and allowing restart/recovery.

I think that gave me "Enterprise Java Beans PTSD". I.e. an over-engineered solution that adds complexity for both the client and server in the name of some sort of "protocol purity".

People bolted idempotent semantics onto HTTP because they weren't provided natively by the protocol, so I don't think it makes sense to go through some hoop-jumping gymnastics for the sake of conforming to a spec that doesn't describe the necessary semantics in the first place.

FWIW, I'm not particularly fond of HTTP. But there is PTSD in both directions. Doing random things that ignore (or subvert) "protocol purity" often creates disastrous effects when people haven't considered how the larger system will behave when you have various middleware bits that are essentially obeying different protocols while superficially claiming to use an interoperable standard.

When I let myself ruminate, it irks me that we all let HTTP become the de facto "internet protocol" just because of firewalls. Because there was a cargo cult idea that HTTP is benign, and so its port is one of the few allowed almost everywhere, we do stupid contortions to squeeze every protocol through an HTTP tunnel.

These short-sighted acts of laziness accumulate into HTTP everywhere. And of course, the firewall is nearly pointless when "everything" is going through that one hole anyway.

> special cases where the original request is still in process

This isn't a special case, and it's the same problem if you want to replay the original response on conflict. If the original request isn't complete, what are you going to replay?

> If the original request isn't complete, what are you going to replay?

Who says you have to replay? If you get a second request with the same idempotency key, and the original request is still in process, why not just send the client a response that says so?

It's not a terrible question. It would be complicated to implement and isn't more useful than using the database's consistency model, so nobody does it that way.

Long running transactions create all sorts of problems, so transactions are generally expected to be short. The actual work behind "create payment" or "create order" is generally fairly trivial - more or less insert a row in a table. There's no good reason to make the API complicated... you either "win" at concurrency or you lose, and the difference is generally sub-millisecond. The only meaningful thing you need to communicate to the client is "you're done" (for both the win and lose cases) or "you need to try again" (for the "something unexpected went wrong" case).
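From the client's side, that contract can be sketched roughly like this. The `create_with_retry` helper and the `send` callable are hypothetical; `send` is assumed to issue the request with a fixed idempotency key and return the HTTP status, raising `ConnectionError` when the response is lost. Backoff between attempts is left out to keep it short.

```python
DONE_STATUSES = {200, 409}  # win or lose, the row exists either way


def create_with_retry(send, max_attempts=5):
    """Retry until the server says the work is done (200) or was already done (409)."""
    for _ in range(max_attempts):
        try:
            status = send()
        except ConnectionError:
            continue  # outcome unknown: retry with the same key
        if status in DONE_STATUSES:
            return status
        # anything else means something unexpected went wrong: try again
    raise RuntimeError("gave up after %d attempts" % max_attempts)
```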

Complicated workflows can certainly have multiple steps, with "fetch the current status" calls in between. But somewhere near the beginning of every complicated workflow there will be a call to "create workflow" and it will need to have some sort of mechanism which allows clients to call it idempotently. Otherwise you end up with multiple starts.

I've literally received duplicate products in the mail because of this kind of problem. I've also sent multiple products in the mail because services I relied on didn't offer the necessary idempotency mechanisms.

> The actual work behind "create payment" or "create order" is generally fairly trivial - more or less insert a row in a table.

It's generally insert a row in someone else's table, over the wire, 50ms+ away. They might not even be using an RDBMS.

That is my point. When you are doing "normal" idempotency where you do the appropriate locking and keep around a table with ongoing request status and the result that you can return on a subsequent duplicate request, you handle all these cases. But in your "409" version of it, you haven't really saved much complexity on the server because you still need to keep around all that info if you're not just returning a 409 if you get a second request while the first is in progress.

I don't understand what you're saying. I can take any cheap rdbms, put a unique constraint on a column, and make my API return 409 for conflict vs 200 on success. There's so little code involved that it's embarrassing to charge money for it.
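Roughly this, using SQLite from the Python standard library as the "cheap rdbms". The `payments` table, its columns, and the function names are invented for the example.

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE payments ("
        " idempotency_key TEXT NOT NULL UNIQUE,"
        " amount_cents INTEGER NOT NULL)"
    )
    return conn


def create_payment(conn, idempotency_key, amount_cents):
    """Return 200 if this call inserted the row, 409 if the key was already used."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO payments (idempotency_key, amount_cents) VALUES (?, ?)",
                (idempotency_key, amount_cents),
            )
    except sqlite3.IntegrityError:
        return 409  # the unique constraint rejected the duplicate
    return 200
```

The database decides who wins the race; the handler only translates the constraint violation into a status code.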