by grandalf 3234 days ago
Many API developers misuse/overload HTTP status codes when they should actually be using application specific status codes delivered via a wrapper to the response.

Any time an API returns a correct response the HTTP response code should be 2xx. Many developers use 4xx status codes to indicate things like data validation errors or other things that are not part of the HTTP transport.

HTTP is the transport layer, and any application specific scenarios are best handled with a custom error namespace that can be returned within any 2xx HTTP response.

Generally speaking, if the HTTP layer is returning 5xx you have a server problem and client code should not know anything about that or do anything but retry. If the HTTP layer is returning 4xx then the client is likely poorly designed or misconfigured.

But for typical client operation, when there is no server error or client design/configuration error, HTTP responses should be 2xx or 3xx and any additional detail should be handled in an application-specific way, not by overloading the meaning of HTTP response codes, which are for transport-related concerns only.
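
A minimal Python sketch of the envelope idea described in this post, for illustration only. The code names and numbers (`APP_OK`, `APP_USERNAME_TAKEN`, `APP_PASSWORD_TOO_WEAK`), the 8-character rule, and the `register` handler are all hypothetical and not taken from any real API:

```python
import json

# Hypothetical application-specific codes. The numbers are made up and
# deliberately sit outside the HTTP 1xx-5xx range.
APP_OK = 10000
APP_USERNAME_TAKEN = 10001
APP_PASSWORD_TOO_WEAK = 10002


def envelope(app_code, message, data=None):
    """Wrap a result in an envelope and pair it with HTTP 200.

    Returns (http_status, body_json). The HTTP status only says the exchange
    itself worked; the application outcome lives in the body.
    """
    body = {"code": app_code, "message": message, "data": data}
    return 200, json.dumps(body)


def register(username, password, existing_usernames):
    """Toy registration handler following the envelope style."""
    if username in existing_usernames:
        return envelope(APP_USERNAME_TAKEN, "username already exists")
    if len(password) < 8:  # assumed complexity rule, for illustration
        return envelope(APP_PASSWORD_TOO_WEAK, "password must be at least 8 characters")
    return envelope(APP_OK, "created", {"username": username})
```

Under this style the client always sees 200 for a handled request and has to inspect `code` in the body to learn what actually happened.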

6 comments

No.

You should never, ever, ever return a 2xx status code with an error payload. The 2xx series means "Success" and you're abusing HTTP to use it to mean anything else.

With regard to application layer vs transport layer, the 404 Not Found, 406 Not Acceptable, 409 Conflict, and many other errors are specifically application layer codes. If you can't get a more specific code, 400 Bad Request with an expressive payload is useful. The 500 codes are generally transport layer.
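
A rough Python sketch of "status code plus expressive payload", assuming a toy `create_user` handler. The handler, the `error_response` helper, and the payload field names are hypothetical; only `http.HTTPStatus` comes from the standard library:

```python
import json
from http import HTTPStatus


def error_response(status, detail):
    """Return (http_status_int, body_json) with a human-readable payload."""
    body = {"error": status.phrase, "detail": detail}
    return int(status), json.dumps(body)


def create_user(username, existing_usernames):
    """Toy handler: the status code carries the class of outcome,
    the payload carries the specifics."""
    if not username:
        return error_response(HTTPStatus.BAD_REQUEST, "username is required")
    if username in existing_usernames:
        return error_response(
            HTTPStatus.CONFLICT, f"username {username!r} is already taken"
        )
    return int(HTTPStatus.CREATED), json.dumps({"username": username})
```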

The reason that APIs are getting easier and easier to understand and use is because we're using the same design patterns consistently and repeatedly. Therefore, the understanding you gain from the Twilio API can be applied to Stripe. If you use existing tools in new and "innovative" ways that are contrary to - instead of in addition to - their intention, you're setting back adoption and making your users re-figure out basic things. Don't.

Or worse, people don't realize how you're misusing the tools and build flawed systems.

(The one exception I'd give you is 202 Accepted which is generally used for "we've accepted this for now but processing and final validation may occur later" - it's a tentative success message.)

> You should never, ever, ever return a 2xx status code with an error payload. The 2xx series means "Success" and you're abusing HTTP to use it to mean anything else.

If the transport was successful, then it was an HTTP success.

> 404 Not Found, 406 Not Acceptable, 409 Conflict, and many other errors are specifically application layer codes

These are "application specific" to the resource only, not to the API as a whole. Nor are they particularly informative which is why the spec advocates incorporating a more specific error message. At best, these codes describe a class of errors, possibly useful for logging, but not sufficient for a nontrivial API's error semantics.

My point is not that the HTTP status codes don't mean anything, it's that they are misused to describe application level stuff and used to handle application layer flow control in a way that defeats the purpose and makes them more confusing than they are worth much of the time.

> because we're using the same design patterns consistently and repeatedly.

This is absolutely not the case, every team overloading HTTP status codes seems to do it their own quirky way.

The worst part is when people start bikeshedding about the subtle meaning/intent of HTTP status codes (as your point about 202 illustrates). The simple solution is to define the relevant application layer errors for a specific API and use the HTTP status codes for transport related stuff. It should not matter what 418 or 202 means because the subtle meanings they get in the context of one API/application should not be relevant.

Uh what? If the request was appropriately formatted, not too large, and generally acceptable to the server from a transport perspective, why shouldn't it succeed with a 200?

Error handling is an application level concern, not a transport level concern.

HTTP status codes lack the nuance required to inform the end user about errors in any non-trivial application. I personally enjoy the OP's solution. I've worked on dozens of REST APIs in my life, and I always prefer those that contain a response envelope complete with metadata and error responses.

Pre-emptive edit: When I say error handling is an application level concern, I don't mean "application" from an OSI model perspective, which is narrower in scope than what I'm talking about.

Expressive error messages are great. Calling errors a "success" is wrong.

Check out my slides here for more: https://speakerdeck.com/caseysoftware/12-reasons-your-api-su...

Slides 38-40 are specifically about expressive and useful error messages.

Your slides illustrate a custom response payload which happens to use codes that numerically match HTTP status codes. You also use a response message, I think we actually agree. I'd recommend using a different numerical namespace to avoid any possible confusion about the semantics of those codes. Just because they sound the same to one person reading the spec doesn't mean they will make sense to others.

Understood on that point. The response codes are in those payloads purely for illustrative purposes.

> appropriately formatted

If the server is expecting data that it doesn't receive (because the form field was left blank) then can you really consider the request "appropriately formatted"? Is it in any way meaningfully different from the case where the server receives a request on /api/invalid-url/ and doesn't have a corresponding resource on its end?

By that logic nearly all errors are formatting errors.

I disagree. HTTP is a widely known standard with a ready to use client/server implementation in basically any platform. Leveraging that semantically in an API seems to improve productivity all around. Are there any major trade offs?

The alternative is making standards for how to format the payload and trying to get everyone on board while also trying to manage another half a dozen or more legacy implementations of non-standards. Or don't make standards and live with the fact that every API will do its own thing and you have to handle everything differently each time you have to talk to anyone.

As a practical example, I would really prefer not to have to parse an XML response to get a status code in my JS frontend, nor would I like to examine it via string/regex matching.

> Are there any major trade offs?

Yes, see my other comments in this thread.

> HTTP is the transport layer

No, HTTP is the application layer. TCP is the transport layer.

> and any application specific scenarios are best handled with a custom error namespace that can be returned within any 2xx HTTP response.

No, generally application errors should be 4xx or 5xx codes; greater detail can be provided in the payload, sure, but a proper error condition should be returned, 4xx for errors resulting from the client request being unacceptable in some way, 5xx for other errors (which are, necessarily, server errors.)

(If you are tunneling another application-layer protocol over HTTP, your argument makes sense, but that's not the general case with APIs.)

> (If you are tunneling another application-layer protocol over HTTP, your argument makes sense, but that's not the general case with APIs.)

I'd argue that that is actually what most APIs are doing that are not purely "REST" operating naively on resources.

As we move up the protocol stack, one is the transport for the next. HTTP is not really the application layer protocol, it's a transport for a protocol defined by the API, at least for any API nontrivial enough to benefit from its own specific error/response codes.

    Many developers use 4xx status codes to indicate things
    like data validation errors or other things that are not
    part of the HTTP transport.

You mean like a form validation error? Surely an invalid request is a bad request. No?

Agreed, especially because it is an error. If I return a 2xx, the frontend code ends up with error logic inside its success logic. Why have two checks when I can have one?

> Surely an invalid request is a bad request.

Invalid to whom? Should a form submitted with a username that already exists in the system get a 500 response code? What's the server error?

I'd go with 409 Conflict, but I'd also accept 400 Bad Request

Incidentally, the RFC for 409 makes it quite clear that this touches the application layer:

> Conflicts are most likely to occur in response to a PUT request. For example, if versioning were being used and the representation being PUT included changes to a resource that conflict with those made by an earlier (third-party) request, the origin server might use a 409 response to indicate that it can't complete the request. In this case, the response representation would likely contain information useful for merging the differences based on the revision history.

https://tools.ietf.org/html/rfc7231#section-6.5.8

In any case, getting pedantic about what's "an error" and using 200 OK for everything that didn't fail at the transport layer is a super frustrating experience regardless of whether or not it's semantically correct. Please don't do that to consumers of your API.

Versioning a resource, per WebDAV, is a very specific application.

> using 200 OK for everything that didn't fail at the transport layer is a super frustrating experience regardless of whether or not it's semantically correct

The semantics of my application's protocol do not necessarily mirror the semantics of HTTP, nor are the descriptive statuses semantically similar, since most HTTP statuses are simply about transport (even though a few touch on the "application" of URIs as resources).

Transport layer to me means TCP/IP. HTTP sits above that, in the application domain. If a user sends an invalid request then the response should be in the 400 band. If the server failed in some way to deal with the request (e.g. the database is unavailable), then the response should be in the 500 band.

I would typically decide what my response code should be depending on what I think the response should be.

If a user tries to register a new account then a successful outcome would probably be a new user resource. If the user tries to register a username that already exists, I'd also probably go with 409 Conflict as the user is trying to create a resource that already exists.

There's no mention of WebDAV in that RFC.

Would it kill you to send back a 400 status code upon an error? Having to check for { isError: true } or something similar is obnoxious. I don't see the value gain of responding 200 OK when the operation was not successful.
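
To make the complaint concrete, here is a small Python sketch of what a client has to do under each style. Both handler functions are hypothetical; the `isError` field name is the one used in the comment above, and the assumption is that bodies are JSON:

```python
import json


def handle_status_style(status, body):
    """Client for an API that signals failure via the HTTP status code.

    One check on the status decides which branch runs.
    """
    if 200 <= status < 300:
        return ("ok", json.loads(body))
    return ("error", json.loads(body))


def handle_envelope_style(status, body):
    """Client for an API that answers 200 with a possible error inside."""
    if not (200 <= status < 300):
        return ("error", None)  # first check: the HTTP exchange itself failed
    payload = json.loads(body)
    if payload.get("isError"):  # second check: error carried in a 200 body
        return ("error", payload)
    return ("ok", payload)
```

The second handler needs both checks, because a 200 no longer guarantees that the operation worked.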

> when the operation was not successful.

Well, do you care about all the successful and unsuccessful aspects of TCP that underlie the connection? No, you just care that it was successful, allowing the HTTP request. Assuming that is successful, then your API "operation" can occur and return its result.

Some APIs I've seen just use 400 for all generic client-side errors, including request syntax errors, impossible requests, duplicate requests, etc.

I would argue that most of the time, for any sufficiently large application, you'll need to use application specific status codes anyway (as you said), so why bother trying to be specific with the HTTP error codes? Certain client-side applications parse out if the response is a 2xx, 3xx, 4xx, or 5xx, and log it differently. At which point you just need one of them to trigger the different logging behavior.
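
As a small illustration of clients that only look at the status class, here is a Python sketch. The function names and the mapping from class to log level are assumptions made up for this example, not the behavior of any particular client library:

```python
def status_class(status):
    """Reduce a status code to its class, e.g. 404 -> '4xx'."""
    return f"{status // 100}xx"


def log_level(status):
    """Pick a log level from the status class alone (assumed policy)."""
    levels = {"4xx": "warning", "5xx": "error"}
    return levels.get(status_class(status), "info")
```

With a policy like this, 400, 409 and 422 are all logged the same way, which is the point being made: any one 4xx code is enough to trigger the different behavior.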

The only special case I can think of is 401, which you need to send to trigger the basic authentication pop-up window for most browsers.

Exactly.

You could use 422.

422 is from the WebDAV spec, which is not HTTP.

A big portion of the improper overloading of HTTP status codes comes from WebDAV status codes seeming appealing when in fact WebDAV is a very specific set of functionality that is not really analogous to the way most REST APIs work... notably WebDAV offers locking semantics.

> WebDAV spec, which is not HTTP

HTTP is not a closed protocol. There will never be a single document defining all HTTP methods, status codes and such. Nothing in the definition of status code 422 makes it inapplicable to non-WebDAV applications. It is as much standard HTTP as status code 400 is (both are “proposed standards” in IETF terms).

> HTTP is not a closed protocol.

I'm not arguing it is, it's an extension that adds specific semantics to accomplish a specific purpose.

But some developers look at WebDAV and see a lot of similarity with some of the application specific error conditions they are working with and decide that overloading HTTP status codes is a good idea, when it rarely is.

Why do you think that?

An application-level failure is a "server problem". It's possible there is nothing between the OS handling TCP and your application.

This is also against the concept of REST and traditional HTTP servers, which do use HTTP error codes.

> It's possible there is nothing between the OS handling TCP and your application.

This does happen, and it results in the client having to have ad-hoc code to deal with a nearly infinite number of possible "Server error" scenarios, some of which may be normal functions of the API and others which may actually be server failures.

> It's possible there is nothing between the OS handling TCP and your application.

I'm not arguing that one shouldn't use HTTP status codes, I'm arguing that they should be used only for the standard meaning of the code, one should not need to consult a table provided by the API designer to determine which 4xx HTTP status codes warrant a retry, and which 5xx status codes are normal vs exceptional situations.
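
A sketch of a client that needs no per-API table, assuming the rule of thumb from the original post: 5xx means retry, 4xx means the client is at fault and retrying will not help. `send` is a hypothetical zero-argument callable returning `(status, body)`; a real client would also wait between attempts:

```python
def should_retry(status):
    """Decide from the status class alone: only 5xx is worth retrying."""
    return 500 <= status <= 599


def call_with_retries(send, max_attempts=3):
    """Call send() until it returns a non-5xx status or attempts run out."""
    status, body = send()
    for _ in range(max_attempts - 1):
        if not should_retry(status):
            break
        status, body = send()
    return status, body
```

This only works if the server never uses 5xx for routine application outcomes, which is the concern raised above.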

Suppose you make a call to an API and it returns 5xx because you provided a parameter value that is valid in the URL or payload but invalid in the app (suppose your app doesn't allow usernames that are profane words, for example). The response should not be 5xx, it should be 2xx with a message indicating the application's preferred range of values. 5xx means that something went wrong on the server trying to fulfill the request, 2xx means that the request was fulfilled properly, and any additional info relevant to the client should be passed along as part of that 200 response.

> This is also against the concept of REST

Very few APIs can be designed well as 100% REST. Lots of very typical scenarios necessitate jumping through a lot of hoops to use REST in a pure way. REST fights with database normalization in some cases and with many authentication scenarios. It can also require the client to maintain a fairly elaborate (and brittle) representation of server-side state that is mostly unnecessary.

REST makes sense when there is a perfect alignment between REST verbs and data flows, but when you get into a situation where it's necessary to make multiple API calls to do one logical operation, it would probably have been simpler not to use REST in the first place and to have just designed a simple, clean, non-REST API.

> Suppose you make a call to an API and it returns 5xx because you provided a parameter value that is valid in the URL or payload but invalid in the app

That is wrong. Client mistakes should get 400-level error codes. A 200-level code indicates the request completed successfully, which it didn't.

There is no difference between an "HTTP server" and an app. They are often the same thing, like when using Apache to serve files from disk.

> I'm not arguing that one shouldn't use HTTP status codes, I'm arguing that they should be used only for the standard meaning of the code

Of course you should use the error codes correctly! Don't return a 200 code when the request failed.

> Don't return a 200 code when the request failed.

The request didn't fail, it simply followed application logic and returned a successful response indicating the nature of the application logic to the client. That is not an error condition, it's just application logic.

And said logic determined your input was invalid, HTTP 400 Bad Request is a perfectly reasonable response.

If you want to use HTTP as nothing more than a transport layer and return 200 OK's all day then fine, but why are you even using HTTP at that point?

Unless I'm doing some sort of pure REST API I'm simply using HTTP as a transport protocol for my application specific protocol.

If HTTP delivers my application protocol's message successfully, it's an HTTP 2xx, but my application's protocol may have a range of statuses that are not "transport-like" in the way HTTP status codes are, which are the main things that API consumers care about, assuming the basic transport is working.

Every web server I've ever used returns a 500 server error when the transport succeeds but it encounters an exception in application level code. Are you really proposing that's incorrect behavior? I also can't imagine a use for 403 Forbidden and 401 Unauthorized that isn't "application level" logic.

If you think about it pretty much 99% of requests the server receives could be correct if the "application logic" was implemented differently. You can't really decouple the two in any meaningful way.

If you want to return 200 OK status codes all day then why are you using HTTP in the first place?

(Also, HTTP is an application protocol)

So then if my application logic catches a thrown exception because I don't want my app to crash every time a server bug is exposed, I should be returning a 200 OK response rather than a 500 Internal Server Error, because the app logic accounts for handling exceptions and returns a response.

I can't disagree more. It just seems like clients would have a nightmare of a time debugging client-side code against such a system.

If it was an error you expected, then you should handle it using application logic. If you use a programming language with exceptions and one gets thrown and you use it for flow control, that doesn't mean the server is broken.

The API consumer doesn't need to think there is a problem, since you might have ways of handling it in a more subtle way. Maybe when your DB connection times out you send an application specific message that includes the number of milliseconds to wait for a retry. You don't have to let the shit hit the fan and return a 5xx any time something slightly exceptional occurs, 5xx is for things that are truly exceptional and for which you have no helpful information for the consumer about what to do.
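
A rough Python sketch of the "tell the client how long to wait" idea from this comment. The `DB_TIMEOUT` code, the `retry_after_ms` field, and both function names are hypothetical, chosen only for illustration:

```python
import json


def db_timeout_response(retry_after_ms):
    """Server side: answer an expected, recoverable condition without a 5xx."""
    body = {
        "code": "DB_TIMEOUT",              # hypothetical application code
        "retry_after_ms": retry_after_ms,  # hint for the client
    }
    return 200, json.dumps(body)


def next_delay_seconds(status, body):
    """Client side: return seconds to wait before retrying, or None."""
    if status != 200:
        return None
    payload = json.loads(body)
    if payload.get("code") == "DB_TIMEOUT":
        return payload["retry_after_ms"] / 1000
    return None
```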

While I'm inclined to agree with you, the makers of SOAP don't seem to share your opinion. Any SOAP fault must be sent with an HTTP 500 according to the spec.

My take on SOAP's approach is that it isn't overloading HTTP status messages, it's attempting to impose SOAP's message structure on things that are legitimately HTTP 5xx responses, so that the SOAP client can handle some of those failure modes and the programmer doesn't have to engineer another protocol for dealing with transport errors or parse a different message format.

Notably, SOAP is not just doing REST; it is attempting to offer an abstraction for dealing with transport issues.

Most people designing a REST or RPC style API do not attempt this breadth in their API design, instead they just overload HTTP response codes (from the HTTP and WebDAV specs) to handle application-specific stuff, which ends up resulting in the codes being effectively meaningless across applications.

SOAP, on the other hand, is meant to be used in the same way across applications. So if your app uses SOAP to consume multiple SOAP APIs, and you engineer failure handling for the HTTP 500 associated with some kind of SOAP fault, chances are you can reuse this logic with all of the SOAP APIs.

By contrast, many REST APIs that overload HTTP statuses do it in some arbitrary way defined by the team working on the API, so the consumer can't make generalizations about the meaning of the error codes in the common cases where one might determine (for example) a retry to be necessary.

> Most people designing a REST or RPC style API do not attempt this breadth in their API design, instead they just overload HTTP response codes (from the HTTP and WebDAV specs) to handle application-specific stuff, which ends up resulting in the codes being effectively meaningless across applications.

HTTP status codes exist to be overloaded, especially in the 4xx range. As per the IETF: https://tools.ietf.org/html/rfc7231#section-6.5

> HTTP status codes exist to be overloaded

If you look at most of the status codes they pertain to the transport itself: content negotiation, content size, etc.

In other words, the client would be pleased to see one of those codes when making a request and wondering "Why is this not returning the data I expect?".

Compare this to a typical scenario in an application of signing up a user. If the user types in a password that does not meet the password complexity requirements or chooses a username that is already taken, there was not a problem with the transport mechanism, there is an application-specific constraint that the programmer calling the API did expect to encounter some of the time, because of course both of these are common scenarios.

If for some reason the application code does not limit usernames to a small number of characters and the API returns an HTTP 413 status, that is unexpected to the programmer, so the non-200 code is appropriate. The programmer needs to rethink the transport assumptions assuming that super long usernames are OK.

The distinction is that HTTP errors apply to HTTP itself. Are the basic requests and responses functioning properly, etc. But application-specific things generally belong in a separate layer of metadata.

I've seen some very very bad API designs that totally misuse HTTP status codes, largely influenced by the old jQuery pattern of a success and failure callback, with 2xx triggering the success callback and 4xx and 5xx triggering the failure callback. How many times has some server error resulted in confusion because the error callback is being used to handle routine flow control in the application code?

I'm relatively agnostic about this, but having developed integrations for many APIs I generally prefer those that use HTTP status codes + error payload for client side errors, rather than 200 OK + possible error payload, as they're much easier to deal with on the client side.

There are no unexpected status codes if your API is fully documented. I gave a short talk about this last year: https://www.youtube.com/watch?v=9w4dNi2wu_E&feature=youtu.be...

> I generally prefer those that use HTTP status codes + error payload

I have less of a problem with that approach, but I do think that it can be confusing if the consumer reads too much into the meaning of the HTTP status codes. In general, sometimes the HTTP status code adds useful information but ideally the application specific error modes would cover all non-transport error conditions nicely.

> There are no unexpected status codes if your API is fully documented.

Totally agree with this, documentation is the most important thing.