by that_james 1430 days ago
> People rarely assume there’s an API contract in error messages

But then what's the point of an API contract if it's not describing the returned data? What I'm arguing is the opinionated payload provides a lot of the same value. Or am I missing something?

The monitoring systems would naturally not be stoked to see 2xx codes containing errors, but I'd just ping the monitoring system out of my application server anyways. Not sure if that's better or worse though.

I've done little to no platform engineering and I may be gravely underestimating the consequences of doing this, but it works well with Prometheus.

Perhaps I am a fool though :) but how else would I find out if I didn't put my ideas out there :D

1 comment

Not who you're replying to, but I have input.

  But then what's the point of an API contract if it's not describing the returned
  data? What I'm arguing is the opinionated payload provides a lot of the same 
  value. Or am I missing something?
I guarantee you that you have not actually contracted your error messages. You'll fix a typo, you'll change the phrasing to better describe the problem, you'll need to add more info for ancillary problems, etc. I doubt you'll bump the rev on your API when "The resource you requested was not found" gets tweaked to "We could not find anything by that ID!" in the name of 'friendliness', but that's what you would need to do for me as a client. Otherwise I have to ship a hotfix, because you changed your error contract and now my users are seeing "Something went wrong, our engineers are looking at it" instead of my nice branded "404 - Not Found" page.
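To make that concrete, here's a rough sketch of the two client styles. The function names and the `message` field are made up for illustration; the point is only that the status code survives a copy edit and the message text doesn't.

```python
from http import HTTPStatus

GENERIC_PAGE = "Something went wrong, our engineers are looking at it"
NOT_FOUND_PAGE = "404 - Not Found"


def page_for_status(status: int) -> str:
    """Pick the page from the status code, which the HTTP spec pins down."""
    if status == HTTPStatus.NOT_FOUND:
        return NOT_FOUND_PAGE
    if 200 <= status < 300:
        return "ok"
    return GENERIC_PAGE


def page_for_message(body: dict) -> str:
    """Brittle variant: matches on human-readable text nobody promised to keep."""
    # Hypothetical "message" field; any reworded string falls through.
    if body.get("message") == "The resource you requested was not found":
        return NOT_FOUND_PAGE
    return GENERIC_PAGE


old_body = {"message": "The resource you requested was not found"}
new_body = {"message": "We could not find anything by that ID!"}

# The status-based client is unaffected by the rewording.
print(page_for_status(404))
# The message-based client silently degrades to the generic page.
print(page_for_message(old_body))
print(page_for_message(new_body))
```

The second client is the one that needs the hotfix the moment the wording changes.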

  You try employee 1, fantastic, it works!
  You try employee 100, not fantastic, it 404’d.
  Huh?
  Why do I get a 404 here? The path is clearly correct, otherwise employee 
  1 wouldn’t have worked either.
  “Ah”, you may be thinking “but it clearly means that the employee wasn’t found!”
  No, there’s nothing clear about that. If I were to call /api/v11/employees/1 
  I would get the exact same error. As an API consumer, all I want to do here 
  is raise my middle finger.
  But as an API producer, this results in a conundrum: What am I supposed to do then?
To start off with, stop worrying about your clients fat-fingering your API namespace. That is not your concern. You don't give a shit if they spend a _week_ hammering the wrong domain and getting 404s, why the hell would you care about them hitting a non-existent v11? You don't do anything beyond directing this confused user to your docs.

If for _some reason_ you want to give more info for requests to `/api/v11/*` or whatever other non-paths you want to handle, just serve a 400 response back and let your consumers figure out what they screwed up; but I argue you have more important things to do with your time.
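If you did go that way, it could look something like the sketch below. Everything here is an assumption for the sake of the example: the route shape is taken from the quoted `/api/v11/employees/1`, and the supported-version set, the in-memory employee store, and the error body fields are invented.

```python
import re
from typing import Dict, Tuple

SUPPORTED_VERSIONS = {"v1"}                 # assumption: only v1 exists
EMPLOYEES = {1: {"id": 1, "name": "Ada"}}   # stand-in for a real data store

_ROUTE = re.compile(r"^/api/(v\d+)/employees/(\d+)$")


def handle(path: str) -> Tuple[int, Dict]:
    """Return (status, body) for a GET on an employee path."""
    match = _ROUTE.match(path)
    if match is None:
        # Not a path this service knows about at all.
        return 404, {"error": "no such route"}

    version, employee_id = match.group(1), int(match.group(2))
    if version not in SUPPORTED_VERSIONS:
        # Well-formed path, but the client asked for a version that doesn't exist.
        return 400, {"error": f"unsupported API version {version!r}"}

    employee = EMPLOYEES.get(employee_id)
    if employee is None:
        # Route and version are fine; the resource itself is missing.
        return 404, {"error": f"no employee with id {employee_id}"}

    return 200, employee


for path in ("/api/v1/employees/1", "/api/v1/employees/100", "/api/v11/employees/1"):
    print(path, handle(path))
```

With this, employee 100 and version v11 no longer produce the same response, and nothing had to be tunneled through a 2xx to get there.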

Also, why are you worried about differentiating your server errors from client network failure? That's for the client app developer to handle. Don't worry about it. Your obligation begins and ends with a connection to your service.

  Opinionated payloads should be mandatory
  Returning a 2xx code immediately tells the client that the HTTP response
  contains a payload that they can parse to determine the outcome of the 
  business/domain request. That is to say
    - client checks HTTP response is valid (2xx status)
    - client can confidently parse the response and make a domain oriented
      decision, as opposed to a technical one
  This makes your client happy. Very, very happy. Using our above examples, 
  here is what we would see:
If I was reviewing an API for an integration and I ran across this blog post and/or descriptions of this behavior in the API docs, your service would go straight into the "won't integrate with" pile. I'm simply not interested in the problems this paradigm will generate for us. I've gone down this road many times; these days it's use the HTTP spec or GTFO.

  My API is clean, easy to understand and easy to debug. A client no longer
  needs to send me a request to ask for clarity on an endpoint that sometimes
  returns a 200 and other times returns a 404.
Ah, there's the nut. Instead of adding descriptive error bodies to your 404 responses you threw the paradigm out the window and added descriptive error bodies to 200 OK responses. If you're not providing API packages for your users to hide this unexpected behavior, they are not "very, very happy". No one is "very, very happy" as they add yet more Magic Strings with which to infer what their remote resource means when it says "200 Success: Failure".
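For what a descriptive error body on a 404 might look like, here's a minimal sketch. The shape is my pick, not something the blog post or this thread specifies: it borrows the `type`/`title`/`status`/`detail` members and the `application/problem+json` media type from RFC 9457 (Problem Details for HTTP APIs). The helper names are hypothetical.

```python
import json
from typing import Dict, Tuple


def problem_response(status: int, title: str, detail: str,
                     type_uri: str = "about:blank") -> Tuple[int, Dict[str, str], str]:
    """Build (status, headers, body) for an error, problem-details style."""
    body = {"type": type_uri, "title": title, "status": status, "detail": detail}
    headers = {"Content-Type": "application/problem+json"}
    return status, headers, json.dumps(body)


def describe(status: int, raw_body: str) -> str:
    """Client side: branch on the status code first, then read the detail."""
    if 200 <= status < 300:
        return "success"
    return json.loads(raw_body)["detail"]


status, headers, raw = problem_response(404, "Not Found", "No employee with id 100")
print(status, headers["Content-Type"])
print(describe(status, raw))
```

The client still gets the clarity the blog post wants, and monitoring, caches, and generic HTTP tooling still see a 404.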