You assume that the thread pool which takes requests is the very same one that gets bogged down. It might very well be something down the line in the processing pipeline, and unless you poll these stages or they signal their capacity through the pipeline, you can't know that.
You're wasting resources when your application is already in a state where it knows it won't be able to handle the request. Eventually the memory taken by the partially processed requests is going to exceed what you can take in (unless you cap the number of concurrently processed requests, which is also an inelastic backpressure of sorts) and the service will crash.
What you mentioned is decent for inelastic, blocking, synchronous processing (you can have at most X concurrent requests, because that's how many processing threads you've configured based on performance tests and production monitoring), but if it's async you can relatively easily fill up an internal queue somewhere.
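To make the "cap the number of concurrently processed requests" point concrete, here is a minimal sketch in Python. The `BoundedIntake` class and its method names are made up for illustration; the only real API used is the standard library's `queue.Queue`. The idea is that once the internal queue is full, the request is refused up front instead of being accepted and left to pile up in memory.

```python
import queue


class BoundedIntake:
    """Hypothetical intake stage: accepts work only while the queue has room."""

    def __init__(self, capacity):
        # maxsize caps how many accepted-but-unprocessed requests can exist.
        self._pending = queue.Queue(maxsize=capacity)

    def try_accept(self, request):
        """Return True if the request was queued, False if it was shed."""
        try:
            self._pending.put_nowait(request)
            return True
        except queue.Full:
            # Refuse early; the caller could answer with e.g. HTTP 503 here.
            return False

    def take(self):
        """Hand the oldest pending request to a worker."""
        return self._pending.get_nowait()


intake = BoundedIntake(capacity=2)
# Four requests arrive while no worker is draining the queue.
results = [intake.try_accept(f"req-{i}") for i in range(4)]
```

This is the inelastic kind of backpressure: the limit is a fixed number you picked, and it only protects the stage that owns the queue. It says nothing about stages further down the pipeline.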
Blocking a whole host OS thread just because a read or a write on one socket out of thousands is not ready is sometimes unacceptable and at other times merely wasteful.
We solve this problem with APIs for asynchronous or nonblocking IO.
But such APIs must be cleverly designed if they are to let you propagate backpressure from the downstream end of your program's dataflow to the upstream end, handle errors in a sane way, and so on.
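As one possible illustration of backpressure travelling upstream through a nonblocking API, here is a small sketch using Python's `asyncio.Queue` with a `maxsize`. The stage names (`producer`, `relay`, `slow_sink`), the capacity, and the sleep duration are all assumptions for the example, not anything from a real service. When the sink is slow, the queue in front of it fills, the relay's `put` suspends, the first queue fills, and finally the producer's `put` suspends too, all without parking an OS thread.

```python
import asyncio


async def producer(out_q, n, log):
    for i in range(n):
        # Suspends this coroutine (not an OS thread) while out_q is full.
        await out_q.put(i)
        log.append(("produced", i))
    await out_q.put(None)  # sentinel: no more items


async def relay(in_q, out_q):
    while True:
        item = await in_q.get()
        # If the sink is behind, this waits, which in turn lets in_q fill up.
        await out_q.put(item)
        if item is None:
            return


async def slow_sink(in_q, log):
    while True:
        item = await in_q.get()
        if item is None:
            return
        await asyncio.sleep(0.001)  # simulated slow downstream work
        log.append(("consumed", item))


async def run_pipeline(n=20, capacity=2):
    q1 = asyncio.Queue(maxsize=capacity)
    q2 = asyncio.Queue(maxsize=capacity)
    log = []
    await asyncio.gather(producer(q1, n, log), relay(q1, q2), slow_sink(q2, log))

    # Track how far the producer ever got ahead of the sink.
    produced = consumed_count = max_lead = 0
    for kind, _ in log:
        if kind == "produced":
            produced += 1
        else:
            consumed_count += 1
        max_lead = max(max_lead, produced - consumed_count)
    consumed = [item for kind, item in log if kind == "consumed"]
    return consumed, max_lead


consumed, max_lead = asyncio.run(run_pipeline())
```

In this sketch the producer can never be more than `2 * capacity + 2` items ahead of the sink (two full queues plus one item held by each of the relay and the sink), so memory use stays bounded no matter how many items are offered. Error handling is left out on purpose; a real pipeline would also need a way to propagate failures and cancellation.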