Rate limits are of course fine. I think token-level limits make more sense, because application-level limits force the consumer either to track a rate window across asynchronous processes or to make the calls synchronously. But, I mean, that's fine too. I think it's just like OAuth1a: fine in itself, but add enough of these things on top of one another and you've created a technical hurdle that's just too difficult to leap.
I would think a user-token-level limit would prevent this, although I could imagine a case where a bug simultaneously affected all user tokens. Even so, I'd imagine you'd set that app-level limit pretty high, because otherwise you'd be making life very difficult for legitimate use cases.
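
To make the bookkeeping difference concrete, here's a rough sketch of a sliding-window limiter in Python. Everything in it is made up for illustration: the `SlidingWindowLimiter` name, the limits, and the 60-second window are my assumptions, not any real API's numbers. The only thing that changes between the two cases is the key you count against.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Hypothetical helper: allow at most `limit` calls per `window` seconds per key."""

    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the window can be tested without sleeping
        self.calls = defaultdict(deque)  # key -> timestamps of recent calls

    def allow(self, key):
        now = self.clock()
        recent = self.calls[key]
        # Drop timestamps that have fallen out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.limit:
            return False
        recent.append(now)
        return True


# Illustrative numbers only.
per_token = SlidingWindowLimiter(limit=2, window=60)
app_wide = SlidingWindowLimiter(limit=3, window=60)

for token in ["alice", "alice", "bob", "bob"]:
    # Token-level: each user has an independent budget.
    # App-level: every call, for every user, draws from one shared budget.
    print(token, per_token.allow(token), app_wide.allow("app"))
```

With a token-level key, a worker that only handles one user's calls can keep this state locally. With the single app-level key, every asynchronous process has to read and update the same `calls` entry, which in practice means a shared store or serializing the calls, and that's the hurdle I mean.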