I’m stuck trying to figure out if this is technically desired behavior or not. If you were retroactively designing HTTP/2 with this knowledge, would you have done anything differently?
Systems that require very high cognitive load on their human operators (whether machines, programming languages, etc.) are always destined to fail. Human beings are not good at doing boring, repetitious work that requires them to stay focused: lapses will occur. And hackers are going to find and exploit those gaps. So the best way to avoid those problems is to build solutions or specifications in which those gaps are not even possible.
Modern software systems are some of the most complex systems that humans have ever invented--and they just keep getting more complex over time as we layer new things on top of them. Think of a really high Jenga tower with lots of holes in the base.
That means we need to strive to keep things simple. That may mean making decisions that rule out common or high-impact failure cases entirely, even if doing so is a little more difficult or eliminates an esoteric use case. This is even more true when writing specifications that others will implement and/or be expected to conform to.
I guess the parent post wants to know if there are any _specific_ and _effective_ changes for this kind of attack.
In this case, simplifying the protocol won't help -- the vulnerability is:
1. Backend servers can't cancel work immediately (this is not a protocol problem)
2. The client can make concurrent requests on a connection (this is the goal of HTTP/2)
3. The concurrency limit is predetermined; there is no way for the server to throttle without a user-visible error.
4. The client can cancel any request mid-flight (removing this is equally bad, security-wise)
Unless you are removing the concurrency, making the protocol simpler won't fix it.
Protocol designers need an adversarial mindset, not a simpler one.
If I understand it correctly, a big part of the problem is that 1) requests which are in the process of being cancelled are not counted towards the concurrency limit, and 2) you can create and cancel a request in the same packet.
1 allows you to have more pending requests than intended, making some form of DDoS possible. 2 allows it to trivially scale to hundreds of requests per packet rather than just the pending-stream limit, limited only by the packet size.
Disallowing 2 should be fairly trivial, as there is no valid reason to cancel a request in the same packet you started it. I'd consider it more of an implementation bug than a protocol problem.
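A rough sketch of what "disallowing 2" could look like on the server side. Everything here is hypothetical: `Frame` and `streams_reset_in_same_batch` are illustrative names, not from any real HTTP/2 library, and since the HTTP/2 layer sees a byte stream rather than packets, I'm assuming a "batch" means the frames parsed out of a single read.

```python
from collections import namedtuple

# Hypothetical, simplified frame: just a type name and a stream ID.
Frame = namedtuple("Frame", ["type", "stream_id"])

def streams_reset_in_same_batch(frames):
    """Return stream IDs that were opened and then reset within one batch."""
    opened = set()
    flagged = []
    for frame in frames:
        if frame.type == "HEADERS":
            # A HEADERS frame opens the stream (i.e. starts a request).
            opened.add(frame.stream_id)
        elif frame.type == "RST_STREAM" and frame.stream_id in opened:
            # Cancelled in the same batch it was created in.
            flagged.append(frame.stream_id)
    return flagged

batch = [
    Frame("HEADERS", 1),
    Frame("RST_STREAM", 1),   # opened and cancelled together: suspicious
    Frame("HEADERS", 3),      # opened, still running
    Frame("RST_STREAM", 5),   # cancels a stream opened in an earlier batch
]
suspicious = streams_reset_in_same_batch(batch)
```

What the server does with the flagged streams (count them against a budget, close the connection, etc.) is a separate policy decision that this doesn't try to answer.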
Issue 1 is definitely a protocol problem though, and it's going to be a bit trickier to fix as it would require nontrivial changes to the request state machine. A fix would require subtracting a request from the pending-stream count not when it is cancelled, but when its resources have been fully cleaned up - and ideally you'd even add some sort of throttling on that to make it even harder to abuse.
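To make the accounting change concrete, here is a minimal sketch, assuming a server that tracks streams in two sets. `StreamAccounting` and its method names are made up for illustration and do not correspond to any real implementation. The point is only that a cancelled stream keeps occupying its slot until the backend reports that cleanup is done.

```python
class StreamAccounting:
    """Hypothetical per-connection bookkeeping for the concurrency limit."""

    def __init__(self, max_concurrent):
        self.max_concurrent = max_concurrent
        self.active = set()       # streams still being served
        self.cleaning_up = set()  # cancelled, but resources not yet released

    def in_use(self):
        # Streams being torn down still count towards the limit.
        return len(self.active) + len(self.cleaning_up)

    def open_stream(self, stream_id):
        if self.in_use() >= self.max_concurrent:
            return False  # refuse: limit reached
        self.active.add(stream_id)
        return True

    def cancel_stream(self, stream_id):
        # Cancelling moves the stream to cleanup; it does not free the slot.
        if stream_id in self.active:
            self.active.remove(stream_id)
            self.cleaning_up.add(stream_id)

    def cleanup_finished(self, stream_id):
        # Only now is the slot actually released.
        self.cleaning_up.discard(stream_id)

acct = StreamAccounting(max_concurrent=2)
first = acct.open_stream(1)
second = acct.open_stream(3)
acct.cancel_stream(1)
refused = acct.open_stream(5)   # slot for stream 1 is still held
acct.cleanup_finished(1)
accepted = acct.open_stream(5)  # slot is free again
```

With this scheme, a client that opens and immediately cancels streams can never have more than `max_concurrent` requests consuming backend resources at once. The extra throttling mentioned above would sit on top of this and is not shown.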