| Zooming out a little, the fundamental problem here is broadly recognized as a modern protocol design challenge. To phrase the consideration roughly: the response to a request should not require more resources than the client has already spent to request it, either in terms of bandwidth or processing (including memory, storage IO bandwidth, etc.). Obviously, in some cases such design is not possible. The classic case is HTTP, where the entire purpose is to supply some (arbitrarily large) volume of data in response to a small request, and therefore there is a bandwidth challenge. Conventional defense strategies tend to utilize the fact that TCP requires a three-way handshake to instantiate, thus validating the peer's IP address (unlike UDP), and include: (1) An authenticated session, eg. using a session key derived from a separate API call. (2) Rate limiting per authenticated user, either based upon data over time or request frequency. (This alone is the subject of the article) (3) Segregating read-only, cacheable data (even if it expires within seconds) on separate infrastructure such as CDNs or memory-based caches. (4) Aggressive use of HTTP caching. (5) Careful tuning of HTTP session length related configuration to suit the application profile. A newer strategy is the use of captcha, however this is not viable in automated (ie. API) use cases. Another relatively 'new' (for HTTP) strategy is the use of websockets or other mechanisms for real time 'push', to avoid the latency and processing overheads of the conventionally enforced HTTP request-response model. Additional options would include segregating clients to speak to different servers (ideally in different data centers, on different links) such that overhead may be scaled across different infrastructure. Thus even if a single server is targeted by an attacker damage is limited in scope. Another architectural defense would be the enforcement of a gateway/proxy layer (internal or third party) obscuring real datacenter netblocks from attackers, however this comes at the cost of latency where data cannot be cached. Cloudflare basically provide all of the above (plus additional features) as a service. Finally, in native mobile application development where an API is the primary interface with some central system, another simple step that can be taken (with appropriate care regarding peer authentication and cache invalidation) is the use of a cached set of IP addresses within the client as a means to identify servers. In this way, attacks against DNS infrastructure will also be nullified, and you can focus the demands of segments of your user base on different infrastructure. (Here in China, DNS is often hijacked or broken, though this is much less of a concern in typical western environments. It will also reduce startup latency on slow mobile links the world over.) |