by beachy 876 days ago
If the provider is bearing the costs (like here) then they always need some kind of authorization, or they have no way to shut off abusers or people with misbehaving clients.

An API key is about the simplest possible way to achieve that, and appears to be perfectly adequate in this case.

What do you suggest? SAML?


> If the provider is bearing the costs (like here) then they always need some kind of authorization, or they have no way to shut off abusers or people with misbehaving clients.

HTTP is an "API" that has no API keys and all the public web servers in the world seem to manage this without any trouble.

> What do you suggest? SAML?

No authentication required by default -- it's public data. Just impose a reasonable rate limit by IP address and require registration only if someone has a legitimate reason to exceed that.
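To make that concrete, here is a minimal sketch of the idea, assuming a token bucket per client IP and a separate, larger bucket for registered keys. The class names, the limits, and the `registered_keys` set are made up for illustration; nothing here is what the service in question actually runs.

```python
import time


class TokenBucket:
    """Allows `rate` requests per second on average, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, now):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start full so a new client can burst
        self.updated = now

    def allow(self, now):
        # Refill in proportion to the time elapsed, capped at capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class RateLimiter:
    # Hypothetical limits: (requests per second, burst size).
    ANON = (1.0, 5.0)
    REGISTERED = (50.0, 100.0)

    def __init__(self, registered_keys=()):
        self.registered_keys = set(registered_keys)
        self.buckets = {}

    def allow(self, ip, api_key=None, now=None):
        now = time.monotonic() if now is None else now
        if api_key in self.registered_keys:
            # Registered clients are limited per key, at the higher rate.
            ident, (rate, capacity) = ("key", api_key), self.REGISTERED
        else:
            # Everyone else is limited per source IP, no sign-up needed.
            ident, (rate, capacity) = ("ip", ip), self.ANON
        bucket = self.buckets.get(ident)
        if bucket is None:
            bucket = self.buckets[ident] = TokenBucket(rate, capacity, now)
        return bucket.allow(now)


limiter = RateLimiter(registered_keys={"k1"})
anonymous = [limiter.allow("203.0.113.7", now=0.0) for _ in range(6)]
```

In this sketch the sixth anonymous request in the same instant is refused, and the same address gets one more request through a second later. A real deployment would also need to evict idle buckets and decide what to do about clients behind a shared NAT.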

> all the public web servers in the world seem to manage this without any trouble

Incorrect. Most large web sites invest in DDoS protection, e.g. Cloudflare.

Cloudflare's DDoS protection, for example, is a lot more sophisticated than merely counting requests per source IP (https://developers.cloudflare.com/ddos-protection/about/how-...).

Cloudflare is one of the ways they manage it.

But API keys aren't any good for that anyway. If someone is just trying to overload your service by brute force, they can send requests regardless of whether the keys are valid. That still uses up your bandwidth sending error responses, or your CPU and memory opening new connections before the keys are validated. To avoid that you'd still need some kind of DDoS protection.

Where API keys actually do something is where you're doing accounting, because then if someone wants to send you a million requests, you don't block them, you just process them and send them a bill. Maybe you block them if they reach the point where you don't expect them to be able to pay. But if it's a free service that anybody can sign up for as many times as they want, then that doesn't do any good, because the price is $0 and a rate limit per key is avoided by signing up for arbitrarily many more keys.
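A toy model of both cases, as a rough sketch only: `KeyedApi`, `send_with_key_rotation`, the quota of 100, and the price of 1 cent per request are all invented for the example.

```python
from collections import Counter


class KeyedApi:
    """Toy API that tracks usage per key. Prices are in whole cents."""

    def __init__(self, quota_per_key, price_cents=0):
        self.quota_per_key = quota_per_key
        self.price_cents = price_cents
        self.usage = Counter()
        self.issued = 0

    def sign_up(self):
        # Sign-up is free and unlimited, as in the scenario above.
        self.issued += 1
        return f"key-{self.issued}"

    def request(self, key):
        # Paid service: never block, just meter.
        # Free service: enforce the per-key quota.
        if self.price_cents == 0 and self.usage[key] >= self.quota_per_key:
            return False
        self.usage[key] += 1
        return True

    def bill_cents(self, key):
        return self.usage[key] * self.price_cents


def send_with_key_rotation(api, n):
    """Send n requests, signing up for a fresh key whenever one is exhausted.

    Assumes quota_per_key >= 1, so a fresh key always serves a request.
    """
    key = api.sign_up()
    served = 0
    for _ in range(n):
        if not api.request(key):
            key = api.sign_up()
            api.request(key)
        served += 1
    return served, api.issued


# Free service: the per-key quota only costs the client extra sign-ups.
free_api = KeyedApi(quota_per_key=100)
served, keys_used = send_with_key_rotation(free_api, 1000)

# Paid service: the same traffic on one key simply becomes a bill.
paid_api = KeyedApi(quota_per_key=100, price_cents=1)
paid_key = paid_api.sign_up()
for _ in range(1000):
    paid_api.request(paid_key)
owed = paid_api.bill_cents(paid_key)
```

With these made-up numbers the free service ends up serving all 1000 requests across 10 keys, while the paid one serves them on a single key and records 1000 cents owed.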

> HTTP is an "API" that has no API keys and all the public web servers in the world seem to manage this without any trouble.

Um, no. That’s just not true.

We're currently using a discussion forum where nobody signed up for an API key in order to make posts, and you don't even need a user account in order to read. What allows them to sustain this without being destroyed by evil forces?

> nobody signed up for an API key in order to make posts

Yes you did. When you logged in, they gave you an API key in the form of a cookie that you include with every request.

And it's run at a loss by Y Combinator, which is very, very wealthy. And even Hacker News has to pay for Cloudflare and mods, on top of hardware, hosting, and traffic.

> When you logged in, they gave you an API key in the form of a cookie that you include with every request.

You can read this website (i.e. make queries against its database) without logging in. Moreover, the main thing the cookie does is not some kind of rate limiting or denial of service protection, it's assigning your username to your posts so that others can't impersonate your account. Various image boards exist that even allow you to post without logging in and they seem to be fine with it.

> You can read this website (i.e. make queries against its database) without logging in

Yeah, but the sentence I replied to was "nobody signed up for an API key in order to make posts". That claim was false. Being able to read the website is a totally different topic.

There's a rate limiter that kicks in if you try to post or do other things as a logged-in user too fast.

Probably either the lack of evil forces currently attempting to destroy it or Cloudflare.

So we've established that it isn't API keys.

Per-IP limits don't do anything about the scenario where the API is integrated into a third-party website that sees a sudden spike in popularity. At that point, the API is providing free capacity to the third-party site. Maybe that is fine, but you seem to be ignoring the possibility.

Because it's fine. That's what it's for, isn't it? The public, via some website, is requesting the government data their tax dollars have paid for.

Which allows that website (or app) to operate with minimal resources, e.g. by a non-profit or open source project, instead of having to be a for-profit entity which needs some underhanded way to generate revenue in order to display the "free" data.