Why does there need to be a paid API, when reddit provides all of the content over a free HTTP API to user agents already? Are these apps more than just a custom user agent?
There are two Reddit APIs - the public REST API and a private GraphQL API which limits access. Third-party apps use the REST API and the Reddit website/app use GQL.
For a hobby project you could maybe get away with scraping a GQL bearer token and issuing requests as if you were an official Reddit client. Or you could even request the HTML and scrape that. At the scale of these third party apps that approach just wouldn't work.
And let me just say, reverse engineering the GQL API is a massive pain. From my tests they verify a ton of things or else you get rate limited or "trusted less" - order of headers, formatting of JSON being posted, someone even said they check the TLS handshake signatures. GQL specifically has become a massive burden for me working on Libreddit, and I probably am going to give up on GraphQL and just have instances provide their own API token. The good news is there might be ways to auto-retrieve these tokens. We'll have to see in 17 days.
> At the scale of these third party apps that approach just wouldn't work.
Have there been any App Store apps that take this approach?
As long as Reddit provides some API accessible to a non-logged-in user on the web, there’s going to be a way to scrape it. If you push that scraping into the start of the app then it’d be distributed across users' devices, without any clear way of blocking it.
You could even have the app fetch “how to” updates from a central site rather than pushing app updates, so you don’t have to wait for App Store approvals to work around changes that break the scraping.
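To make the "fetch how-to updates" idea concrete, here is a minimal sketch, assuming the central site serves a small JSON document of extraction rules. The field names (`version`, `title_pattern`), the example markup, and both helper functions are hypothetical, and the network fetch is left out so the snippet stays self-contained.

```python
import json
import re

# Hypothetical rules document, as a central site might serve it.
EXAMPLE_RULES = json.dumps({
    "version": 2,
    "title_pattern": r'<a class="title"[^>]*>(.*?)</a>',
})


def load_rules(text, current_version):
    """Parse a rules document; return None unless it is newer than what we have."""
    rules = json.loads(text)
    if rules["version"] <= current_version:
        return None
    re.compile(rules["title_pattern"])  # fail early on a malformed pattern
    return rules


def extract_titles(html, rules):
    """Apply the currently loaded pattern to a page of HTML."""
    return re.findall(rules["title_pattern"], html)


if __name__ == "__main__":
    rules = load_rules(EXAMPLE_RULES, current_version=1)
    page = '<a class="title" href="/x">Hello</a><a class="title">World</a>'
    print(extract_titles(page, rules))
```

Regexes over HTML are fragile, which is rather the point: when the markup changes, only the rules document needs to change, not the app binary.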
We could call the end user’s program for accessing the site a “user agent” as it acts on behalf of the user to fetch and display the content that the user wants to see, in the manner the user wants to see it.
They will now. Before, people used Reddit's API mostly out of politeness. I'm sure there will soon be plenty of 3p Android apps that scrape, built as labors of love and distributed off GitHub.
It would be cool if the big 3p app developers would make their apps point to a configurable different API location, then people could set up their own docker scraper instances to provide the API for their app.
It could be an open-ish format, then you could potentially support alternate sites like HN as well.
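A rough sketch of what such an open-ish format could look like, assuming each self-hosted backend returns JSON items that the app normalizes into one shared shape. The `Post` type, the adapter classes, and the raw field names are all hypothetical here, not taken from any real app or API.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Post:
    """Common shape the app renders, whatever site it came from."""
    source: str
    title: str
    url: str
    score: int


class Backend(Protocol):
    def to_post(self, raw: dict) -> Post: ...


class RedditLikeBackend:
    """Adapter for a hypothetical scraper instance serving Reddit-style items."""
    def to_post(self, raw: dict) -> Post:
        return Post("reddit", raw["title"], raw["url"], raw.get("score", 0))


class HNLikeBackend:
    """Adapter for a hypothetical instance serving HN-style items."""
    def to_post(self, raw: dict) -> Post:
        return Post("hn", raw["title"], raw.get("url", ""), raw.get("points", 0))


def normalize(backend: Backend, items: list[dict]) -> list[Post]:
    """Convert one backend's raw items into the shared Post format."""
    return [backend.to_post(item) for item in items]


if __name__ == "__main__":
    print(normalize(HNLikeBackend(), [{"title": "Show HN", "points": 12}]))
```

The app would only ever see `Post` objects, so pointing it at a different instance (or a different site) comes down to a configured base URL plus the matching adapter.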
Unfortunately, the App Store (and Play Store) rules ban this sort of thing — any access to third party web services needs to be in line with their terms of service.
That really sucks. Maybe what we need is something more general: a web browser that doesn't enforce CORS / XSS / CSP, so you could frame up an entirely new UI on top of any site you like.
> Or you could even request the HTML and scrape that
That would be such a nightmare between constantly changing UIs, A/B testing, and the fact that new reddit is a broken mess even when running in a normal browser where it's made to run.
I think he was trying to be clever by referring to the standard web interface as the free HTTP API, which is true to a limited extent. It's a poorly-designed and unstable API. The use of the unpopular term "user agent" may have made it harder to understand, but I appreciate almost any opportunity to bring that term back, because it has important implications that "web browser" lacks.
The straight answer is that ads are embedded in the "HTTP API" and the vast majority of browser users don't have adblock, and Reddit can't force ads on any API users (except through the official app).
The exact same point applies to Twitter's third-party API getting killed off, and both moves are still a mystery.
It'd be easy to say "as a condition of getting this API key, you agree to display ad elements as they are served in the feed, and on click, open their associated URL in the system browser". All the ad-targeting is done server-side anyways, and attribution via unique links is easy.
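As a sketch of the "attribution via unique links" part, assuming the server signs each ad link per client with an HMAC so a click can be traced back to the app that displayed it. The secret, the `ads.example.com` endpoint, and the parameter names are made up for illustration; this is not how Reddit actually does it.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"demo-secret"  # hypothetical server-side key, never shipped to clients


def sign(ad_id, client_id):
    """HMAC over the ad and the third-party client it was served to."""
    msg = f"{ad_id}:{client_id}".encode()
    return hmac.new(SECRET, msg, hashlib.sha256).hexdigest()


def make_click_url(ad_id, client_id, base="https://ads.example.com/click"):
    """Build the unique link the API would hand to a third-party app."""
    params = {"ad": ad_id, "client": client_id, "sig": sign(ad_id, client_id)}
    return f"{base}?{urlencode(params)}"


def verify_click(url):
    """Return (ad_id, client_id) if the signature checks out, else None."""
    query = parse_qs(urlparse(url).query)
    try:
        ad_id, client_id, sig = query["ad"][0], query["client"][0], query["sig"][0]
    except KeyError:
        return None
    expected = sign(ad_id, client_id)
    if hmac.compare_digest(sig.encode(), expected.encode()):
        return ad_id, client_id
    return None


if __name__ == "__main__":
    url = make_click_url("ad42", "app7")
    print(url, verify_click(url))
```

This only shows that a link was issued for a given app and not altered afterwards; it says nothing about whether the ad was actually displayed.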
Why not though? Seems really straightforward to serve ads over the API and enforce any display guidelines on third-party apps, since there are only a handful of significant apps anyway.
It would require some very careful system design and lots of trust of the third-party apps.
(It's been a while since I was in the middle of this, so stuff may have changed a bit. But in general...)
Ads are usually not served up by the application provider (Reddit). Instead, they embed URLs given to them by their customers (ad agencies). If the app is browser-based they'll wrap them in some javascript that also calls app provider endpoints and manages clicks on the ads. They do this for a few reasons: 1) There's a huge amount of overhead to the app provider if they try to manage and serve the customer's ad creatives. 2) There's a lot of hassle for the customer; they have to go through the app provider to make any changes to the ad media. 3) Nobody trusts anybody else; this way the customer knows exactly how many times their ad was shown (and if video possibly how long it rolled), and the app provider still knows how many times the ad was displayed vs. just offered to the end user's device, and what the click-throughs were.
The app provider could pass the customer URLs and the provider's wrapping endpoints to third-party apps. But they'd need to think good and hard about all possible fraud games, and would need to trust the third-party app to perform the complex dance properly. Examples:
1) what if the third-party messes up and doesn't call the click-through endpoint? Or sometimes does? Or calls it when they shouldn't? Click-through accounting is a huge deal with very large financial ramifications.
2) How do you enforce that the proper ads are shown in the proper context? If you control the app then you can sell above-the-fold vs. below-the-fold spots etc.
3) How do you control that the ads are actually shown when they should be? Not every link given will result in an impression (below-the-fold again).
4) Even if you completely trust the third-party app's motives, how do you monitor and debug the end-to-end flow?
5) How do you convince your advertisers that everything is under control? A customer is probably going to have fewer warm-and-fuzzies with third-party impressions, and very well might discount their value.
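One small piece of the monitoring problem in the list above can be sketched as a reconciliation pass: compare what the server offered against what a third-party app reported back. This assumes three plain logs of ad IDs; the function name, the problem labels, and the 20% click-through threshold are arbitrary placeholders, and real fraud detection would need far more than this.

```python
from collections import Counter


def reconcile(offered, impressions, clicks, max_ctr=0.2):
    """Flag ads whose reported numbers look inconsistent.

    Each argument is a list of ad IDs, one entry per event:
    offered by the server, reported as shown, reported as clicked.
    """
    off, imp, clk = Counter(offered), Counter(impressions), Counter(clicks)
    problems = []
    for ad in sorted(set(off) | set(imp) | set(clk)):
        if imp[ad] > off[ad]:
            problems.append((ad, "more impressions than offers"))
        if clk[ad] > imp[ad]:
            problems.append((ad, "more clicks than impressions"))
        elif imp[ad] and clk[ad] / imp[ad] > max_ctr:
            problems.append((ad, "click-through rate above threshold"))
    return problems


if __name__ == "__main__":
    offered = ["a"] * 2 + ["b"] + ["c"] * 10
    shown = ["a"] * 3 + ["b"] + ["c"] * 10
    clicked = ["b"] * 2 + ["c"] * 5
    for ad, problem in reconcile(offered, shown, clicked):
        print(ad, problem)
```

Note that this only catches reports that contradict each other. An app that consistently under-reports or never renders the ad at all (point 3) looks clean here, which is the trust problem in a nutshell.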
> Ads are usually not served up by the application provider (Reddit) ...
This is the case for most sites, but not Reddit. Reddit rolled their own system, and if you look at their ads you'll see it's all coming through Reddit.