by nlarew 1107 days ago
There are two Reddit APIs - the public REST API and a private, access-restricted GraphQL API. Third-party apps use the REST API, and the Reddit website and app use GQL.

For a hobby project you could maybe get away with scraping a GQL bearer token and issuing requests as if you were an official Reddit client. Or you could even request the HTML and scrape that. At the scale of these third party apps that approach just wouldn't work.

4 comments

And let me just say, reverse engineering the GQL API is a massive pain. From my tests they verify a ton of things or else you get rate limited or "trusted less" - order of headers, formatting of JSON being posted, someone even said they check the TLS handshake signatures. GQL specifically has become a massive burden for me working on Libreddit, and I probably am going to give up on GraphQL and just have instances provide their own API token. The good news is there might be ways to auto-retrieve these tokens. We'll have to see in 17 days.
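To illustrate why "formatting of JSON being posted" is something a server could check at all, here is a minimal sketch showing that one logical payload can serialize to different bytes. The payload fields are hypothetical and say nothing about what Reddit actually inspects; that part is only the commenter's observation.

```python
import json

# Hypothetical request payload; the field names are invented for illustration.
payload = {"id": "abc123", "variables": {"limit": 25}}

# Python's default separators put a space after ',' and ':'.
default_body = json.dumps(payload)

# Compact separators, the style many JavaScript clients emit via JSON.stringify.
compact_body = json.dumps(payload, separators=(",", ":"))

# Both bodies decode to the same object, but the raw bytes differ,
# so a server comparing raw request bodies could tell the two clients apart.
same_meaning = json.loads(default_body) == json.loads(compact_body)
same_bytes = default_body == compact_body
```

Here `same_meaning` is true while `same_bytes` is false, which is the whole point: a client can be semantically correct and still distinguishable.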
> At the scale of these third party apps that approach just wouldn't work.

Have there been any App Store apps that take this approach?

As long as Reddit provides some API accessible to a non-logged-in user on the web, there’s going to be a way to scrape it. If you push that scraping into the start of the app, then it’d be distributed across users without any clear way of blocking it.

You could even have the app fetch “how to” updates from a central site rather than pushing app updates, so you don’t have to wait for App Store approvals when the scraping logic needs to change.
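A rough sketch of what those "how to" updates might look like, assuming the central site serves a small JSON document mapping field names to regular expressions. The rule format, the field names, and the sample HTML are all made up for illustration, and the download itself is simulated with a local string.

```python
import json
import re

def load_rules(raw: str) -> dict:
    """Parse a downloaded rule document into compiled patterns.

    Assumed (hypothetical) format: {"version": int, "fields": {name: regex}},
    where each regex has exactly one capture group.
    """
    rules = json.loads(raw)
    return {name: re.compile(p, re.S) for name, p in rules["fields"].items()}

def extract(html: str, compiled: dict) -> dict:
    """Apply each rule to the page; a field is None when its rule no longer matches."""
    out = {}
    for name, pattern in compiled.items():
        match = pattern.search(html)
        out[name] = match.group(1) if match else None
    return out

# Stand-in for the document the app would fetch from the central site.
downloaded = json.dumps({
    "version": 2,
    "fields": {
        "title": r"<h1[^>]*>(.*?)</h1>",
        "score": r'data-score="(\d+)"',
    },
})

sample_html = '<div data-score="42"><h1 class="t">Hello</h1></div>'
result = extract(sample_html, load_rules(downloaded))
```

When the site's markup changes, only the rule document needs to be republished; the app binary stays the same.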

We could call the end user’s program for accessing the site a “user agent” as it acts on behalf of the user to fetch and display the content that the user wants to see, in the manner the user wants to see it.

They will now. Before, people were just being polite by using Reddit's API. I'm sure there will soon be plenty of third-party Android apps on GitHub, built as labors of love, that scrape instead.
It would be cool if the big 3p app developers would make their apps point to a configurable different API location, then people could set up their own docker scraper instances to provide the API for their app.

It could be an open-ish format, then you could potentially support alternate sites like HN as well.
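A minimal sketch of the configurable-endpoint idea, assuming a made-up URL layout (`communities/<name>/posts`). No such shared format exists as far as this thread says; the class and path names are hypothetical.

```python
from dataclasses import dataclass
from urllib.parse import quote, urlencode, urljoin

@dataclass
class FeedClient:
    # User-configurable API location, e.g. a self-hosted scraper instance.
    base_url: str

    def listing_url(self, community: str, limit: int = 25) -> str:
        """Build the URL for a community's post listing (hypothetical path layout)."""
        # A trailing slash makes urljoin append to the base path instead of replacing it.
        base = self.base_url.rstrip("/") + "/"
        path = f"communities/{quote(community, safe='')}/posts"
        return urljoin(base, path) + "?" + urlencode({"limit": limit})

# The same client code can point at different backends just by changing base_url.
self_hosted = FeedClient("http://localhost:8080/api")
other_site = FeedClient("https://example.org/")

self_hosted_url = self_hosted.listing_url("python", limit=10)
other_site_url = other_site.listing_url("show hn")
```

Because the app only knows the base URL and the shared path layout, a backend for a different site would only need to serve the same shape of response.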

Unfortunately, the App Store (and Play Store) rules ban this sort of thing — any access to third party web services needs to be in line with their terms of service.
That really sucks. Maybe what we need is something more general: a web browser that doesn't enforce CORS / XSS protections / CSP, so you could frame up an entirely new UI on any site you'd like.
Sounds like you're pretty much describing the original intent of HTML.
> Or you could even request the HTML and scrape that

That would be such a nightmare between constantly changing UIs, A/B testing, and the fact that new reddit is a broken mess even when running in a normal browser where it's made to run.

Too bad Reddit! No API means scraping at scale! That's fine with me too.