Hacker News
by hacker_9 2818 days ago
You just need to watch the HTTP traffic in Fiddler as the page loads, find the request that returns the data you want, and then replicate that request in your application.

I just took a look at SoundCloud and was able to get the data within a minute, as it's a basic setup. They use a JSON structure (most websites do) [1], and the data is fetched with a simple GET request.
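
To make the "clone the request" step concrete, here is a minimal sketch using only the Python standard library. The URL, query parameters, and headers are placeholders I made up (they are not SoundCloud's real endpoint); you would copy the real values from whatever request you captured in Fiddler.

```python
import json
import urllib.parse
import urllib.request


def build_request(base_url, params=None, headers=None):
    """Rebuild a captured GET request from its URL, query string, and headers."""
    query = urllib.parse.urlencode(params or {})
    url = f"{base_url}?{query}" if query else base_url
    return urllib.request.Request(url, headers=headers or {}, method="GET")


def fetch_json(request, timeout=10):
    """Send the request and decode the response body as JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical endpoint and values, standing in for the captured request.
example = build_request(
    "https://api.example.com/tracks",
    params={"q": "ambient", "limit": 10},
    headers={"Accept": "application/json"},
)
print(example.full_url)
```

Calling fetch_json(example) would then perform the actual request. Some sites also check headers such as User-Agent or Referer, so copying those from the capture may be necessary.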

If the website requires authentication, you just need to clone the login requests too; you'll get some sort of cookie or session ID back, which you attach to all future requests.
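
A rough sketch of the authenticated case, again with the standard library only. It assumes the site uses a plain form POST for login and a session cookie afterwards; the login URL and the field names ("username", "password") are hypothetical, and real sites often add CSRF tokens or other fields you would have to copy from the captured request.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_cookie_opener():
    """Create an opener that stores cookies and resends them automatically."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def build_login_request(login_url, credentials):
    """Rebuild the captured login POST as a form-encoded request."""
    body = urllib.parse.urlencode(credentials).encode("utf-8")
    return urllib.request.Request(login_url, data=body, method="POST")


def login_and_fetch(opener, login_request, data_url, timeout=10):
    """Log in first, then fetch data_url with the session cookie attached."""
    with opener.open(login_request, timeout=timeout):
        pass  # Any Set-Cookie headers in the response are now in the jar.
    with opener.open(data_url, timeout=timeout) as response:
        return response.read()
```

The point of the cookie jar is that you do not attach the session ID by hand: every later call through the same opener sends it back.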

As a bonus: you can then paste the JSON into an online code converter, which will generate a type definition from the structure (useful if you are using a statically typed language).
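
For illustration, this is roughly what such a converter produces, written by hand as a Python dataclass with type hints. The JSON shape and field names ("collection", "id", "title", "duration") are invented for the example and are not taken from the screenshot in [1].

```python
import json
from dataclasses import dataclass


@dataclass
class Track:
    id: int
    title: str
    duration_ms: int

    @classmethod
    def from_dict(cls, raw):
        """Map one raw JSON object onto the typed structure."""
        return cls(
            id=int(raw["id"]),
            title=str(raw["title"]),
            duration_ms=int(raw["duration"]),
        )


def parse_tracks(payload):
    """Parse a JSON string shaped like {"collection": [...]} into Track objects."""
    return [Track.from_dict(item) for item in json.loads(payload)["collection"]]
```

A missing or renamed field then fails loudly at parse time with a KeyError instead of surfacing later as a silent None.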

[1] https://imgur.com/a/Einoxdg

1 comment

This is so 2010... websites now use a mix of server-side and client-side rendering. You can't just watch HTTP requests and figure out the "API" specification. Just try to scrape AdWords, Gmail, Amazon, etc. Consider that they may also use anti-scraping code on the client.