by matheusmoreira 3013 days ago
The page's HTML is the API. It's pretty easy to download a web page, parse the HTML and then extract specific bits of information from it. The browser does the same thing on the user's behalf, which is why it is called the user agent.
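To make the "parse the HTML and extract specific bits" step concrete, here is a rough sketch using only Python's standard-library `html.parser`. The `storylink` class name, the `SAMPLE` markup, and the `extract_links` helper are made up for illustration and are not taken from any real page; a real scraper would download the page first and feed the response body to the same function.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects (href, text) for every <a> whose class list contains 'storylink'.

    'storylink' is a hypothetical class name chosen for this example.
    """

    def __init__(self):
        super().__init__()
        self.links = []
        self._in_link = False
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "a" and "storylink" in classes:
            self._in_link = True
            self._href = attrs.get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside a matching <a> element.
        if self._in_link:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_link:
            self.links.append((self._href, "".join(self._text).strip()))
            self._in_link = False


def extract_links(html):
    """Parse an HTML string and return the matching links in document order."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical markup standing in for a downloaded page.
SAMPLE = (
    '<table><tr><td><a class="storylink" href="https://example.com/a">'
    "First story</a></td></tr>"
    '<tr><td><a href="/other">skip me</a></td></tr></table>'
)

if __name__ == "__main__":
    print(extract_links(SAMPLE))
```

The parser only looks at the elements it cares about and ignores everything else on the page, which is roughly what a browser extension or a small scraper would do.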

An API is a contract. HTML can be tweaked and become incompatible with your parser at the developer's whim.
Oh luckily major APIs never change. /s
Not as easily as an HTML page can.
That just means your code must be maintained. You can verify that the HTML has a given structure and log a failure if it doesn't.
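One way that check might look, as a minimal sketch with the standard library only: before extracting anything, confirm that the elements the scraper depends on are still present, and log an error if they are not. The `structure_ok` helper, the `scraper` logger name, and the `(tag, class)` pairs are assumptions made for this example.

```python
import logging
from html.parser import HTMLParser

logger = logging.getLogger("scraper")  # hypothetical logger name


class _TagClassCollector(HTMLParser):
    """Records every (tag, class) pair that appears in the document."""

    def __init__(self):
        super().__init__()
        self.seen = set()

    def handle_starttag(self, tag, attrs):
        # A class attribute may be missing or valueless, so fall back to "".
        for cls in (dict(attrs).get("class") or "").split():
            self.seen.add((tag, cls))


def structure_ok(html, required):
    """Return True if every (tag, class) pair in `required` occurs in `html`.

    Logs an error listing the missing pairs otherwise, so a layout change
    shows up in the logs instead of silently producing empty results.
    """
    collector = _TagClassCollector()
    collector.feed(html)
    collector.close()
    missing = [pair for pair in required if pair not in collector.seen]
    if missing:
        logger.error("page structure changed; missing elements: %s", missing)
        return False
    return True


if __name__ == "__main__":
    page = '<a class="storylink" href="https://example.com/a">First story</a>'
    print(structure_ok(page, [("a", "storylink")]))
    print(structure_ok(page, [("span", "score")]))
```

This does not stop the page from changing, it only turns a silent breakage into a visible one, which is the maintenance burden being described.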
Use Deep Learning to circumvent that.