Hacker News
by 1vuio0pswjnm7 2054 days ago
This story clearly illustrates the purpose behind "web APIs". To limit access.

As a user (not a web developer), I personally never saw the practical point of web APIs; I have always just "scraped the HTML". Many times the solutions I write outlive the corresponding "API"; IME, often the non-API method of data retrieval is more robust and reliable than using the so-called API.

YouTube used to have a freely accessible search API. Not anymore. However, "scraping" the YT search result pages continues to work fine.


The purpose of APIs is to provide a “uniform” interface. HTML layouts can change. JavaScript could be added to download the images after page load. An API shouldn’t change as often. And if the API is “versioned”, you can usually use the old version (old HTML layout) for a while before you upgrade (compared to your tool breaking as soon as the HTML changes).
The purpose of an API is like a company mission statement: There's one version written on the wall and then there's the actual version everybody knows is true but they don't say it out loud.

You described the written one above. GP described the actual one.

That is the promise of APIs, but there is no guarantee.
In theory, yes. In practice, no.
> YouTube used to have a freely accessible search API.

Twitter used to have RSS/Atom feeds for each account so you could follow someone without a client, just a regular old news aggregator.

AFAIK they still do, just without any way to find it except by digging into the channel page’s HTML to find the channel_id and then constructing the feed URL from that (“https://www.youtube.com/feeds/videos.xml?channel_id=$channel...) — or (edit) using something like https://github.com/rss-bridge/rss-bridge that presumably does something like that under the hood — so I guess scraping for an undocumented API.
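
For anyone who wants to try the approach described above, here is a rough sketch in Python. The function names are made up for illustration, and the regex is only a guess at where a channel ID might show up in the page HTML (a `channel_id=` query string, a `/channel/` path, or a `"channelId"` JSON key); it also assumes IDs look like `UC` followed by 22 URL-safe characters. Any of that may not hold for a given page.

```python
import re
from urllib.parse import urlencode

FEED_BASE = "https://www.youtube.com/feeds/videos.xml"

# Assumption: the ID appears after one of these markers and has the
# shape "UC" + 22 characters from [0-9A-Za-z_-].
CHANNEL_ID_RE = re.compile(
    r'(?:channel_id=|/channel/|"channelId"\s*:\s*")(UC[0-9A-Za-z_-]{22})'
)


def extract_channel_id(html):
    """Return the first channel ID found in the HTML, or None."""
    match = CHANNEL_ID_RE.search(html)
    return match.group(1) if match else None


def feed_url(channel_id):
    """Build the feed URL for a channel ID."""
    return FEED_BASE + "?" + urlencode({"channel_id": channel_id})
```

With the channel page saved to a string `page_html`, `feed_url(extract_channel_id(page_html))` would give the URL to hand to a feed reader, provided an ID was actually found.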
I have used that /feeds/videos.xml URL for retrieving a list of all videos in a playlist (playlist_id), but I have not used it for channels. What is the maximum number of results that one can retrieve? Without any additional parameters, it looks like it only returns 15 videos by default.
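
A minimal sketch of the playlist variant, using only the Python standard library. It assumes the feed is standard Atom (entries with a `title` and a `link` carrying an `href`); the function names are hypothetical, and it does nothing about the apparent 15-entry default.

```python
import urllib.request
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

FEED_BASE = "https://www.youtube.com/feeds/videos.xml"
ATOM = {"atom": "http://www.w3.org/2005/Atom"}  # standard Atom namespace


def playlist_feed_url(playlist_id):
    """Build the feed URL for a playlist ID."""
    return FEED_BASE + "?" + urlencode({"playlist_id": playlist_id})


def parse_entries(xml_text):
    """Return a list of (title, link) tuples, one per Atom entry."""
    root = ET.fromstring(xml_text)
    entries = []
    for entry in root.findall("atom:entry", ATOM):
        title = entry.findtext("atom:title", default="", namespaces=ATOM)
        link = entry.find("atom:link", ATOM)
        href = link.get("href", "") if link is not None else ""
        entries.append((title, href))
    return entries


def fetch_entries(playlist_id):
    """Download the feed and parse it (needs network access)."""
    with urllib.request.urlopen(playlist_feed_url(playlist_id)) as resp:
        return parse_entries(resp.read())
```

Printing one `title` and `href` per line from `fetch_entries(...)` gives output that is easy to pipe into other command-line tools.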

Personally, for channels, I use a script that only needs to access the channel's page; it outputs a list of all the videos in the channel. One of the more recent web development trends I dislike is sites that "load more results" using additional JavaScript-triggered HTTP requests in response to scrolling a page. YouTube channels with multiple pages of videos are one example.

With custom scripts I wrote for searching YouTube, outputting lists of videos from channels, and downloading non-commercial videos, I can use YouTube without the need for a graphical browser.

You are talking about YouTube; I am talking about Twitter.
Oops, my bad. https://nitter.net/ is a nice alternative Twitter frontend with RSS feeds btw.
https://aur.archlinux.org/packages/nitter-git/

It's even on AUR, for easy usage.