by David_Axelrod 1102 days ago
Most of the time you don't just expose your internal API routes to the world. You need to write curated public-facing routes that don't include certain schema or records. It takes time to write and maintain that different set of endpoints.

Not really. Your HTML is, quite literally, a degenerate form of an API. The simplest way to offer a content API is to offer an alternative endpoint that serves the same stuff as your normal one, except without all the bullshit (er, beautifully designed, interactive frontend).
Any issue with Reddit offering an API only to not maintain it, or uphold any sort of SLA uptime, or not worry about releasing breaking changes every week?

Having a public facing API is not trivial or cost free.

> Reddit offering an API only to not maintain it

If they were to offer an API that's just HTML of the website (old.reddit.com specifically, not the new one) but without the cruft that makes for 90% of the markup of a human-facing page, and which exists only to hang styles and scripts off... why wouldn't they maintain it? It's literally the same as what the browser gets, but without the bullshit.

> or uphold any sort of SLA uptime

Do they uphold any sort of SLA uptime for the webpage itself?

The simplest API would be just the meat of the website, so it couldn't possibly be less reliable than the site itself.

> not worry about releasing breaking changes every week?

Reddit is a stable site. Like most social media platforms, they don't release breaking changes often (they do screw with DOM element ids and CSS classes all the time, but that is to make life harder for ad blockers, which is another topic). Sure, some things may move around, be added or removed - but this is webshit we're talking about. You can't truly rely on any API to have a stable, or well-defined structure[0] - so people are already used to treating schemas as open-ended[1] and keeping up with their changes.

> Having a public facing API is not trivial or cost free.

Sure. But I'm trying to establish a lower bound here, and it's clear that this takes much less cost and effort than maintaining the human-facing website itself. And I mean, remember the whole "semantic HTML" and "microformats" trends of yore? Or how HTML5 came to be, with all those tags like <em> and <section> and <article>? The whole point of that was to make HTML work as both rendering markup and machine-readable API.
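
To make the "HTML as machine-readable API" point concrete, here is a minimal sketch in Python using only the standard library's html.parser. The page content, the class name ArticleExtractor, and the helper extract_articles are all made up for illustration; this is not how Reddit or any real site structures its markup.

```python
from html.parser import HTMLParser


class ArticleExtractor(HTMLParser):
    """Collect the text of each <article> element, ignoring all other markup."""

    def __init__(self):
        super().__init__()
        self.articles = []  # finished article texts
        self._depth = 0     # > 0 while inside an <article>
        self._chunks = []   # text pieces of the article being read

    def handle_starttag(self, tag, attrs):
        if tag == "article":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "article" and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                # Outermost </article>: store the collected text and reset.
                self.articles.append(" ".join(self._chunks))
                self._chunks = []

    def handle_data(self, data):
        text = data.strip()
        if self._depth > 0 and text:
            self._chunks.append(text)


def extract_articles(html):
    """Return the text content of every top-level <article> in `html`."""
    parser = ArticleExtractor()
    parser.feed(html)
    parser.close()
    return parser.articles


# Hypothetical page: the <div>, <nav> and ad wrappers stand in for the cruft.
PAGE = """
<div class="x1 y2"><nav>Home | Login</nav>
  <article><h2>First post</h2><p>Some <em>important</em> text.</p></article>
  <div class="ad">Buy things</div>
  <article><h2>Second post</h2><p>More text.</p></article>
</div>
"""

print(extract_articles(PAGE))
```

The consumer keys off the semantic tag only, so the class names and wrapper divs can change without breaking it.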

Consider also that the alternative isn't no public API - it's scraping. So if your public API is somehow more expensive to serve or maintain than either the website itself, or a decluttered version of it, then you're doing something wrong.

--

[0] - Don't get me started on the disaster that is Swagger/OpenAPI.

[1] - Something Clojure coding philosophy makes explicit: you pass around maps and arrays, you read and write the keys you know about, and stuff you don't recognize you ignore and pass without changing.
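
A rough Python rendering of that idea, assuming a made-up message shape (the keys id, subject, poll, and read, and the function mark_read, are hypothetical): the function touches only the keys it knows about and hands everything else back unchanged.

```python
def mark_read(message):
    """Return a copy of `message` with `read` set, leaving unknown keys untouched."""
    updated = dict(message)  # shallow copy keeps every key, known or not
    updated["read"] = True
    # Only normalize keys we understand; `subject` is treated as optional.
    if "subject" in updated:
        updated["subject"] = updated["subject"].strip()
    return updated


# `poll` is a field this code has never heard of; it passes through as-is.
incoming = {"id": 7, "subject": "  hello ", "poll": {"options": ["a", "b"]}}
outgoing = mark_read(incoming)
print(outgoing)
```

If the upstream schema grows a new field tomorrow, this code neither breaks nor drops it.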

Don't mind the other commenter, spirited discussion is always welcome! I was thinking about this, and even Hacker News is a good example of how an HTML view can differ from the data model. Hacker News doesn't show you the vote count of every comment despite having that data available. They chose to not even render it into the template, so no scraper will ever have access to that information.
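
As a toy illustration of that gap between view and data model (the Comment fields and the render_comment function are invented here and have nothing to do with how Hacker News is actually implemented), a template that simply never references a field keeps it out of reach of any scraper:

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Comment:
    author: str
    text: str
    score: int  # stored in the data model, but deliberately never rendered


def render_comment(comment):
    """Render a comment to HTML; the template never references `score`."""
    return (
        '<div class="comment">'
        f'<span class="author">{escape(comment.author)}</span>'
        f"<p>{escape(comment.text)}</p>"
        "</div>"
    )


rendered = render_comment(Comment(author="alice", text="nice post", score=42))
print(rendered)
```

A dedicated API would have to make the same choice explicitly, field by field, which is part of the curation work mentioned upthread.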
> Your HTML is, quite literally, a degenerate form of an API.

i look forward to the folks making this argument telling their boss “it’s ok the HTML is an API that’s all we need to provide”

come on, nobody believes this when they’re not trying to win an argument on the internet

Look, if you have a webpage, you're already providing it. If your webpage is of any use to anybody, someone is likely consuming it by scraping. So if your dedicated API is harder and/or more expensive to provide and maintain than the degenerate API of your webpage's HTML, you're doing something wrong.

I'm not trying to win an argument. I'm trying to point to an obvious reference point for cost/effort behind an API that's handling the same data and interactions the webpage does.

These are already written and fully functional, and Reddit has essentially been a non-moving target in terms of features since ~2016, when they added first-party image hosting.
Reddit actually has added a couple of features since then (e.g. polls), but they just didn't update the API to deal with those, so the API still remained completely stable.