Hacker News
by gwerbin 69 days ago
Like it or not I think "agents browsing the web" is the inevitable near-term future. Some agents will be malicious, many will not. In 2036, HN posters will be complaining about how such-and-such site only works with closed proprietary AI agents, and how their creaky old Mac M5 running Gemma 3 under Ollama can't browse the site properly because it doesn't follow the 2029 RFC XYZ for agent compatibility that nobody ever fully implemented.

Sure, let's say I eat up all of that and agree with you: How does this website help/not help? Agents already read HTML perfectly fine. Saying "Well, you don't serve markdown so this obviously is bad for agents, you're only serving HTML" doesn't really feel like it's contributing anything, either to protecting against malicious agents or to the problem of a website only working for some agents but not others.
I'm also not advocating for or against any particular proposal. Maybe the right solution is that agents should have a client-side "reader mode" tool, who knows. What seems inevitable is that people will be using LLM-based agent-things more and more frequently, and there will be some demand for sites to work with them. It might even just come down to providing RSS feeds and public HTTP APIs. Who knows, it's a brave new world.
I'm going to try to figure out how to make my websites as easy as possible to peruse for humans while making it as hard as possible to do the same for agents. There should be some way to make the bots pay a price of admission while keeping it free for people.
This still doesn't really answer my question, though. This is like telling me my old blog posts can't be parsed by your regex.

Like... yeah, no shit; I didn't build it for your regex. It's not the target audience.

Plus, isn't the appeal of LLMs broadly that they can do somewhat-useful things with mostly-arbitrary input (if you ignore the risk of prompt injection)?

> Plus, isn't the appeal of LLMs broadly that they can do somewhat-useful things with mostly-arbitrary input (if you ignore the risk of prompt injection)?

They can definitely read HTML, but they do better with more structure. For example, I proposed in a sibling comment that the "reader mode" feature in browsers might be a great LLM-compatibility feature to cut down on all the HTML token noise. Or a site could expose an HTTP API with an OpenAPI schema, a proper sitemap, and an RSS feed. Fetching from an RSS feed, for instance, can be exposed to the LLM as a "tool" that it can call.
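
To make the "RSS feed as a tool" idea concrete, here's a rough sketch in Python using only the standard library. Everything named here is made up for illustration: `parse_rss_items`, `RSS_TOOL_SPEC`, and the sample feed are hypothetical, and the tool description is just a generic JSON-schema-style dict, not any particular vendor's format. A real tool would fetch the feed over HTTP first; this only shows the parsing step the agent would get back.

```python
import xml.etree.ElementTree as ET

# Hypothetical, generic tool description an agent framework might be given.
# The exact shape depends on the framework; this is only an assumption.
RSS_TOOL_SPEC = {
    "name": "read_rss_feed",
    "description": "Return the most recent items (title and link) from an RSS feed.",
    "parameters": {
        "type": "object",
        "properties": {"limit": {"type": "integer", "minimum": 1}},
    },
}


def parse_rss_items(rss_xml: str, limit: int = 5) -> list[dict]:
    """Extract up to `limit` items (title and link) from RSS 2.0 XML text."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        if len(items) >= limit:
            break
        items.append(
            {
                "title": (item.findtext("title") or "").strip(),
                "link": (item.findtext("link") or "").strip(),
            }
        )
    return items


# Made-up sample feed so the sketch runs without network access.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example blog</title>
    <item><title>First post</title><link>https://example.com/1</link></item>
    <item><title>Second post</title><link>https://example.com/2</link></item>
  </channel>
</rss>"""

if __name__ == "__main__":
    for entry in parse_rss_items(SAMPLE_FEED, limit=2):
        print(entry["title"], entry["link"])
```

The point is that the model gets a short list of titles and links instead of a page full of markup, which is a lot fewer tokens to wade through.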

I don't think it's fair to say that HTML's less structured than Markdown. Markdown is derived from a simplified subset of HTML, and having myself cut my teeth on HTML5 when it was still new, there's been a huge emphasis on the idea of the semantic web conveyed through HTML.
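
As a rough illustration of how much structure semantic HTML already carries, here's a minimal "reader mode"-style sketch in Python with the standard library's `html.parser`. It assumes the page actually uses `<article>` for its main content and tags like `<nav>` and `<aside>` for the rest, which plenty of real sites don't. The names `ArticleTextExtractor`, `extract_article_text`, and the sample page are hypothetical.

```python
from html.parser import HTMLParser


class ArticleTextExtractor(HTMLParser):
    """Collect text found inside <article>, skipping non-content elements."""

    # Elements treated as noise even when nested inside <article> (an assumption).
    SKIP = {"nav", "aside", "footer", "script", "style"}

    def __init__(self) -> None:
        super().__init__()
        self.article_depth = 0  # how many <article> elements we are inside
        self.skip_depth = 0     # how many skipped elements we are inside
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "article":
            self.article_depth += 1
        elif tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "article" and self.article_depth:
            self.article_depth -= 1
        elif tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when inside an article and outside any skipped element.
        if self.article_depth and not self.skip_depth:
            text = data.strip()
            if text:
                self.chunks.append(text)


def extract_article_text(html: str) -> str:
    """Return the article text of `html`, one text chunk per line."""
    parser = ArticleTextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


# Made-up page for demonstration.
SAMPLE_PAGE = """
<body>
  <nav><a href="/">Home</a></nav>
  <article>
    <h1>Title</h1>
    <p>Hello world.</p>
    <aside>Related links</aside>
  </article>
  <footer>Copyright</footer>
</body>
"""

if __name__ == "__main__":
    print(extract_article_text(SAMPLE_PAGE))
```

On the sample page this keeps the heading and paragraph and drops the navigation, the aside, and the footer, purely on the basis of the tags the author already wrote.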