interesting related aside: I'm comparing HTML/hypermedia with MCP as an agentic protocol, and adding accessibility information made HTML-based APIs much easier for some agents to use
right now the primary problem for hypermedia in agentic situations is the chattiness of the architecture, coupled with the geometrically expanding conversation dynamic of ReAct-style loops
some models are able to figure out hypermedia-based APIs more easily than MCP, which is very particular in its syntax, but for more advanced models MCP wins based on the "show me everything at once" model
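(A rough sketch of the chattiness point, for illustration only. It assumes a ReAct-style loop that re-sends the full conversation history on every model call; the function name and all token counts are hypothetical, not measurements from either protocol.)

```python
def cumulative_prompt_tokens(steps: int, base_tokens: int, tokens_per_step: int) -> int:
    """Total prompt tokens sent over a loop that re-sends the whole history each step.

    All numbers passed in are made-up assumptions, not benchmarks.
    """
    total = 0
    context = base_tokens
    for _ in range(steps):
        total += context            # the full history goes out again on this call
        context += tokens_per_step  # the new action + observation are appended
    return total


# Hypothetical hypermedia traversal: small starting prompt, five link-following hops.
hypermedia_total = cumulative_prompt_tokens(steps=5, base_tokens=500, tokens_per_step=500)

# Hypothetical "show me everything at once" call: bigger up-front listing, one step.
one_shot_total = cumulative_prompt_tokens(steps=1, base_tokens=2000, tokens_per_step=500)

print(hypermedia_total, one_shot_total)
```

Under these assumptions the total grows roughly with the square of the number of hops, so a larger up-front description can still come out cheaper than several small round trips.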
That makes sense. That “show everything at once” approach probably reduces some of the back-and-forth that hypermedia workflows rely on.
It’s interesting that some models can infer structure from hypermedia more easily. That seems like another place where semantic structure ends up helping both humans and machines interpret an interface. NICE!
One thing HTML has going for it is that accessibility info (semantics, ARIA roles, structure, etc.) is embedded.
Are you finding that agents can make use of that directly, or are you adding more accessibility metadata on top?
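(To make the "embedded accessibility info" idea concrete, here is a minimal sketch of how an agent-side tool might pull roles and labels out of an HTML response. It uses only Python's standard `html.parser`; the `A11yCollector` class, the `collect_affordances` helper, and the sample markup are hypothetical and not taken from anyone's actual setup.)

```python
from html.parser import HTMLParser


class A11yCollector(HTMLParser):
    """Collects interactive elements plus any ARIA role/label attached to them."""

    # Tags treated as "things an agent could act on" (an assumption for this sketch).
    INTERACTIVE = {"a", "button", "form", "input", "select", "textarea"}

    def __init__(self):
        super().__init__()
        self.affordances = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag in self.INTERACTIVE or "role" in attr:
            self.affordances.append({
                "tag": tag,
                "role": attr.get("role"),
                "label": attr.get("aria-label"),
                "target": attr.get("href") or attr.get("action"),
            })


def collect_affordances(html: str) -> list[dict]:
    parser = A11yCollector()
    parser.feed(html)
    parser.close()
    return parser.affordances


# Hypothetical response body from an HTML-based API.
SAMPLE = (
    '<nav role="navigation" aria-label="Orders">'
    '<a href="/orders/42" aria-label="View order 42">42</a></nav>'
    '<form action="/orders" aria-label="Create order"><button>Submit</button></form>'
)

for item in collect_affordances(SAMPLE):
    print(item)
```

A summary like this could be handed to the model instead of raw markup, which is one way the existing semantics might be used directly without adding extra metadata.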