That's pretty much what I was thinking too. "A way to create an image, with parts that can be edited using tags, that an LLM can deal with... this sounds like SVG"
In case you missed it: I'm not showing regular SVG. I'm showing HTML used as an image, by wrapping it in an SVG container. This has worked fine in browsers for more than ten years (around twenty, other than IE/Edge).
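For anyone who wants to try it, here is a rough sketch of the idea. It assumes the mechanism in question is SVG's `<foreignObject>` element, which is the standard way to embed XHTML inside SVG. The helper name `html_as_svg` and the default dimensions are made up for illustration. The embedded markup has to be well-formed XHTML (closed tags, escaped `&`), because the whole file is parsed as XML.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

SVG_NS = "http://www.w3.org/2000/svg"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def html_as_svg(xhtml_fragment: str, width: int = 400, height: int = 200) -> str:
    """Wrap a well-formed XHTML fragment in an SVG <foreignObject>.

    Hypothetical helper for illustration only. The fragment is inserted
    as-is, so the caller is responsible for escaping any text content.
    """
    return (
        f'<svg xmlns="{SVG_NS}" width="{width}" height="{height}">'
        f'<foreignObject x="0" y="0" width="100%" height="100%">'
        # The XHTML namespace on the wrapper div is what makes the
        # browser treat the children as HTML rather than unknown SVG.
        f'<div xmlns="{XHTML_NS}">{xhtml_fragment}</div>'
        f"</foreignObject></svg>"
    )


# Text content is escaped so that "&" does not break XML parsing.
body = '<p style="font: 16px sans-serif">Hello, <b>' + escape("HTML & SVG") + "</b></p>"
svg = html_as_svg(body)

# Sanity check: raises xml.etree.ElementTree.ParseError if not well-formed.
ET.fromstring(svg)
```

Saving `svg` to a `.svg` file and opening it directly in a browser should render the paragraph as HTML. Browsers apply restrictions when such a file is loaded through an `<img>` tag (no scripts and no external resources, for example), so styles are best kept inline.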