| > isn't this actually mostly spot-on Only if you accept the initial premise, which is nonsense. The article is based on the idea of HTML as an interchange format, something that's only the case in dysfunctional situations (scraping data illegally, or a horrific breakdown in communication/collaboration between two business entities). Sure, there was for a time a big focus on XHTML as a "hybrid" format, interweaving Microformats/RDF & other machine-readable metadata into display documents to give them a dual purpose, but even then the primary purpose was always human display, not machine-readability. HTML isn't designed as, nor primarily intended to be, a machine interchange format. And espousing hyperbole such as "HTML is a strategic dead end", based on its failure to meet a use case it was never designed for, is harmful. > The main thrust in this article is that scraping HTML display markup is a terrible form of data interchange between systems. This is a perfect summary. Would it be nice if more HTML websites were machine-readable? Sure. For me, a hacker, it would make life nicer. But should it be a prerequisite for business transactions & e-commerce to function? Absolutely not. |