| HN Mirror

by terom 1303 days ago

Disregarding the emphasis on XML -> replaced with JSON, isn't this actually mostly spot-on re the popularity of Single-Page Apps (SPA)? There is very little if any HTML in the form of the text-based markup involved in a modern SPA - the actual display logic is all DOM manipulation in JavaScript, operating on separate APIs providing just the raw data in JSON form.

> Truly powerful applications can be built using combinations of JavaScript and XML. Not only can the data and its format (XML) be shipped to another system, but the associated processing logic to validate data entered into the record or form can be shipped along as JavaScript as well.

The main thrust in this article is that scraping HTML display markup is a terrible form of data interchange between systems. Think in terms of modern banking systems that are based on screen-scraping terminal UIs of that era.
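To make the contrast concrete, here is a minimal Python sketch of the same figure obtained two ways: scraped out of display markup, and read from a data response. The page snippet, the `balance` class name, and the JSON payload are all invented for illustration and do not come from the article.

```python
import json
from html.parser import HTMLParser

# Hypothetical display markup; the class name "balance" is an assumption.
PAGE = '<div class="acct"><span class="balance">$1,234.56</span></div>'

# Hypothetical API response carrying the same figure as raw data.
API_RESPONSE = '{"balance": 1234.56, "currency": "USD"}'


class BalanceScraper(HTMLParser):
    """Collects the text inside <span class="balance">."""

    def __init__(self):
        super().__init__()
        self._in_balance = False
        self.text = None

    def handle_starttag(self, tag, attrs):
        if tag == "span" and ("class", "balance") in attrs:
            self._in_balance = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_balance = False

    def handle_data(self, data):
        if self._in_balance:
            self.text = data


def scrape_balance(page):
    """Return the balance from display markup, or None if the markup changed."""
    scraper = BalanceScraper()
    scraper.feed(page)
    if scraper.text is None:
        return None
    # Undo the display formatting (currency sign, thousands separator).
    return float(scraper.text.replace("$", "").replace(",", ""))


def api_balance(body):
    """Return the balance from a JSON body; no display formatting to undo."""
    return json.loads(body)["balance"]
```

The scraper depends on the class name and on the currency formatting, so a cosmetic redesign (say, renaming the class) makes it return `None`, while the JSON consumer is unaffected by how the page looks.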

5 comments

lucideer 1303 days ago

> isn't this actually mostly spot-on

Only if you accept the initial premise, which is nonsense. The article is based on the idea of HTML as an interchange format, something that's only the case in dysfunctional situations (scraping data illegally, or a horrific breakdown in communication/collaboration between two business entities).

Sure, there was for a time a big focus on XHTML as a "hybrid" format - interweaving Microformats/RDF & other machine-readable metadata into display documents to give them a dual purpose, but even with that, the primary purpose was always human display, not machine-readability.

HTML isn't designed as, nor primarily intended to be, a machine interchange format. And espousing such hyperbole as "HTML is a strategic dead end" based on it not meeting a use-case for which it was never designed, is harmful.

> The main thrust in this article is that scraping HTML display markup is a terrible form of data interchange between systems.

This is a perfect summary. Would it be nice if more HTML websites were more machine-readable - sure. For me, a hacker, it would make life nicer. But should it be a pre-requisite for business transactions & e-commerce to function - absolutely not.

HillRat 1303 days ago

This article predates XHTML, and was almost certainly not informed by the then-current work on RDF/Dublin Core. At the time, terminal screen-scraping was ... not an uncommon method of data interchange between systems (hence the context of 3270 terminals); using BMS maps was an improvement over the naive approach, basically letting you hook into the screen formatter. The article's point is that these approaches were legacy baggage, and the attempt to "modernize" BMS maps by letting them output HTML in addition to green-screen was doomed to fail. Instead, Duquaine advocates where we landed, with SOAP (and successor) services making data available instead of forcing integrations to go through human-readable display functions. (It's probably worth noting that this is what he focused on at Sybase, specifically his work on their RPC gateway that hooked into legacy mainframe transports such as CICS.)
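For anyone who hasn't seen terminal screen-scraping, a rough Python sketch of the naive approach follows. The screen contents and field positions are made up, and real 3270 data streams and BMS maps are considerably more involved than a list of 80-column strings.

```python
# A made-up 80-column screen; this only illustrates position-based scraping.
SCREEN = [
    "ACCOUNT INQUIRY".ljust(80),
    "ACCT NO: 00123456    NAME: J SMITH".ljust(80),
    "BALANCE:     1234.56".ljust(80),
]

# (row, start column, end column) per field, hard-coded to the layout above.
# These names and offsets are hypothetical.
FIELDS = {
    "account": (1, 9, 17),
    "name": (1, 27, 40),
    "balance": (2, 9, 20),
}


def scrape_screen(screen, fields):
    """Cut each field out of the screen text by its fixed position."""
    return {
        name: screen[row][start:end].strip()
        for name, (row, start, end) in fields.items()
    }
```

Every consumer has to carry a copy of the layout, so moving a label by one column silently corrupts the extracted values. That coupling between display and data is the legacy baggage in question.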

Sacho 1303 days ago

This article was written at a time when the idea of a semantic web was still bright and strong. Scrapable HTML websites would have been at the forefront of interchange ideas then.

DonHopkins 1303 days ago

He didn't realize that gardens need walls, and HTML is the perfect building block for walls.

commandlinefan 1303 days ago

> the initial premise, which is nonsense.

The initial premise is: "HTML is a strategic dead end _for business transactions and e-commerce_". That premise is absolutely spot-on.

lucideer 1303 days ago

> That premise is absolutely spot-on.

So... if you feel like it, we can split hairs and say that HTML is a dead end as a data interchange format, in the same way that orange juice is a dead end as motor fuel. That's not what's really being discussed here though.

The pertinent quote in the article is:

> HTML is ultimately as strategically dead as 3270 is. HTML suffers from the same ultimate fatal weaknesses that doomed 3270

The implication is that HTML is a dead end in general because it doesn't act as an interchange format, not that it's specifically & narrowly a dead end within that use-case.

As for "business transactions & ecommerce", that's a vaguer phrase. If you mean using HTML to exchange transaction data between business application APIs, then of course it's not appropriate. If you mean using HTML to provide human interfaces to ecommerce & business transactions, then that's a different debate (not at all touched on by this article).

naasking 1303 days ago

> The article is based on the idea of HTML as an interchange format, something that's only the case in dysfunctional situations (scraping data illegally, or a horrific breakdown in communication/collaboration between two business entities).

Think again:

https://schema.org/docs/gs.html
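That guide covers annotating ordinary HTML with microdata attributes such as `itemscope`, `itemtype`, and `itemprop`. A rough Python sketch of reading such annotations follows. The markup is invented for illustration, and this toy reader ignores nesting, `itemscope` boundaries, and `content` attributes, all of which real microdata processing has to handle.

```python
from html.parser import HTMLParser

# Made-up page fragment annotated with schema.org microdata.
SNIPPET = (
    '<div itemscope itemtype="https://schema.org/Movie">'
    '<h1 itemprop="name">Example Film</h1>'
    '<span itemprop="genre">Documentary</span>'
    '</div>'
)


class ItempropReader(HTMLParser):
    """Maps each itemprop name to the text of the element carrying it."""

    def __init__(self):
        super().__init__()
        self._current = None
        self.props = {}

    def handle_starttag(self, tag, attrs):
        # Remember the itemprop of the element just opened (None if absent).
        self._current = dict(attrs).get("itemprop")

    def handle_endtag(self, tag):
        self._current = None

    def handle_data(self, data):
        if self._current and data.strip():
            self.props[self._current] = data.strip()


def read_itemprops(html):
    reader = ItempropReader()
    reader.feed(html)
    return reader.props
```

Here the consumer keys on property names rather than on layout, so the same document serves both the human reader and the machine.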

lucideer 1303 days ago

> Think again

I mentioned microformats & RDF in my comment...

hodgesrm 1303 days ago

> The main thrust in this article is that scraping HTML display markup is a terrible form of data interchange between systems.

Agree. Many of the comments seem to come from reading "Why HTML Is a Strategic Dead End" leaving off "for Business Transactions and E-Commerce." Screen scraping 3270 protocol blocks was a thing for integrating IBM applications (and not a good thing IMO).

I don't think Wayne was really focused on behavior outside of using HTML for integration purposes. This is partly from reading the article. It's also from knowing Wayne Duquaine, the author. We met in the mid-1990s at Sybase, where he made all the IBM system integrations work. Wayne viewed most problems through a mainframe lens. His note seems pretty typical of his point of view.

lucideer 1303 days ago

This adds great context to the article - thanks.

> I don't think Wayne was really focused on behavior outside of using HTML for integration purposes.

This makes me more curious about his work now - I'll have to look into it more. It sounds like he was needing to do something he really shouldn't've needed to do, but perhaps I'm (again) missing context.

miohtama 1303 days ago

> The main thrust in this article is that scraping HTML display markup is a terrible form of data interchange between systems. Think in terms of modern banking systems that are based on screen-scraping terminal UIs of that era.

As a sidenote, this is intentional for banks. They avoid providing APIs or anything similar so that they can lock customers into their own platform. This is because competition is toxic for companies that do not invest in R&D. Sometimes user-hostile banks do not even provide CSV downloads for your transactions.

The situation is so bad that the EU rolled out a regulation called PSD2 to address it. By law, you must be able to access your bank account via an API. However, the bureaucrats made this extra political and complicated: you need to sign up with a third-party company and request access to your own bank account through it. The solution, PSD2, has not been the great success that takes down Visa/Mastercard/etc. as its proponents claimed.

hinkley 1303 days ago

JSON didn’t exist when this article was written. Crockford did his prototype the following year.

The history of science and industry includes a bunch of people seeing the same set of problems and answering them in their own way. The scene is set, the tools are there, someone just needs to articulate it and offer a solution.

wolfprogramming 1303 days ago

SPAs don't solve data exchange problems. JSON and XML APIs do.

You can have a templated HTML multipage site that still has a JSON or XML API for other UIs that need it, like Android or iOS. SPAs just try to morph HTML into some kind of more responsive dynamic app, at the cost of complexity and initial load times.
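A minimal Python sketch of that arrangement, assuming a made-up `ORDERS` list as the shared data source: one function renders it as a server-side HTML page, another exposes the same records as JSON for other clients. The function names and the data are hypothetical.

```python
import json
from html import escape

# Hypothetical records shared by both representations.
ORDERS = [
    {"id": 1, "item": "Widget", "qty": 2},
    {"id": 2, "item": "Gadget <XL>", "qty": 1},
]


def render_html(orders):
    """Server-side template for browsers: display markup, escaped for safety."""
    rows = "".join(
        f"<tr><td>{o['id']}</td><td>{escape(o['item'])}</td><td>{o['qty']}</td></tr>"
        for o in orders
    )
    return f"<table>{rows}</table>"


def render_json(orders):
    """Data endpoint for non-browser clients: the raw records, no markup."""
    return json.dumps({"orders": orders})
```

Neither function depends on the other, so the page can be restyled without touching the clients that consume the JSON.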
