Hacker News
Wrong on the Internet (tbray.org)
14 points by ksetyadi 5314 days ago
3 comments

JSON errors are worked around by developers; we don't display an error and kill everything if it isn't formatted right. Plus, it's a lot simpler.

XHTML/XML errors are displayed to the user. The idea of "being strict" is fine for development but bad for release.

For XML it makes sense, since it's a data format, not a document format. XHTML is not a data format, it's displayed to the user. Which is bad.

Lastly, WTF ABC? Really, guys, serving XHTML based on UA? What the hell went through your heads that made that seem like a good idea?

  > For XML it makes sense, since it's a data format, not a
  > document format. XHTML is not a data format, it's
  > displayed to the user. Which is bad.
All documents are data. All document formats are data formats.

Most formats use strict error handling; SGML's lenient handling is actually very rare. If you removed random sections from a Word document, or a PDF, or an eBook, that document would then be un-renderable. I have never heard it claimed that .pdf is unsuitable for display to users because it is not resistant to arbitrary corruption, and it's silly to make this claim for HTML/XHTML.
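To make the strict-versus-lenient distinction concrete, here is a minimal Python sketch using only the standard library. The sample markup and the helper names (`strict_parse_ok`, `lenient_start_tags`) are made up for illustration. `xml.etree.ElementTree` stands in for a strict XML/XHTML parser and `html.parser` for a forgiving HTML one; neither is what a browser actually runs.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical sample: the <b> element is never closed.
BROKEN = "<p>Hello <b>world</p>"
FIXED = "<p>Hello <b>world</b></p>"


def strict_parse_ok(markup: str) -> bool:
    """Return True only if the markup is well-formed XML."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        # A strict parser rejects the whole document on the first error.
        return False


class _StartTagCollector(HTMLParser):
    """Records every start tag the lenient parser reports."""

    def __init__(self) -> None:
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def lenient_start_tags(markup: str) -> list:
    """Feed markup to a forgiving HTML parser and list the tags it saw."""
    collector = _StartTagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags
```

The strict path refuses `BROKEN` outright, while the lenient path reports the `p` and `b` tags and carries on, which is roughly the behavior difference being argued about here.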

20 years ago, HTML was written by hand, in plain text editors. There was absolutely no validation performed between the author's keyboard and the user's web browser. This model is no longer practical for modern web sites, but some developers have refused to alter their behavior.

XHTML requires authors to perform a minimum of validation before sending markup to the browser. The holdouts are used to just writing up a website and clicking "save", so they complain bitterly. But anybody who uses a template engine will not care, because they are used to the idea that what they write is not identical to what the browser receives. To these developers, supporting XHTML is a simple matter of changing a setting in their template library.

First of all, documents are not data. If you think it's data, you're sorely mistaken. HTML was designed as a way of MARKING UP text, not for making a DOM tree.

The DOM is simply a way of representing HTML for scripting access and whatnot.

Second, the point is not to be resistant to corruption; it's to be resistant to human error. Also, many formats ARE resistant to corruption, thanks to various error correction algorithms.

Have you ever heard the story behind kernel panics? Errors should not be shown to end users when they are recoverable. Errors happen every second; if we let them all out, computers would be unusable.

Thirdly, that "XHTML is easy" assumes that everybody has embraced HAML or Jade or other sugary templates. I personally find they all suck horribly, so I use Handlebars (a Mustache variant) and write the HTML myself.

  > Thirdly, that "XHTML is easy" assumes that everybody has
  > embraced HAML or Jade or other sugary templates. I
  > personally find they all suck horribly, so I use
  > Handlebars (a Mustache variant) and write the HTML
  > myself.
You don't have to use weird template languages; standard markup is fine, assuming your template library supports it. For example, my site's templates are written in XHTML 1.1, which is automatically converted to HTML4 for older browsers.
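A rough sketch of what such a conversion could look like, in Python with only the standard library. This is not the actual code behind the site described above; the function name `xhtml_to_html` and the sample fragment are assumptions for illustration. It relies on `ElementTree`'s `method="html"` serializer, which writes void elements such as `br` without a closing tag.

```python
import xml.etree.ElementTree as ET


def xhtml_to_html(xhtml_fragment: str) -> str:
    """Parse a well-formed XHTML fragment and re-serialize it as HTML."""
    root = ET.fromstring(xhtml_fragment)
    # Drop the XHTML namespace so tags serialize as plain "p", "br", etc.
    for element in root.iter():
        if isinstance(element.tag, str) and element.tag.startswith("{"):
            element.tag = element.tag.split("}", 1)[1]
    # The "html" method emits <br> rather than <br /> for void elements.
    return ET.tostring(root, encoding="unicode", method="html")


# Hypothetical template output.
SAMPLE = '<p xmlns="http://www.w3.org/1999/xhtml">line<br/>next</p>'
```

Because the input has to parse as XML first, a template that produces malformed markup fails at this step on the server instead of in the visitor's browser.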
If I'm reading that page correctly, then abc.com is using User-Agent instead of Accept to determine whether the client supports XHTML. This seems like a really bad idea; what happens if (for example) someone uses user-agent spoofing to get desktop sites, and ABC serves up a MIME type that the browser can't handle?

Additionally, I'm baffled as to how any reputable site can actually serve invalid XHTML. The template engine should have logic similar to "if the client accepts XHTML, send XHTML, otherwise send HTML". To exhibit the behavior seen in this post, ABC's site must instead do something like "send the client HTML. If its user agent matches <regex>, claim you sent XHTML".
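The "if the client accepts XHTML, send XHTML, otherwise send HTML" rule can be sketched in a few lines of Python. The function names (`parse_accept`, `choose_content_type`) are hypothetical, and the parsing is deliberately simplified: it only checks whether `application/xhtml+xml` is listed explicitly with a non-zero `q` value, ignores wildcards such as `*/*`, and does not compare `q` values between types.

```python
XHTML = "application/xhtml+xml"
HTML = "text/html"


def parse_accept(header: str) -> dict:
    """Map each media type in an Accept header to its q value (default 1.0)."""
    prefs = {}
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        quality = 1.0
        for field in fields[1:]:
            name, _, value = field.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # Treat an unparseable q as "not acceptable".
        prefs[media_type] = quality
    return prefs


def choose_content_type(accept_header: str) -> str:
    """Pick XHTML only when the client explicitly says it accepts it."""
    prefs = parse_accept(accept_header)
    if prefs.get(XHTML, 0.0) > 0.0:
        return XHTML
    # Wildcards are ignored on purpose: "*/*" alone does not prove XHTML support.
    return HTML
```

The decision depends only on what the client says it can handle, so a spoofed User-Agent has no effect on which MIME type gets sent.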

Somewhere in the bowels of ABC's IT department, there's a very confused intern with a copy of Web Programming for Dummies.

Even Web Programming For Dummies shouldn't suggest that. This is obviously the work of somebody reading a random $5 book from 2001, written by Newbie McRandomDude.
Why is anyone still using XHTML? I thought HTML5 had mercifully obsoleted that.

I bought into XHTML for a long time, only to find out that XHTML documents render slower than plain ol' HTML ones, and that pretty much nobody was able to produce proper XHTML anyway.

XHTML was merged into HTML5 as its XML serialization, and renamed XHTML5.

Personally, I'm still using HTML4/XHTML1, and will continue to do so until HTML5 has stabilized and can be validated.