Hacker News
by jerf 3327 days ago
I would suggest specifying titles as html, not plain text. I've seen too many things titled "I <i>love</i> science!" over the years to believe in the idea that titles are plain text.

Also, despite the fact this is technically not the responsibility of the spec itself, I would strongly suggest some words on the implications of the fact that the HTML fields are indeed HTML and the wisdom of passing them through some sort of HTML filter before displaying them.

In fact that's also part of why I suggest going ahead and letting titles contain HTML. All HTML is going to need to be filtered anyhow, and it's OK for clients to filter titles to a smaller valid tag list, or even filter out all tags. Suggesting (but not mandating) a very basic list of tags for that field might be a good compromise.
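To make the "smaller valid tag list" idea concrete, here is a minimal sketch of a title filter built on Python's standard `html.parser`. The tag list and the names `ALLOWED_TITLE_TAGS`, `TitleFilter`, and `filter_title` are assumptions made up for illustration; the comment above does not name specific tags. It is not a vetted sanitizer and does not rebalance unclosed tags.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist for titles; the thread does not name specific tags.
ALLOWED_TITLE_TAGS = {"i", "em", "b", "strong", "code"}


class TitleFilter(HTMLParser):
    """Keep allowlisted tags (attributes dropped) and escape all text."""

    def __init__(self, allowed):
        super().__init__(convert_charrefs=True)
        self.allowed = allowed
        self.out = []

    def handle_starttag(self, tag, attrs):
        # Attributes are discarded, so onclick=, style=, href= never survive.
        if tag in self.allowed:
            self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in self.allowed:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text is re-escaped so a stray < or & cannot form new markup.
        self.out.append(escape(data, quote=False))


def filter_title(title_html, allowed=ALLOWED_TITLE_TAGS):
    """Return title_html reduced to the allowed tags; pass set() to drop all tags."""
    parser = TitleFilter(allowed)
    parser.feed(title_html)
    parser.close()
    return "".join(parser.out)
```

A client that wants no markup at all can call `filter_title(title, allowed=set())`, which corresponds to the "filter out all tags" option.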

2 comments

Allowing HTML means the other side will have to validate that HTML (to avoid XSS). Using text means you can stick it in the DOM using innerText and be much more confident that you aren't exposing yourself to XSS injection.

I agree that I see HTML in RSS titles, but I'd rather have the occasional garbled title, which the author can fix by stripping out HTML before it goes into the RSS, than have to ensure that every RSS reader isn't opening up new security holes.
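For a reader that builds HTML strings on the server instead of touching the DOM, the rough equivalent of the plain-text approach is to escape the title before interpolating it. A minimal sketch, assuming a hypothetical `render_title_as_text` helper and an `<h2>` wrapper chosen only for illustration:

```python
from html import escape


def render_title_as_text(title):
    """Treat the title as plain text: any markup in it is shown literally."""
    # escape() converts &, <, > and quotes, so the title cannot open a tag.
    return f"<h2>{escape(title)}</h2>"
```

A title containing tags then shows up with the tags visible, which is the "occasional garbled title" trade-off described above.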

There is no way to avoid having to handle HTML safely. There's no point in trying to limit your exposure to that problem when the entire point of this standard is to ship around arbitrary HTML for interfaces to display. Once you've solved the hard problem of displaying the body safely, displaying the title is trivial. Making the title pure text does nothing useful. JSONFeed display mechanisms that are going to get this wrong are going to do things like leave injections in the date fields anyhow.

Following the separation of content_text and content_html attributes, it would make sense to have title_html and title_text attributes.
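As a rough sketch of how a client might consume such a pair: `title_html` and `title_text` are the attributes proposed in this comment, not fields the thread says the spec defines, and `title_markup` and its `sanitize_html` parameter are hypothetical names for illustration. The sanitizer is passed in by the caller because the HTML form still has to be filtered.

```python
from html import escape


def title_markup(item, sanitize_html):
    """Return display-ready markup for a feed item's title.

    item: a dict for one feed item (proposed title_html / title_text keys).
    sanitize_html: caller-supplied function that filters untrusted HTML.
    """
    if "title_html" in item:
        # The HTML form is untrusted and must go through the filter.
        return sanitize_html(item["title_html"])
    # The text form only needs escaping before it is placed in a page.
    return escape(item.get("title_text", ""))
```

Preferring `title_html` when both are present is an arbitrary choice in this sketch; a client that never wants markup could read only `title_text`.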