by cxr 115 days ago
> it's not at all clear which is which from the names. Ideally you design that in from the [start]

It was, and there is: setting elementNode.textContent is safe for untrusted inputs, and setting elementNode.innerHTML is unsafe for untrusted inputs. The former will escape everything, and the latter won't escape anything.
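A minimal sketch of that difference, in JavaScript. The helper `escapeText` is hypothetical and not a DOM API: it only models how a text node's contents are escaped when the element is serialized back to HTML, so the example also runs outside a browser. The DOM lines are guarded and only execute where `document` exists.

```javascript
// Hypothetical helper (an assumption for illustration): models how a text
// node's data is escaped when serialized to HTML. It is not a sanitizer.
function escapeText(s) {
  return s
    .replace(/&/g, "&amp;")
    .replace(/\u00A0/g, "&nbsp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

const untrusted = '<img src=x onerror="alert(1)">';

// What the markup looks like once the string has been stored as text.
const asText = escapeText(untrusted);
console.log(asText);

if (typeof document !== "undefined") {
  const safe = document.createElement("div");
  safe.textContent = untrusted; // stored as a single text node, never parsed as markup
  console.log(safe.children.length); // no child elements were created

  const unsafe = document.createElement("div");
  unsafe.innerHTML = untrusted; // parsed as markup: creates an <img> whose onerror handler may run
  console.log(unsafe.children.length);
}
```

The point of the sketch is that the choice is made by the caller when picking the property, not by inspecting the string.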

You are right that these "sanitizers" are fundamentally confused:

> "HTML sanitization" is never going to be solved because it's not solvable.¶ There's no getting around knowing whether any arbitrary string is legitimate markup from a trusted source or some untrusted input that needs to be treated like text. This is a hard requirement.

<https://news.ycombinator.com/item?id=46222923>

The Web platform folks who are responsible for getting fundamental APIs standardized and implemented natively are in a position to know better, and they should know better. This API should not have made it past proposal stage and should not have been added to browsers.


> There's no getting around knowing whether any arbitrary string is legitimate markup from a trusted source or some untrusted input that needs to be treated like text. This is a hard requirement.

It is not a hard requirement that untrusted input is "treated like text". And this API lets you customize exactly what tags/attributes are allowed in the untrusted input. That's way better than telling everyone to write their own; it's not trivial.
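For reference, a rough sketch of what that customization might look like. It assumes the Sanitizer API's `Element.setHTML(input, { sanitizer })` form with an `elements` / `attributes` allow-list; the particular tags and attributes are illustrative only, and `renderUntrusted` is a hypothetical helper that falls back to plain text where `setHTML` is not available.

```javascript
// Assumption: an allow-list in the shape of a Sanitizer API configuration.
// The specific tags and attributes are chosen only for illustration.
const config = { elements: ["p", "b", "i", "a"], attributes: ["href"] };

// Hypothetical helper: uses Element.setHTML() where the browser provides it,
// and otherwise treats the input as plain text.
function renderUntrusted(el, input) {
  if (typeof el.setHTML === "function") {
    el.setHTML(input, { sanitizer: config }); // markup outside the allow-list is dropped
    return "sanitized";
  }
  el.textContent = input; // fallback: nothing is parsed as markup
  return "text";
}
```

In a page this would be called as `renderUntrusted(someElement, userInput)`; the return value only reports which path was taken.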

> It is not a hard requirement that untrusted input is "treated like text".

It's also not a hard requirement that I defend the position that there's a hard requirement for untrusted input to be treated like text. That isn't my position, and it's not what I wrote.

Given that it is not a hard requirement that untrusted input be treated like text, it wouldn't make sense for anyone to claim that it is. It therefore doesn't make sense for someone, presented with what I did write, to strenuously argue that such a tortured, implausible, uncharitable, nonsensical interpretation is something I have to account for (versus the interpretation that matches what I wrote, is actually true, and makes sense).

You are, willfully or not, misconstruing what I have written.

> That's way better than telling everyone to write their own; it's not trivial.

Right, it's not trivial. It's so far the opposite of trivial that it's (as I said the first time—and again, just now) not solvable.

No one should be writing their own.

No one should be trying to write their own.

No one should be using this API at all.

And no one should have pushed for its implementation.

It's a bad API.

I thought you were done talking to me?

Briefly though, if you have an untrusted string then you need to either treat it like text or sanitize it. I don't see any other options.

So if people shouldn't use this sanitizer or write their own, then the only option left is treating the string as text. But you're vehemently arguing that's not what you said.

What's the other way to use an untrusted string? Other than "don't", but that means not taking input and only works for toy apps.

So "willfully".