by rezonant 1057 days ago
> Sanitization is straight forward to implement

I would not say it's easy, considering your adversaries are very motivated to do XSS and the web platform is very complicated.

> It could be argued that automatically sanitizing everything including already safe data types like numbers and system-generated content adds an unnecessary performance overhead for certain projects.

I don't think there's a substantial performance loss from doing a type check on a value to see that it's a number, and then putting it verbatim into the output (within your sanitization code).
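
As a rough illustration of that type check, here is a minimal sketch. The names `renderValue` and `escapeString` are hypothetical and do not come from any framework; the escaper is supplied by the caller so the sketch stays independent of any particular escaping routine.

```javascript
// Hypothetical helper: numbers and booleans skip escaping after a cheap
// typeof check; everything else goes through the caller-supplied escaper.
function renderValue(value, escapeString) {
  // Finite numbers stringify to digits, a sign, '.', and 'e' only,
  // so they cannot contain markup.
  if (typeof value === "number" && Number.isFinite(value)) {
    return String(value);
  }
  if (typeof value === "boolean") {
    return String(value);
  }
  // Default: treat the value as untrusted text.
  return escapeString(String(value));
}
```

The check is a couple of `typeof` comparisons per value, and anything that is not provably a number or boolean still falls through to the escaper.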

I don't know what "system generated content" is, and I'd argue that a framework doesn't either. That means the far safer route is to assume it came from a user by default and force the dev to confirm that it didn't.

> Loops are also trivial; you can simply use Array.prototype.map function to return a bunch of strings which you can incorporate directly into the main component's template string

Combined with the "it's fine" mentality on data sanitization, it's concerning that we're using the term "string" in relation to building DOM nodes here. I hope we aren't talking about generating HTML as strings, combined with default-trusted application data that, in most applications, does in fact come from the user, even if you might consider that user trusted (because it's Dave from Accounting, and not LeetHacker99 from Reddit).

By "system generated content" I meant content which is not derived from potentially unsafe user input. For example, if your front end receives a JSON object which your own back end generated, containing numbers, booleans and enums (from a constrained set of strings), and it is properly validated before insertion into your DB, such data poses no risk to your front end in terms of XSS. That said, if you want to make your system fool-proof and future-proof, you could escape HTML tags in all your string data before incorporating it into a component's template string as a principle; such a function is trivial to implement.
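
A minimal sketch of the kind of escaping function described above. The name `escapeHtml` is hypothetical, and the sketch assumes the result is only ever placed in element content or inside a quoted attribute value; it does not cover `<script>` or `<style>` bodies, unquoted attributes, or URL attributes such as `href`.

```javascript
// Characters that can change parsing in HTML text or quoted attribute values.
const HTML_ESCAPES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Hypothetical helper: replaces each special character with its entity.
function escapeHtml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

console.log(escapeHtml('<img src=x onerror="alert(1)">'));
// prints: &lt;img src=x onerror=&quot;alert(1)&quot;&gt;
```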

The main risk of XSS is when you inject some unescaped user-generated string into a template and then set that whole template as your component's innerHTML... All I want to point out is that not every piece of data is a custom user-generated string. Numbers and booleans don't need to be escaped. Error messages generated by your system don't need to be escaped either. Enum strings (which are validated at insertion in the DB) also don't really need to be escaped, but I would probably escape them anyway in case of a future developer mistake (improper validation).

I agree that the automatic sanitization which React does is probably not a huge performance cost for the typical app (it's probably worth the cost in the vast majority of cases), but it depends on how much data your front end is rendering and how often it re-renders (e.g. the real-time games use case).

> and it is properly validated before insertion into your DB, such data poses no risk to your front end in terms of XSS

This is making a lot of assumptions. Just because the data was acceptable in a database table does not mean it doesn't pose an XSS risk.

Bear in mind, in other branches of this discussion we're talking about using DOM text APIs to insert content. Certainly that is a good, reliable way to avoid XSS, but you can consider that to be value sanitization just done for you by the browser. In the absence of that, "if it comes from the API it is safe" is a dangerous thing to advocate.

The title "A world where <HTML> tag is not required for your web pages" might be perfectly valid to submit into your blog's CMS, but that in no way means you can skip processing it in the frontend because "it is safe". Plenty of what you are saying is reasonable, but I think the topic needs a little more nuance to be discussed responsibly.

You get sanitization for free by using built-in browser methods like setAttribute and textContent=.

Agreed, this is the safe approach if you create elements using document.createElement(). For cases where you want to generate some HTML as strings to embed within your component's template string (e.g. in a React-like manner using Array.prototype.map), you would have to escape the variables provided by the back end in case they contain HTML tags which could be used for an XSS attack.
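
One way to keep the string-template approach while escaping by default is a tagged template literal. The sketch below is an assumption-laden illustration, not anything the thread or a specific library prescribes: `html`, `SafeHtml`, and `escapeHtml` are hypothetical names, and the escaping is only meant for element content and quoted attribute values.

```javascript
// Hypothetical escaper for element content and quoted attribute values.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Marks a string as markup that has already been escaped by the html tag.
class SafeHtml {
  constructor(markup) {
    this.markup = markup;
  }
  toString() {
    return this.markup;
  }
}

// Tagged template: literal parts pass through unchanged; interpolated values
// are escaped unless they are SafeHtml (or arrays containing SafeHtml).
function html(strings, ...values) {
  let out = strings[0];
  values.forEach((value, i) => {
    const parts = Array.isArray(value) ? value : [value];
    out += parts
      .map((part) => (part instanceof SafeHtml ? part.markup : escapeHtml(part)))
      .join("");
    out += strings[i + 1];
  });
  return new SafeHtml(out);
}

const titles = [
  "A world where <HTML> tag is not required for your web pages",
  "Plain title",
];
const list = html`<ul>${titles.map((t) => html`<li>${t}</li>`)}</ul>`;
console.log(String(list));
// prints: <ul><li>A world where &lt;HTML&gt; tag is not required for your web pages</li><li>Plain title</li></ul>
```

The design choice here is that the developer has to opt out of escaping (by producing a `SafeHtml` through the tag) rather than opt in, so a forgotten call cannot leave a value unescaped.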

Although such a sanitization function is trivial to implement... In my previous comment, I mentioned using document.createElement() as a fallback if in doubt. It's safe to create the elements with the DOM API and set their text through the textContent property as you suggest. That's why I don't see sanitization as a strong excuse to avoid using plain Web Components.
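
For comparison, a minimal sketch of the createElement() plus textContent approach. The function name `buildList` is hypothetical, and taking the document as a `doc` parameter is an assumption made so the function can run outside a browser; in a page you would pass the global `document`.

```javascript
// Builds a <ul> with DOM APIs only; no HTML string is ever parsed.
// `doc` is a Document (pass `document` in a browser).
function buildList(doc, items) {
  const ul = doc.createElement("ul");
  for (const item of items) {
    const li = doc.createElement("li");
    // textContent stores the value as text, so "<b>" stays literal characters.
    li.textContent = String(item);
    ul.appendChild(li);
  }
  return ul;
}

// Browser usage (assumes an element with id "app" exists on the page):
// document.getElementById("app").appendChild(buildList(document, ["<b>hi</b>"]));
```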

I agree that sanitization isn't an excuse not to use web components. My point is only that brushing off sanitization as solved by web components is dangerous rhetoric.

Of course, but I don't think the parent poster is talking about that.