Hacker News new | ask | show | jobs
by hombre_fatal 422 days ago
Server-side sanitization means that your view code is inherently vulnerable to injection. You'll notice in modern systems you don't sanitize data in the database and you don't have to manually sanitize when rendering frontend code. It's like that for a reason.

Server-side sanitization and xss injection should be left in the 2000s php era.

1 comments

Where do you suggest we sanitize values? Only in the client, when rendering them?
Depends on what you mean by sanitising.

If you mean filtering out undesirable parts of a document (e.g. disallowing <script> element or onclick attribute), that should normally be done on the server, before storage.

If instead you mean serialising, writing a value into a serialised document: then this should be done at the point you’re creating the serialised document. (That is, where you’re emitting the HTML.)

But the golden standard is not to generate serialised HTML manually, but to generate a DOM tree, and serialise that (though sadly it’s still a tad fraught because HTML syntax is such a mess; it works better in XML syntax).

This final point may be easier to describe by comparison to JSON: do you emit a JSON response by writing `{`, then writing `"some_key":`, then writing `[`, then writing `"\"hello\""` after carefully escaping the quotation marks, and so on? You can, but in practice it’s very rarely done. Rather, you create a JSON document, and then serialise it, e.g. with JSON.stringify inside a browser. In like manner, if you construct a proper DOM tree, you don’t need to worry about things like escaping.

What's wrong about filtering before saving, is that if you forget about one rule, you have to go back and re-filter already-saved data in the db (with some one-off script).

I think "normally" we should instead filter for XSS injections when we generate the DOM tree, or just before (such as passing backend data to the frontend, if that makes more sense).

Don't forget that different clients or view formats (apps, export to CSV, etc) all have their own sanitization requirements.

Sanitize at your boundaries. Data going to SQL? Apply SQL specific sanitization. Data going to Mongo? Same. HTML, JSON, markdown, CSV? Apply the view specific sanitizing on the way.

The key difference is that, if you deploy a JSON API that is view agnostic, that the client now needs to apply the sanitization. That's a requirement of an agnostic API.

Please don’t use the word sanitising for what you seem to be describing: it’s a term more commonly used to mean filtering out undesirable parts. Encoding for a particular serialised format is a completely different, and lossless, thing. You can call it escaping or encoding.
Sanitizing is just a form of encoding that prevents data from becoming executable unintentionally.