by teddyh 1325 days ago
> Instead you should escape everything, and opt-out as early as possible.

That’s not even remotely workable for any system with more than one kind of “escaping”. What if I want to use a string as:

1. An IDNA-encoded domain name

2. An HTML text snippet

3. A shell command string argument

4. A string literal part of a regular expression

5. A part to be used in an XML CDATA section

6. A JSON string

I can’t escape the string beforehand, since the escaping rules are all different. No, the only sensible alternative is to follow the same rule we all use for character encoding: encode and decode (and escape) at the edges.
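To make the point concrete, here is a minimal sketch using only the Python standard library. The `raw` value is an arbitrary example, and the sketch covers just four of the six contexts listed above. Each sink gets its own escaping, applied right at the point of output:

```python
import html
import json
import re
import shlex

# An arbitrary example string containing characters that are special
# in several of the contexts listed above.
raw = 'a<b> & "c" $(d).*'

# Escape at the edge: one rule per sink, applied only when the string
# is about to cross into that sink.
as_html = html.escape(raw)        # HTML text snippet
as_shell_arg = shlex.quote(raw)   # shell command string argument
as_regex_literal = re.escape(raw) # literal part of a regular expression
as_json = json.dumps(raw)         # JSON string

for label, value in [
    ("html", as_html),
    ("shell", as_shell_arg),
    ("regex", as_regex_literal),
    ("json", as_json),
]:
    print(f"{label:>5}: {value}")
```

The four results all differ from one another, which is why no single up-front escaping pass could serve every destination.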

> I can’t escape the string beforehand, since the escaping rules are all different.

You’re still misunderstanding. You shouldn’t escape at any point; instead, you should mark things as safe as early as possible.

“Safe” almost always applies to a single context; you don’t care whether the value ends up somewhere else, because it isn’t marked safe for that context.

Anything that’s not marked as safe is then automatically considered unsafe and processed as such by the sink.
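A rough sketch of that idea for a single context (HTML), using only the standard library. `SafeHtml` and `render` are hypothetical names made up for illustration, not any particular library’s API:

```python
import html


class SafeHtml(str):
    """Hypothetical marker type: a string already known to be valid HTML."""


def render(template: str, **values: object) -> SafeHtml:
    """Hypothetical sink: escape every value unless it is marked SafeHtml.

    The template itself is assumed to be trusted, developer-written text.
    """
    prepared = {
        name: value if isinstance(value, SafeHtml) else html.escape(str(value))
        for name, value in values.items()
    }
    return SafeHtml(template.format(**prepared))


user_input = "<script>alert(1)</script>"  # unmarked, so treated as unsafe
trusted = SafeHtml("<b>bold</b>")         # explicitly marked safe for HTML

page = render("<p>{a} {b}</p>", a=user_input, b=trusted)
print(page)
```

The caller never escapes anything explicitly; the only decision it makes is which values to mark as safe, and that mark only means something to the HTML sink.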

> What if I want to use a string as:

It’s not an issue, because by default nothing is safe anywhere, so all those APIs should treat the injected data thus.

There is no escaping, because everything is automatically internally escaped by default.

> It’s not an issue, because by default nothing is safe anywhere, so all those APIs should treat the injected data thus.

No library does this, since it does not know which strings I send it with their literal meaning intended, and which strings I send it with their escape characters intended to be interpreted. The escape characters are part of the API of that library. The library does not accept “strings” as such, it accepts “escaped” strings. And since my program deals with normal unescaped strings, I have to escape the strings before I send them to the API.

> There is no escaping, because everything is automatically internally escaped by default.

I have a feeling that you have a different meaning of the word “escaped” than me.

> No library does this

Most modern template engines do exactly that. Jinja certainly does, once autoescaping is enabled.

> The library does not accept “strings” as such, it accepts “escaped” strings. And since my program deals with normal unescaped strings, I have to escape the strings before I send them to the API.

That’s the problem with the library. That is what needs to be fixed.
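One possible reading of “fixed”, sketched with Python’s `subprocess` module for the shell-argument case: an API that takes the raw string as data, so the caller never has to escape it. The `hostile` value and the tiny child script are made up for illustration.

```python
import subprocess
import sys

# A raw, unescaped string full of shell metacharacters.
hostile = 'hello; echo pwned $(whoami) "quoted"'

# Argument-list form: no shell ever parses the string, so the caller
# does no escaping. The child process just prints its first argument.
result = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", hostile],
    capture_output=True,
    text=True,
    check=True,
)
echoed = result.stdout.rstrip("\r\n")
print(echoed == hostile)
```

Here the API accepts a plain string and treats it as data by default; interpreting it as shell syntax would require opting in explicitly (for example with `shell=True`).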

> I have a feeling that you have a different meaning of the word “escaped” than me.

Add “explicit” to the first occurrence if you don’t understand without it.