by masklinn 1324 days ago
> They're all unsafe, because you have no clue what context they're going to be used in.

That's correct, but it's the reverse of the escaping mindset.

In the escaping model, when you need to skip escaping you also skip it at the last possible moment, and that's a sure-fire way to launder attacker-controlled data.

Instead you should escape everything, and opt out as early as possible.

> But if you're writing a webapp, passing around escaped strings is a bad idea 99% of the time. It creates code highly coupled to one aspect of your system.

That's why you do the reverse: most strings are unsafe to everything, but the strings which are safe are generally safe to one specific subsystem. So you say that.

> Just imagine if you did this with networking. I'm glad we're not in a world where we're passing around TCPString or UDPString or IPString or EthernetString or TokenRingString or CarrierPigeonString because that happens to be a networking stack the app uses sometimes. It sounds like hell.

It sounds like hell because it makes no sense: there's no such thing as a TCPString because TCP is not string-based and TCP messages are not composed that way.


> Instead you should escape everything, and opt-out as early as possible.

That’s not even remotely workable for any system with more than one kind of “escaping”. What if I want to use a string as:

1. An IDNA-encoded domain name

2. An HTML text snippet

3. A shell command string argument

4. A string literal part of a regular expression

5. A part to be used in an XML CDATA section

6. A JSON string

I can’t escape the string beforehand, since the escaping rules are all different. No, the only sensible alternative is to use the same rule which we all use for character encoding: Encode and decode (and escape) at the edges.
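As a rough illustration of why the rules cannot be applied up front, here is a minimal sketch using only Python standard-library helpers. The sample values `raw` and `host` are made up for the example; each call is applied at the point where the string leaves the program for one specific context.

```python
import html
import json
import re
import shlex

# Hypothetical sample input; not taken from any real system.
raw = 'a "quoted" <b> & $(x).*'
host = "bücher.example"

as_domain = host.encode("idna")   # IDNA-encoded domain name (bytes)
as_html = html.escape(raw)        # HTML text snippet
as_shell = shlex.quote(raw)       # single shell argument
as_regex = re.escape(raw)         # literal fragment inside a regular expression
as_json = json.dumps(raw)         # JSON string literal

# Each context produces a different encoding of the same input.
for label, value in [("html", as_html), ("shell", as_shell),
                     ("regex", as_regex), ("json", as_json)]:
    print(f"{label:>5}: {value}")
```

The four encodings of `raw` all differ, so no single pre-escaped form would be correct for every destination.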

> I can’t escape the string beforehand, since the escaping rules are all different.

You’re still misunderstanding. You shouldn’t escape at any point; instead you should mark things as safe as early as possible.

“Safe” almost always has a single context. You don’t care if the string is going to go somewhere else, because it’s not safe for there.

Anything that’s not marked as safe is then automatically considered unsafe and processed as such by the sink.
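A minimal sketch of that idea, assuming a marker type and a sink that are both invented for this example (`SafeHtml` and `render` are hypothetical names, not any library's API). The sink escapes everything it receives unless the value carries the marker for its own context.

```python
import html


class SafeHtml(str):
    """Hypothetical marker type: the value is already valid HTML markup."""


def render(template: str, **values: object) -> str:
    """Hypothetical HTML sink: escapes every value not marked SafeHtml."""
    prepared = {
        name: value if isinstance(value, SafeHtml) else html.escape(str(value))
        for name, value in values.items()
    }
    return template.format(**prepared)


user_input = "<script>alert(1)</script>"   # unmarked, so treated as unsafe
trusted = SafeHtml("<em>hello</em>")       # marked safe for HTML only

page = render("<p>{greeting} {name}</p>", greeting=trusted, name=user_input)
print(page)
```

Because `SafeHtml` is a plain `str` subclass, most string operations (concatenation, slicing) return an ordinary `str` and silently drop the mark, which errs on the side of escaping.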

> What if I want to use a string as:

It’s not an issue, because by default nothing is safe anywhere, so all those APIs should treat the injected data thus.

There is no escaping, because everything is automatically internally escaped by default.

> It’s not an issue, because by default nothing is safe anywhere, so all those APIs should treat the injected data thus.

No library does this, since it does not know which strings I send it with their literal meaning intended, and which strings I send it with their escape characters intended to be interpreted. The escape characters are part of the API of that library. The library does not accept “strings” as such, it accepts “escaped” strings. And since my program deals with normal unescaped strings, I have to escape the strings before I send them to the API.

> There is no escaping, because everything is automatically internally escaped by default.

I have a feeling that you have a different meaning of the word “escaped” than me.

> No library does this

Most modern template engines do exactly that. Jinja certainly does when autoescaping is enabled.

> The library does not accept “strings” as such, it accepts “escaped” strings. And since my program deals with normal unescaped strings, I have to escape the strings before I send them to the API.

That’s the problem with the library. That is what needs to be fixed.
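One existing example of an API shaped this way, offered only as an illustration from a different domain than the ones listed above: parameterized queries in Python's standard `sqlite3` module. The caller passes data separately from the query text and never escapes anything by hand. The table and the `hostile` value are made up for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")

# Hypothetical attacker-controlled value; it is passed as data, never
# spliced into the SQL text, so no caller-side escaping is involved.
hostile = "Robert'); DROP TABLE users;--"
conn.execute("INSERT INTO users (name) VALUES (?)", (hostile,))

rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (hostile,)
).fetchall()
print(rows)
conn.close()
```

The value round-trips unchanged and the table survives, because the library treats every bound parameter as plain data by default.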

> I have a feeling that you have a different meaning of the word “escaped” than me.

Add “explicit” to the first occurrence if you don’t understand without it.