by nl 1325 days ago
Too bad if your consumer has to interact with anything that could be malicious in any circumstances!

Consumers must properly escape any input.


That doesn't make sense to me and I agree with GP. If I consume HTML and I escape all HTML input I'm given, I'm utterly useless.

Now when I consume text and convert that text into HTML for further treatment, I'm producing HTML, and I must properly escape my input in that conversion. The escaping is only needed because I produce HTML. In fact the only time escaping can be done is when producing data, because if unescaped data is ever produced, the cat's out of the bag.
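To illustrate the text-to-HTML conversion described above, here is a minimal Python sketch using the standard library's `html.escape`. The function name `text_to_html_paragraph` and the sample input are made up for the example; the point is only that the escaping lives at the step that produces HTML.

```python
import html


def text_to_html_paragraph(text: str) -> str:
    """Convert plain text into an HTML fragment (hypothetical helper).

    The escaping happens here, at the moment HTML is produced.
    """
    return "<p>" + html.escape(text) + "</p>"


user_text = 'Tom & Jerry <script>alert("hi")</script>'
fragment = text_to_html_paragraph(user_text)
print(fragment)
# <p>Tom &amp; Jerry &lt;script&gt;alert(&quot;hi&quot;)&lt;/script&gt;</p>
```

Once `fragment` exists it is HTML, and escaping it a second time would corrupt it, which is the "utterly useless" case from the first paragraph.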

Edit: Actually, I think that producer/consumer is the wrong way to talk about this. Escaping only ever occurs at a boundary when transforming between formats (eg from "text string" to "html string") which is always both producer (of the new format) and consumer (of the old format). But it can always be thought of as a type cast, with possible type confusions when input and output formats share the same machine representation (eg string).
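One way to picture the "type cast" framing is to give HTML its own type so it can no longer be confused with plain text. This is only a sketch: the `Html` class, `from_text` and `concat` are hypothetical names invented for the example, not part of any library.

```python
import html
from dataclasses import dataclass


@dataclass(frozen=True)
class Html:
    """Wraps a string that is already HTML markup (hypothetical type)."""
    markup: str


def from_text(text: str) -> Html:
    """The 'cast' from text to HTML; the only place escaping happens."""
    return Html(html.escape(text))


def concat(*parts: Html) -> Html:
    """Join HTML fragments, refusing plain strings to avoid type confusion."""
    for part in parts:
        if not isinstance(part, Html):
            raise TypeError(f"expected Html, got {type(part).__name__}")
    return Html("".join(part.markup for part in parts))


page = concat(Html("<b>"), from_text("1 < 2 & 3"), Html("</b>"))
print(page.markup)  # <b>1 &lt; 2 &amp; 3</b>

try:
    concat(Html("<b>"), "1 < 2")  # plain text was never cast
except TypeError as exc:
    rejected = str(exc)
    print(rejected)  # expected Html, got str
```

With both formats stored as bare `str`, the second call would have silently produced broken markup; the wrapper turns that confusion into an error.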

> That doesn't make sense to me and I agree with GP. If I consume HTML and I escape all HTML input I'm given, I'm utterly useless. [...] Now when I consume text and convert that text into HTML for further treatment, I'm producing HTML, and I must properly escape my input in that conversion.

Which is my point: it's the consumption side which defines what the escaping should be.

> Escaping only ever occurs at a boundary when transforming between formats (eg from "text string" to "html string") which is always both producer (of the new format) and consumer (of the old format).

A database interface is not a transformer / producer, yet it still needs escaping. Globbing is not a transformer either, and it still needs escaping.

I disagree: a database interface is a format boundary at which a transformation occurs (from text to SQL), and so is globbing (from text to pattern).
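Both boundaries can be sketched with Python's standard library. Note that the SQL half uses parameter binding via `sqlite3` rather than string escaping; that is a substitution on my part, since binding is the usual way to keep text out of the SQL grammar altogether. The table, file name and hostile input are made up for the example.

```python
import fnmatch
import glob
import sqlite3

# Text -> SQL boundary: the value is bound as data, never spliced into SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
hostile = "Robert'); DROP TABLE users;--"
conn.execute("INSERT INTO users (name) VALUES (?)", (hostile,))
rows = conn.execute("SELECT name FROM users").fetchall()
print(rows == [(hostile,)])  # True: stored verbatim, table still exists

# Text -> glob pattern boundary: metacharacters are made literal.
filename = "report[final]*.txt"
pattern = glob.escape(filename)
print(pattern)  # report[[]final][*].txt
print(fnmatch.fnmatchcase(filename, pattern))       # True
print(fnmatch.fnmatchcase("reportf.txt", pattern))  # False
```

In both cases the text is untouched until it crosses into the other format.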
Whatever, call it a transformer if that makes you hard.

Point doesn't change: what escaping is needed is a function of the "transformer" and applied to the input (= consumption) side of it.

You don't apply an escaping because data comes from a database, you apply it because it goes into one. Same with template processors, regex engines, etc...
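A small sketch of that point: the same piece of text needs a different escaping depending on where it is going, not where it came from. The sample strings are made up; `re.escape` and `shlex.quote` are the standard-library escapers for regex patterns and POSIX shell words.

```python
import re
import shlex

text = "1+1=2?"

# Going into a regex engine: used raw, the text means something else.
print(re.fullmatch(text, text) is None)                 # True: "+" and "?" are operators
print(re.fullmatch(re.escape(text), text) is not None)  # True: now matched literally

# Going into a shell command line: a different escaping for the same idea.
arg = "it's here"
quoted = shlex.quote(arg)
print(shlex.split(quoted) == [arg])  # True: survives the shell's parsing intact
```

Nothing about `text` or `arg` says which escaping to use; only the destination does.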

Sure, redefine the terms if you like.

The thing that accepts the input must make sure it is properly escaped. Think of SQL injection attacks: they happen because the thing that accepts input hasn't properly escaped it.

Cross-site scripting attacks are exactly the same thing, but occur when the input side doesn't properly escape HTML input.

I suspect you are having a vigorous debate about the ill-defined "producer/consumer" terminology and probably agree.