Hacker News
by squiggleblaz 2267 days ago
> HTML encoding, URL encoding or JavaScript escaping and escaping in a safe way is highly context-dependent (I've seen an unescaped "\n" cause injection within JavaScript contexts)

I have had a hard time convincing co-workers that if you have PHP generating SQL generating (yes, really!) HTML generating JavaScript, you need to escape the string for JavaScript since it's embedded in JavaScript. Then you need the result escaped for HTML since it's embedded in HTML. Then you need that result escaped for SQL since it's embedded in SQL. Only then can you chuck it into the middle of the query string. It is better not to do such craziness; but once you've decided to do it, you must do it properly. The similarities between JS and MySQL escaping are irrelevant; the string must be escaped properly each time it is embedded in another language.
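
To make the layering concrete, here is a minimal Python sketch of the same idea (the original setup was PHP, so this is only an illustration). The helper names `js_string_literal`, `sql_string_literal` and `embed`, the `pages` table, and the `onclick` markup are all made up for the example. The SQL helper assumes standard SQL quoting where only the single quote is special; MySQL in its default mode also treats backslash as an escape character, and real code should prefer parameterized queries anyway.

```python
import html
import json


def js_string_literal(s):
    """Escape s for use as a JavaScript string literal (innermost layer)."""
    # json.dumps produces a double-quoted literal with \n, \" and non-ASCII
    # characters escaped. "<" is escaped too, so "</script>" cannot appear.
    return json.dumps(s).replace("<", "\\u003c")


def sql_string_literal(s):
    """Escape s as a standard SQL string literal (outermost layer).

    Assumption: only the single quote is special (standard SQL quoting).
    """
    return "'" + s.replace("'", "''") + "'"


def embed(user_input):
    """Embed user_input in JS, then HTML, then SQL, one escape per layer."""
    js = "alert(" + js_string_literal(user_input) + ");"
    # The JS source is now plain text as far as HTML is concerned.
    html_fragment = '<button onclick="' + html.escape(js, quote=True) + '">hi</button>'
    # The HTML is now plain text as far as SQL is concerned.
    return "INSERT INTO pages (body) VALUES (" + sql_string_literal(html_fragment) + ");"


print(embed('Bob\'s "</script>"\n'))
```

Each layer only knows about the language it is being embedded in, which is why the order is innermost first and why no single "escape everything" function can replace the chain.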


Escape characters are one of the most stupid things in the computing world.

The formats could be so simple: first the length of the data, then the raw data of that length.
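
A rough sketch of what such a format could look like, in Python. The wire layout (decimal length, a colon, then the raw bytes) is just one possible choice, similar in spirit to netstrings, and the names `encode_field` and `decode_fields` are made up for illustration.

```python
def encode_field(data):
    """Prefix raw bytes with their decimal length and a colon."""
    return str(len(data)).encode("ascii") + b":" + data


def decode_fields(buf):
    """Split a buffer of concatenated length-prefixed fields back into a list."""
    fields = []
    pos = 0
    while pos < len(buf):
        colon = buf.index(b":", pos)            # end of the length prefix
        length = int(buf[pos:colon].decode("ascii"))
        start = colon + 1
        end = start + length
        if end > len(buf):
            raise ValueError("truncated field")
        fields.append(buf[start:end])           # payload is copied verbatim
        pos = end
    return fields


# Quotes, newlines and even colons in the payload need no escaping.
wire = encode_field(b'a"b\n:c') + encode_field(b"")
print(decode_fields(wire))
```

The decoder never inspects the payload, so nothing inside it can be mistaken for a delimiter.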

The better solution is to allow control over what the terminating characters are. Rust raw strings allow N '#' characters around them, like r"x", r#"x"#, r#####"x with "quotes""#####. So no escapes are ever needed; just increase the number of hashes to outstrip the maximum run of consecutive hashes appearing in the string after a quote character.
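
The "outstrip the maximum run of hashes after a quote" rule can be sketched in a few lines of Python. The function name `rust_raw_literal` is made up, and the sketch ignores any compiler limit on the number of hashes and any characters Rust may reject inside raw strings.

```python
import re


def rust_raw_literal(s):
    """Wrap s in a Rust raw string literal with just enough '#' characters."""
    # Length of every run of '#' that directly follows a '"' in the content.
    runs = [len(m.group(1)) for m in re.finditer(r'"(#*)', s)]
    # No quote in the content: zero hashes are enough. Otherwise use one more
    # hash than the longest run, so the terminator cannot occur inside s.
    n = max(runs) + 1 if runs else 0
    hashes = "#" * n
    return "r" + hashes + '"' + s + '"' + hashes


print(rust_raw_literal('x'))                 # r"x"
print(rust_raw_literal('x with "quotes"'))   # r#"x with "quotes""#
print(rust_raw_literal('a"#b"##c'))          # r###"a"#b"##c"###
```

Note that choosing the delimiter still requires looking at the whole string first, which is the same objection raised against length prefixes in the next comment.
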
But then we would never be able to treat text as a stream. All of your one-pass algorithms would become two-pass.
I’m curious what you mean by this — why does coding for length prevent streaming? The receiving end can certainly treat the text as a stream still. Do you mean that the sender cannot “stream” if they are generating the contents on the fly? That seems trivial to solve - just break it up into chunks, assuming a little bit of buffering is OK.
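
A small Python sketch of the chunking idea: the sender buffers at most `chunk_size` bytes, emits each chunk with its own length prefix, and ends with a zero-length chunk. This is loosely modeled on how HTTP chunked transfer encoding works, but the `<length>:<bytes>` layout and the names `write_chunked` and `read_chunked` are assumptions made for the example.

```python
import io


def write_chunked(pieces, out, chunk_size=4):
    """Stream an iterable of byte strings as length-prefixed chunks."""
    buf = bytearray()
    for piece in pieces:
        buf += piece
        while len(buf) >= chunk_size:       # flush full chunks as we go
            chunk = bytes(buf[:chunk_size])
            del buf[:chunk_size]
            out.write(str(len(chunk)).encode("ascii") + b":" + chunk)
    if buf:                                  # flush the final partial chunk
        out.write(str(len(buf)).encode("ascii") + b":" + bytes(buf))
    out.write(b"0:")                         # zero-length chunk ends the stream


def read_chunked(inp):
    """Yield chunk payloads until the zero-length terminator is seen."""
    while True:
        digits = b""
        while True:
            c = inp.read(1)
            if not c:
                raise ValueError("stream ended inside a length prefix")
            if c == b":":
                break
            digits += c
        n = int(digits.decode("ascii"))
        if n == 0:
            return
        data = inp.read(n)
        if len(data) != n:
            raise ValueError("truncated chunk")
        yield data


stream = io.BytesIO()
write_chunked(iter([b"hel", b"lo, ", b"world"]), stream)
stream.seek(0)
print(b"".join(read_chunked(stream)))
```

The sender never needs the total length up front, only the length of the chunk it is about to send, so both ends stay single-pass at the cost of a small buffer.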