Hacker News
by pornel 4882 days ago
That's exactly the kind of security risk that the article is talking about. Internet Explorer could be tricked into using US-ASCII encoding and interpreting ¼script¾ as a script tag (CVE-2006-3227).
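To make the byte arithmetic concrete, here is a minimal sketch of why those two fraction characters are dangerous. It assumes the consumer simply discards the 8th bit of every byte when it believes the input is 7-bit US-ASCII; the variable names are illustrative only.

```python
# Bytes 0xBC and 0xBE wrap the word "script".
payload = b"\xbcscript\xbe"

# Read as Latin-1, the bytes are harmless fraction characters.
as_latin1 = payload.decode("latin-1")

# A consumer that clears the 8th bit of every byte gets 0x3C and 0x3E,
# which are '<' and '>' in ASCII.
as_stripped = bytes(b & 0x7F for b in payload).decode("ascii")

print(as_latin1)    # ¼script¾
print(as_stripped)  # <script>
```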

Liberal vs strict is a false dichotomy. The third solution is to accept all possible inputs, but in a specified way.

Instead of taking the draconian XML approach, you can solve the problem by taking the HTML5 approach and making error handling as interoperable as the handling of correct input. In the case of STEP files, you could require all implementations to clear the 8th bit (or drop or clamp bytes out of range; whatever, as long as it's specified and mandatory).

Maybe I'm missing something here, but a valid STEP string can already encode any arbitrary Unicode code point. It just does it using 7-bit ASCII. If your code is somehow executing these strings without examining their content, then you are already in big, big trouble.
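For readers who haven't seen the mechanism, a rough sketch of the decoding follows. It assumes the `\X2\...\X0\` (groups of four hex digits) and `\X4\...\X0\` (groups of eight hex digits) escape forms as I understand them from ISO 10303-21, and it ignores the other string escapes. The function name `decode_step_string` is made up for illustration.

```python
import re

# \X2\ + groups of 4 hex digits + \X0\, or \X4\ + groups of 8 hex digits + \X0\
_ESCAPE = re.compile(
    r"\\X2\\((?:[0-9A-F]{4})+)\\X0\\|\\X4\\((?:[0-9A-F]{8})+)\\X0\\"
)

def decode_step_string(s: str) -> str:
    """Expand \\X2\\ and \\X4\\ escapes in a 7-bit ASCII STEP string (sketch only)."""
    def expand(match: re.Match) -> str:
        # Group 1 is set for \X2\ (4 digits per code point), group 2 for \X4\ (8 digits).
        digits, width = (match.group(1), 4) if match.group(1) else (match.group(2), 8)
        return "".join(
            chr(int(digits[i:i + width], 16)) for i in range(0, len(digits), width)
        )
    return _ESCAPE.sub(expand, s)

encoded = "caf\\X2\\00E9\\X0\\"
print(encoded.isascii())            # the file content stays 7-bit
print(decode_step_string(encoded))  # café
```

The point is only that the file stays pure ASCII while the decoded string can hold any code point.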

Trying to do something with 8-bit characters -- whether skipping them, indicating an illegal character in the string, or trying to guess what was really meant -- cannot make that situation any worse.

The problem arises if you decode a particular byte sequence that triggers a bad action (if that's possible with STEP files) differently from some other program that is supposed to keep you safe.

In the case of IE, the browser decoded the bytes one way and the forum software might decode them another way. So the forum software declares the string safe for the browser (according to its own decoding rules), but then the browser applies different rules and ends up with a dangerous string.

You may not be seeing the danger because you implicitly assume that a STEP file from an unsafe source is always unsafe. But imagine you had a safe-file detector program that applied different rules than the program you're actually going to open the file with.
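A toy version of that detector scenario might look like the sketch below. Everything in it is hypothetical: `looks_safe` stands in for the detector, and the two decode functions stand in for the two programs' differing rules for bytes above 0x7F.

```python
from typing import Callable

def decode_latin1(data: bytes) -> str:
    # Rule A: treat every byte as a Latin-1 character.
    return data.decode("latin-1")

def decode_strip_high_bit(data: bytes) -> str:
    # Rule B: clear the 8th bit of every byte, then read as ASCII.
    return bytes(b & 0x7F for b in data).decode("ascii")

def looks_safe(data: bytes, decode: Callable[[bytes], str]) -> bool:
    # Hypothetical detector: reject anything containing '<' after decoding.
    return "<" not in decode(data)

payload = b"\xbcscript\xbe"

# Detector uses rule A, consumer uses rule B: the detector approves
# input that the consumer will read as a tag.
mismatch_approved = looks_safe(payload, decode_latin1)
consumer_view = decode_strip_high_bit(payload)

# Detector and consumer share rule B: the detector sees what the
# consumer sees and rejects it.
agreed_approved = looks_safe(payload, decode_strip_high_bit)

print(mismatch_approved, consumer_view, agreed_approved)
```

Which rule is chosen matters less than both sides being required to apply the same one.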

As jbert pointed out, if your program's main job is to say whether or not something is safe, and it liberally says "Oh yeah, I think that's safe", that's pretty much the exact opposite of "be conservative in what you do".

Please explain the proper way of escaping/rejecting HTML in forum posts when you can't rely on browsers following the spec.