Hacker News new | ask | show | jobs
by toast0 3622 days ago
Never trust user input doesn't mean never use user input; you need to use it carefully -- often that means restricting to acceptable values and lengths, (appropriate!) escaping, and passing through that it's user input to other functions (ex: sql placeholders). When functions do arcane things, and don't let you pass through that it's user provided, that's a red flag.

Note that a cookie that you set, and are now getting back IS user-input, unless you do something to validate that it's actually the value you set. (HMAC is a good start)

If your language can't deal with strings properly, I strongly suggest you not expose it to strings provided by users. If you do expose it to strings from users, at least you should sandbox your application as much as possible.

1 comments

I advised someone doing their masters in information security as a mentor. My student did their dissertation on input scrubbing. We did quite extensive research on the subject, and we found out that a simple AWK program doing regular expression matching on the input, before passing it on to conventional scrubbers inside of languages like PHP virtually eliminated attack vectors. For three months we tried our very best to craft some SQL code to get by the AWK regex and we couldn't. Lesson learned.
> Lesson learned.

Without meaning to be sarcastic (particularly because I found your post interesting), what lesson learned? A casual perusal of your post suggests the lesson "one can't craft SQL code to get by an AWK regex", but, of course, "what I can't do no-one can" is a bad lesson to learn in security.

The lesson we learned is that sometimes getting back to the roots (AWK) and using simple methods (regex) can be extremely effective. You are of course right that "what I can't do no-one can" is a bad thing.
IMO your lesson is to get to define a problem simply enough that you can apply a simple solution. This is not a given and usually needs serious design and project management skills.

Otherwise even your simple solution would be drawn in "can you support multi-byte characters ? Do you handle non unicode stuff ? What if it leaks in your layers if code before reaching your awk library ?" and other problems that abound in most mildy complex projects.

your lesson is to get to define a problem simply enough that you can apply a simple solution.

Hear hear! So true. The problem is that making complex things simple is extremely difficult.