|
|
|
|
|
by clarkevans
4687 days ago
|
|
I should have used properly "construct input" (as suggested by dragon-writer) rather than "escape". If possible, you'd use query parameters or some other reusable and verified mechanism to encode user content. If you're embedding a data stream of one type within another, you still have to manage boundaries to encode content properly. I'd note for database languages, query construction is often two phase, where you build your query template based on selection/filtering needs, and then, use parameter substitution with user-provided content. I've seen experienced programmers do both at the same time and resort to manual escaping of user content rather than seeing them as distinct phases. I think it's about managing the movement of information between contexts... one process' data is another process' code. The most important context switch happening between untrusted inputs and the rest of your system. I apologize for deleting my other comment; I was trying to shorten it. |
|
You make good points, and I'm trying to figure out exactly how your points and mine hook together.
I think what it ultimately comes down to is distinguishing between data and code. More specifically, data can contain code, but code should never contain (foreign) data. So escaping in general is fine, but escaping in this context, where you're building a string of code, is no good.
That, I think, is why I'm comfortable with e.g. the escaping done to a string when emitting JSON, but not with escaping a string to put it into an SQL query.
With parameterized queries, you're still ultimately passing the code and data over to the database over some sort of stream (assuming an out-of-process database server), but hopefully that stream is designed as data containing both the parameterized SQL code and the parameters, rather than just SQL code with escaped parameters.