Hacker News
by TheChaplain 703 days ago
Disagree.

Escaping/sanitizing on output takes extra cycles/energy that can be spared if the same process is done once upon submission.

Think more sustainable.

6 comments

Yes I love seeing &lt; in my database. Every time I see it I think oh boy how many cycles will I save when I display this in HTML!
It’s a trade-off for sure. But if you know at the time of collection all the ways in which your data is going to be used in the future, you work at a more well-run organization than me!
Obviously one thing does not exclude the other, but the more the data is processed up front, the less has to be done later on every page retrieval.

It surprises me that this seems unfamiliar these days.

That is true only for simple systems that don't change often and that have a single view of the data.

For most real world use cases your approach is break even at best, and often way worse.

You mean it's not suitable for the deploy-every-hour crowd; it has nothing to do with complexity.
No, that's what caches are for.
What do you escape for? Html? Postscript? Json? SQL? Latex? A custom report generator which hasn't been created yet but you'll start using in 2 years? Things move - you can't choose the escaping format at input time.
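To make that concrete, here is a minimal Python sketch of escaping at output time. The helper names `for_html` and `for_json` and the sample string are made up for illustration; only `html.escape` and `json.dumps` are real standard-library calls.

```python
import html
import json

# Hypothetical user input, stored exactly as it was submitted.
raw = 'Tom & "Jerry" <script>'

def for_html(value: str) -> str:
    # Escape for an HTML text or attribute context.
    return html.escape(value, quote=True)

def for_json(value: str) -> str:
    # Encode as a JSON string literal; HTML entities play no part here.
    return json.dumps(value)

print(for_html(raw))
print(for_json(raw))
```

The same stored value gets a different encoding per destination, which is the reason the format can't be picked once at input time.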
Solve problems you have, not ones you invent
Html, Json, SQL are the very basics which most web apps will run into. You don't have to invent anything to get more than one possible output format.
For SQL you already have the language-provided sanitization of prepared statements. For most backends, the output format is always json, which ends up on the frontend via dedicated browser APIs that don't allow html injection.

Maybe if you also directly render some html from the backend, that would change things.

Document these assumptions in your central code standards/architecture document to get everyone aligned, and then just stick to it and enjoy a more sane codebase.

So what you're proposing is encode to html on input, to sql on output, passthrough to html, and encode to Json on output but after decoding from the saved html - and just document it well?
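A rough Python sketch of what that pipeline looks like, assuming the value is HTML-escaped once at input and stored that way. The variable names and the sample value are hypothetical.

```python
import html
import json

raw = "AT&T <b>"            # hypothetical submitted value
stored = html.escape(raw)   # escaped once, at input time, then saved

# HTML output: passthrough works, but any template that escapes by
# default will now escape a second time.
double_escaped = html.escape(stored)

# JSON output: the stored form has to be decoded back first,
# otherwise API clients receive HTML entities.
json_wrong = json.dumps(stored)
json_right = json.dumps(html.unescape(stored))

print(stored)
print(double_escaped)
print(json_wrong)
print(json_right)
```

Every non-HTML consumer has to know about the stored encoding and undo it, and every HTML consumer has to know not to escape again.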
I'm not sure, I've never run into this issue. I'd probably just look for another job if you want the honest answer.

For what it's worth: if a column in your database is used in so many contexts, you could document that it's simply unsanitized. Like, I prefer having as few entry points for bugs and fuckups as possible, and knowing that the database data is always sanitized helps with that. But it's not an unbreakable, absolute sort of rule.

And I never really escape strings for SQL, as I always use prepared statements - forgetting prepared statements is an easy thing for code review to catch.
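For anyone who hasn't seen it, a small sketch using Python's built-in `sqlite3` module, which binds parameters through `?` placeholders. The table name and the hostile string are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (body TEXT)")

# Hypothetical hostile input. It is passed as a bound parameter,
# so it is never spliced into the SQL text.
hostile = "x'); DROP TABLE comments; --"
conn.execute("INSERT INTO comments (body) VALUES (?)", (hostile,))

# The value comes back byte for byte as submitted, with no SQL escaping applied.
stored = conn.execute("SELECT body FROM comments").fetchone()[0]
row_count = conn.execute("SELECT COUNT(*) FROM comments").fetchone()[0]
print(stored, row_count)
```

What code review has to catch is string formatting or concatenation inside the first argument to `execute`.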

I also have a project where there is HTML stored, which is not escaped because the purpose is to directly output that HTML somewhere, and it really is supposed to be HTML.

It simply depends. What matters is that it's documented, and hopefully as clear and consistent as possible.

Not sure the escaping/sanitizing proposition can hold a candle to the overwhelming performance dumpster fire that is modern web dev.
I was going to say this sounds like optimizing the stuff that takes 0.1% of runtime for performance over safety.

Of course you'd need to measure this for your application, but without a performance measurement maybe it's better to default to security.
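A minimal way to get a first number, sketched in Python with `timeit`. The sample body and run count are arbitrary assumptions, and a micro-benchmark like this says nothing about where time goes in a real request.

```python
import html
import timeit

# Hypothetical comment body, roughly 600 characters with some markup in it.
sample = '<p>Tom & Jerry said "hi"</p>' * 20

runs = 10_000
seconds = timeit.timeit(lambda: html.escape(sample), number=runs)
per_call_us = seconds / runs * 1e6

print(f"html.escape: {per_call_us:.2f} microseconds per call")
```

Compare the result against your own page render and database timings before deciding whether output escaping is worth optimizing.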

Did you calculate that, or are you just guessing?