by danielheath 703 days ago
Our approach at work: parse it as HTML, define a short list of known-acceptable tags & attributes, and strip everything else.

Limiting attributes to ["href", "src"] and tags to ["p", "br", "h1", "ul", "ol", "li", "span", "div", "img"] gets you remarkably close to rendering the safe bits of HTML - add to that list upon request.
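
A rough sketch of that allowlist idea, using only Python's standard library. The names `AllowlistSanitizer` and `sanitize` are made up for illustration, and `html.parser` is a lenient parser rather than a spec-compliant HTML5 one, so treat this as a demonstration of the shape of the approach and not something to ship as-is.

```python
from html import escape
from html.parser import HTMLParser

# Allowlists taken from the comment above; everything else is stripped.
ALLOWED_TAGS = {"p", "br", "h1", "ul", "ol", "li", "span", "div", "img"}
ALLOWED_ATTRS = {"href", "src"}
VOID_TAGS = {"br", "img"}            # elements that never get a closing tag
DROP_CONTENT = {"script", "style"}   # drop these tags AND the text inside them


class AllowlistSanitizer(HTMLParser):
    """Hypothetical helper: re-serializes only allowlisted tags and attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
            return
        if tag not in ALLOWED_TAGS:
            return  # tag is dropped, but its text content is still emitted
        # Note: attribute values are passed through unchanged here, so
        # href/src values still need their own validation.
        kept = "".join(
            f' {name}="{escape(value or "", quote=True)}"'
            for name, value in attrs
            if name in ALLOWED_ATTRS
        )
        self.out.append(f"<{tag}{kept}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in ALLOWED_TAGS and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.out.append(escape(data, quote=False))


def sanitize(html_text):
    parser = AllowlistSanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

For example, `sanitize('<p onclick="x()">Hi <b>there</b></p>')` keeps the `p` element, drops the `onclick` attribute and the `b` tag, and keeps the text.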

If you want to take it further, use an `iframe srcdoc=""` with sandbox attributes set.
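
Roughly what that could look like when generating the markup server-side. `sandboxed_iframe` is a hypothetical helper name; the only facts relied on are that an empty `sandbox` attribute applies every sandbox restriction (no scripts, no forms, unique origin) and that the `srcdoc` value has to be attribute-escaped.

```python
from html import escape


def sandboxed_iframe(untrusted_html, sandbox_tokens=""):
    """Hypothetical helper: wrap untrusted HTML in a sandboxed iframe.

    An empty sandbox attribute enables all restrictions; add tokens such as
    "allow-popups" only when there is a concrete need for them.
    """
    # escape(..., quote=True) also escapes quotes, so the untrusted markup
    # cannot break out of the srcdoc attribute value.
    return (
        f'<iframe sandbox="{escape(sandbox_tokens, quote=True)}" '
        f'srcdoc="{escape(untrusted_html, quote=True)}"></iframe>'
    )
```

This is a second layer on top of the tag/attribute allowlist, not a replacement for it.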


> Limiting attributes to ["href", "src"]

You need to clean that up as well to avoid e.g. javascript: links, and then there are more issues with SVG if you allow media uploads.
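
As an illustration of the `href`/`src` cleanup, a minimal sketch of a scheme allowlist. `is_safe_url` and `SAFE_SCHEMES` are hypothetical names, the set of schemes is an assumption to adjust for your use case, and the input is assumed to be the attribute value after the HTML parser has already decoded character references.

```python
import re

# Assumed allowlist of schemes; anything else (javascript:, data:, ...) is rejected.
SAFE_SCHEMES = {"http", "https", "mailto"}
_SCHEME_RE = re.compile(r"^([a-zA-Z][a-zA-Z0-9+.\-]*):")
_LEADING_TRAILING_JUNK = "".join(chr(c) for c in range(0x21))  # C0 controls + space


def is_safe_url(value):
    """Hypothetical check for an already entity-decoded href/src value."""
    # Browsers ignore leading/trailing control characters and spaces and
    # drop tab/CR/LF anywhere in a URL, so normalize the same way first.
    cleaned = value.strip(_LEADING_TRAILING_JUNK)
    cleaned = re.sub(r"[\t\r\n]", "", cleaned)
    match = _SCHEME_RE.match(cleaned)
    if match is None:
        return True  # no scheme: treated as a relative URL
    return match.group(1).lower() in SAFE_SCHEMES
```

With this, `"JavaScript:alert(1)"` and `" java\tscript:alert(1)"` are both rejected, while `"/relative/path"` and `"https://example.com/a"` pass. It does nothing about the SVG problem.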

Then you need to be very sure you’re using a proper HTML5 parser and that your rendering is completely canonicalized, or you open yourself up to filter evasions (https://cheatsheetseries.owasp.org/cheatsheets/XSS_Filter_Ev...)

And of course, I assume that’s what you meant, but you should not just add to the list upon request; you should evaluate each addition first.

Yes - just double checked those, thankfully the framework builtins are correct (staying up to date with a well maintained framework does wonders for your security posture).
Wasn't there a case of a security issue that came from abusing differences between parsers in different places (server, client, or different browsers)?