by geocar 137 days ago
I think the problem is what is an image?

I made an attempt to enumerate them[1], and whilst I caught this issue with feImage over a decade ago by simply observing that xlink:href attributes can appear anywhere, Roundcube also misses srcset="" and probably other ways, so if the server "prefetched every image" it knew about using the Roundcube algorithm, the one in srcset would still act as a beacon.
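
To make the "what is an image?" point concrete, here is a rough sketch of collecting URL-bearing attributes regardless of which element they sit on. It is not the algorithm from firewall.js or Roundcube; the attribute list, the RefCollector class and collect_refs() are names I made up for illustration, and the list is certainly incomplete (CSS url(), meta refresh, etc. are not handled). It also over-reports, since a plain link href is not fetched automatically.

```python
from html.parser import HTMLParser

# Assumption for this sketch: attributes that may carry a fetchable URL
# on *any* element. Deliberately not tied to <img>; the list is incomplete.
URL_ATTRS = {"src", "href", "xlink:href", "poster", "background"}


class RefCollector(HTMLParser):
    """Collect (tag, attribute, url) triples from an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.refs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <feImage>
        # arrives here as "feimage". Self-closing tags also end up here.
        for name, value in attrs:
            if not value:
                continue
            if name in URL_ATTRS:
                self.refs.append((tag, name, value.strip()))
            elif name == "srcset":
                # Naive split: each candidate is "url [descriptor]".
                # URLs that themselves contain commas are not handled.
                for candidate in value.split(","):
                    parts = candidate.split()
                    if parts:
                        self.refs.append((tag, name, parts[0]))


def collect_refs(html_text):
    parser = RefCollector()
    parser.feed(html_text)
    parser.close()
    return parser.refs


sample = (
    '<img src="a.png" srcset="b.png 1x, c.png 2x">'
    '<svg><filter><feImage xlink:href="http://x/d.png"/></filter></svg>'
)
refs = collect_refs(sample)
```

A prefetcher that only looked at img src would see a.png here and miss the other three.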

I feel like the bigger issue is the W3 (nee Google). The new HTML Sanitizer[2] interface does nothing, but some VP is somewhere patting themselves on the back for this. We don't need an object-oriented way to edit HTML, we need the database of changes we want to make.

What I would like to see is the ability to put a <pre-cache href="url"><![CDATA[...]]></pre-cache> that would allow the document to replace requests for url with the embedded data, support what we can, then just turn off networking for things we can't. If networking is enabled, just ignore the pre-cache tags. No mixing means no XSS. With networking disabled, a "failure" in the sanitizer just means the page doesn't "look" right, instead of a leak.

Until then, the HTML4-era solution of a whitelist (instead of trying to blacklist/block things) is best. That's also easier in a lot of ways, but harder to maintain since gmail, outlook, etc are a moving target in _their_ whitelists...
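
A minimal sketch of the whitelist idea, assuming a tiny allowlist that I picked purely for illustration (the ALLOWED table, AllowlistSanitizer and sanitize() are hypothetical, not what gmail, outlook or Roundcube do). Anything not listed is dropped, so a new element or attribute fails closed instead of slipping through:

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist: tag -> attributes permitted on that tag.
ALLOWED = {
    "p": set(),
    "b": set(),
    "i": set(),
    "br": set(),
    "abbr": {"title"},
}
DROP_CONTENT = {"script", "style"}  # drop these tags *and* their text
VOID = {"br"}                       # no closing tag is emitted for these


class AllowlistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip = 0  # > 0 while inside a script/style element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip += 1
            return
        if self.skip or tag not in ALLOWED:
            return  # unknown tag: drop the tag, keep its text content
        pieces = []
        for name, value in attrs:
            if name in ALLOWED[tag]:
                safe = escape(value or "", quote=True)
                pieces.append(f' {name}="{safe}"')
        self.out.append(f"<{tag}{''.join(pieces)}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip = max(0, self.skip - 1)
            return
        if self.skip or tag not in ALLOWED or tag in VOID:
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip:
            self.out.append(escape(data))  # re-escape all text


def sanitize(html_text):
    parser = AllowlistSanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


dirty = ('<p onclick="x()">Hi <img src="http://t/x.gif">'
         '<b>there</b><script>alert(1)</script></p>')
clean = sanitize(dirty)
```

Note the img (and with it srcset, feImage and whatever comes next) disappears without the sanitizer ever needing to know it exists.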

[1]: https://github.com/geocar/firewall.js

[2]: https://developer.mozilla.org/en-US/docs/Web/API/HTML_Saniti...

2 comments

Why on earth does the HTML sanitiser allow blacklisting?! That can't ever be safe to use; the set of HTML elements can always change.

Note that the API is split into XSS-safe and XSS-unsafe calls. The XSS-safe calls [0] have this noted for each of them (emphasis mine):

> Then drop any elements and attributes that are not allowed by the sanitizer configuration, and any that are considered XSS-unsafe (even if allowed by the configuration)

The XSS-unsafe functions are all named "unsafe". Although considering web programmers, maybe they should have been named "UnsafeDoNotUseOrYouWillBeFired".

[0] https://developer.mozilla.org/en-US/docs/Web/API/HTML_Saniti...

I mean, at least they eventually came to their senses, but it does not inspire confidence!

https://developer.chrome.com/blog/sanitizer-api-deprecation/

That's the old sanitizer API. That was already removed and what you linked earlier is the new sanitizer API.

> What I would like to see is the ability to put a <pre-cache href="url"><![CDATA[...]]></pre-cache> that would allow the document to replace requests for url with the embedded data

multipart/related already exists.
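
For anyone who hasn't used it: in e-mail, the HTML part refers to a sibling part by Content-ID through a cid: URL, so nothing needs to be fetched from the network. A rough sketch with Python's standard email package; build_related_message(), the example.invalid domain and the placeholder bytes are my own assumptions for illustration, not anything from the thread.

```python
from email.message import EmailMessage
from email.utils import make_msgid


def build_related_message(image_bytes):
    """Return (message, content_id) for an HTML body with one embedded image."""
    # make_msgid() returns something like "<...@example.invalid>".
    cid = make_msgid(domain="example.invalid")
    msg = EmailMessage()
    # The cid: URL uses the Content-ID without the angle brackets.
    msg.set_content(f'<p><img src="cid:{cid[1:-1]}"></p>', subtype="html")
    # add_related() turns the message into multipart/related and attaches
    # the image as a sibling part carrying the Content-ID header.
    msg.add_related(image_bytes, maintype="image", subtype="gif", cid=cid)
    return msg, cid


# Placeholder bytes only; this is not a valid GIF.
message, content_id = build_related_message(b"GIF89a")
parts = list(message.iter_parts())
```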

> multipart/related already exists.

Which web browsers render multipart/related correctly served over https?

What is stopping them from doing so instead of going with a NIH solution?

Never mind that the context is e-mail, which is not served to a browser over HTTPS.

Got it: So none.

Why I prefer one thing that doesn’t exist over another thing that doesn’t exist depends on my priors. You might as well be asking my opinion and making fun of it before you know the answer.

What do you think the impact would be if Content-Location: suddenly gained the interpretation I suggest?

What do you think a script in the package can do to reference a part if the URL is constructed by code?