by reedloden 3621 days ago
> There are always practical limitations to site-wide technical changes, and HTTPS Everywhere is no different. Sites and content we consider ‘archival’ that involve no signing in or personalisation, such as the News Online archive on news.bbc.co.uk, will remain HTTP-only. This is due to the cost we’d incur processing tens of millions of old files to rewrite internal links to HTTPS when balanced against the benefit.

Not to be snarky, but haven't people written tools to help with this? This seems like a common issue. I mean, there's `sed` and similar tools, obviously, but something that could go, validate that the link works over https://, and update it. I don't see why that would need to be some monumental amount of work.
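A rough sketch of the kind of tool being asked about here, in Python. It only handles absolute `http://` URLs in `href`/`src` attributes, and the names `upgrade_links`, `https_works` and `head_ok` are made up for illustration. The check is passed in as a function so the rewrite can be tried without touching the network; `head_ok` is one possible check, not a recommendation.

```python
import re
import urllib.request
from typing import Callable

# Absolute http:// URLs inside href="..." or src="..." attributes.
LINK_RE = re.compile(
    r'''(?P<attr>\b(?:href|src)\s*=\s*)(?P<q>["'])http://(?P<rest>[^"'\s>]+)(?P=q)''',
    re.IGNORECASE,
)


def head_ok(url: str, timeout: float = 5.0) -> bool:
    """One possible check: does a HEAD request to the URL succeed?"""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (OSError, ValueError):
        # Connection errors, TLS failures, HTTP errors, malformed URLs.
        return False


def upgrade_links(html: str, https_works: Callable[[str], bool]) -> str:
    """Rewrite http:// links to https://, but only where the check passes."""

    def replace(match):
        https_url = "https://" + match.group("rest")
        if not https_works(https_url):
            return match.group(0)  # leave the original link alone
        quote = match.group("q")
        return f"{match.group('attr')}{quote}{https_url}{quote}"

    return LINK_RE.sub(replace, html)
```

Calling `upgrade_links(page, head_ok)` would do the validate-then-update pass on one file. It does nothing for URLs built at runtime by scripts or server code.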

HTTPS is more than just privacy. See https://certsimple.com/blog/ssl-why-do-i-need-it and https://www.troyhunt.com/ssl-is-not-about-encryption/

6 comments

> haven't people written tools to help with this?
Let's say you have a web page with a JavaScript slippy map that imports OpenLayers from a CDN, and OpenLayers then retrieves map tiles from OpenStreetMap.

If you serve that page over HTTPS but the JavaScript CDN URL is HTTP, the JavaScript library won't load. And if the JS CDN supports HTTPS and you switch to it, the library might still compose an HTTP URL to retrieve the map tiles, causing some browsers to block the tiles as mixed content. Other browsers are willing to load HTTP images on HTTPS pages and will work. Unless the tool understands how the map library composes its URLs, someone will have to fix this manually.

To detect bugs like that automatically, after changing to HTTPS you'd have to spider every page in your site with several different browsers and browser configurations, looking for errors and bad links. And if your archived site had a bunch of errors and bad links to start with, you'll need some way to compare the before-and-after error reports too.
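The before-and-after comparison is the easy part. A minimal sketch in Python, assuming each crawl yields a mapping from page URL to the set of error strings seen on that page (the report shape and the name `new_errors` are assumptions, not any particular crawler's output):

```python
from typing import Dict, Set

Report = Dict[str, Set[str]]  # page URL -> error messages seen on that page


def new_errors(before: Report, after: Report) -> Report:
    """Return, per page, the errors that appear only after the switch."""
    regressions: Report = {}
    for page, errors in after.items():
        added = errors - before.get(page, set())
        if added:
            regressions[page] = added
    return regressions
```

Errors that were already present in the archived site drop out, so only problems introduced by the HTTPS change are left to review. Producing the two reports with real browsers is still the expensive step.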

TLDR: It can be more complicated than you think.

Plus... they don't have to do this.

They could put in place redirects, and then use HSTS to tell browsers to only visit the HTTPS links.

They could leave the old HTML unprocessed and pointing at HTTP and HSTS will fix it for modern browsers.

Only the first request would go over HTTP, and even that can be avoided, since Chrome and other browsers can be told via the HSTS preload list to use HTTPS from the first visit: https://hstspreload.appspot.com/
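For what it's worth, the redirect-plus-HSTS setup is small. Here is a hedged sketch as Python WSGI middleware. The name `force_https` and the one-year `max-age` are illustrative choices, and it assumes `wsgi.url_scheme` reflects the scheme the client actually used (behind a TLS-terminating proxy it may not).

```python
# Assumed policy: one year, all subdomains, eligible for preload lists.
HSTS_VALUE = "max-age=31536000; includeSubDomains; preload"


def force_https(app):
    """Wrap a WSGI app: redirect HTTP to HTTPS, add HSTS on HTTPS responses."""

    def middleware(environ, start_response):
        if environ.get("wsgi.url_scheme") != "https":
            host = environ.get("HTTP_HOST", "")
            path = environ.get("PATH_INFO", "/")
            query = environ.get("QUERY_STRING", "")
            location = f"https://{host}{path}" + (f"?{query}" if query else "")
            start_response(
                "301 Moved Permanently",
                [("Location", location), ("Content-Length", "0")],
            )
            return [b""]

        def start_with_hsts(status, headers, exc_info=None):
            # Browsers only honour HSTS when it arrives over HTTPS.
            headers = list(headers) + [("Strict-Transport-Security", HSTS_VALUE)]
            return start_response(status, headers, exc_info)

        return app(environ, start_with_hsts)

    return middleware
```

The old HTML stays untouched. A browser that has seen the header rewrites later `http://` requests to the same host on its own.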

> Not to be snarky, but haven't people written tools to help with this? This seems like a common issue. I mean, there's `sed` and similar tools, obviously, but something that could go, validate that the link works over https://, and update it. I don't see why that would need to be some monumental amount of work.

Not as trivial as you'd think: if there's an HTTP URL on the page when it should be HTTPS, how did the URL end up there? Dynamically from PHP code? Dynamically from JavaScript code? Did the URL come from a database? Did the URL come from an environment variable? It can be a lot of work to track all these down, and a lot of them you won't be able to find using grep/sed; e.g., URLs might appear as relative URLs in code, with the "http" part added dynamically.

You'll get insecure content warnings as well if you try to load HTTP images, CSS, iframes or JavaScript on an HTTPS page. Likewise, the URL for these can come from lots of places.
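To make the limits concrete, here is a small static scanner in Python using the standard library's `html.parser`. It flags `http://` images, scripts, iframes and stylesheets written literally in the markup. The class and function names are invented for this sketch, and it deliberately cannot see URLs assembled at runtime.

```python
from html.parser import HTMLParser

# Tag -> attribute whose URL the browser fetches as a subresource.
SUBRESOURCE_ATTRS = {"img": "src", "script": "src", "iframe": "src", "link": "href"}


class InsecureSubresourceFinder(HTMLParser):
    """Collect (tag, url) pairs for subresources loaded over plain HTTP."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        wanted = SUBRESOURCE_ATTRS.get(tag)
        if wanted is None:
            return
        attr_map = dict(attrs)
        # Only stylesheet links are fetched; rel="canonical" and friends are not.
        if tag == "link" and "stylesheet" not in (attr_map.get("rel") or "").lower().split():
            return
        url = attr_map.get(wanted) or ""
        if url.lower().startswith("http://"):
            self.found.append((tag, url))


def find_insecure_subresources(html):
    finder = InsecureSubresourceFinder()
    finder.feed(html)
    finder.close()
    return finder.found
```

A page whose inline script builds the tile URL as `"http" + "://..."` passes this scan cleanly and still breaks in the browser, which is exactly the case grep/sed also miss.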

I think this trivializes the scope of what the BBC developed. Even with well-automated processes, you'd still want a human doing light QA given the wide diversity of content. The BBC has been at it for over twenty years building ad hoc minisites[1]--sites so far down the long tail that, if forced to choose, they may be more prone to pull the plug than to maintain.

[1] http://news.bbc.co.uk/nol/ukfs_news/hi/uk_politics/vote_2005...

> Sites and content we consider ‘archival’ that involve no signing in or personalisation,

AUGH! Seeing this "SSL is just for private things" mindset in 2016 is really disheartening. It's to keep people from screwing with your connection, not just snooping on it.

I really hope the browser vendors start treating HTTP the same way they treat broken certs sometime soon. This will change once users start asking, en masse, "Why am I getting all these warnings", not before.

Pretty sure a diluted form of the broken cert treatment for HTTP is available behind a flag in Chrome, so it might be in the pipeline.

Source: http://peter.sh/experiments/chromium-command-line-switches/

See:

    --mark-insecure-as
I don't think you can just run `sed` on any random iOS app, any random Symbian app, any random smart-TV app, some other guy's service that hits your APIs and feeds, and so on... :)
Now, that could be a valid issue, indeed, though I'm not sure how long I'd care about those devices continuing to work without any valid upgrade path... Using things like HSTS and CSP's `upgrade-insecure-requests` would help here for clients that do support it.
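For reference, the CSP side of that is a single response header. A minimal Python sketch, with `add_upgrade_header` as a made-up helper name and headers represented as a list of `(name, value)` tuples:

```python
# The directive asks supporting browsers to fetch the page's http://
# subresources over https:// instead.
CSP_HEADER = ("Content-Security-Policy", "upgrade-insecure-requests")


def add_upgrade_header(headers):
    """Return a new header list that also carries the CSP directive."""
    return list(headers) + [CSP_HEADER]
```

Clients that don't implement the directive ignore the header, so this helps browsers but not the apps and devices mentioned above.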
You might not care, but the BBC does — it's one of the issues they mention in the blog post.

If the BBC "channels" stopped working, but other providers' content continues to work, the BBC would be blamed.