by waqas- 3454 days ago
im a lead dev for a large publisher. when we switched over to https we faced the following non-trivial issues:

1. a lot of third party advertisers/ad servers still run on http, these ads need to be embedded via DFP usually, which you can understand does not work out well. We made the switch months ago, to this date i am still making advertisers switch to https.

2. google says they give better ranking to https sites, thats simply not true so far as i have seen. in fact, in the short run your site takes a hit. not only that, in google webmaster console and google news, you cant shift from http to https, you have to make new accounts for ur https sites. to this day i do not know which one google is crawling. For google news, my new https account has yet to be approved after months, but it looks like google just magically shifts to https in google news. it feels icky and hacky: explicit is always better than implicit.

3. microservices. remember those microservices that were all the rage? well, its a bunch of different servers and subdomains, and you have to shift all of them to https when you shift the mothership to https.

while the above points are valid, i still pushed in my org to shift to https. we now use shiny stuff like http2 and web push, which is awesome. i'd recommend all publishers do so. but its understandable that management finds all this scary, esp cuz it sounds like a major overhaul of your web assets (which is everything when ure a web publisher) - even though it isnt actually an overhaul.

7 comments

> 1. a lot of third party advertisers/ad servers still run on http, these ads need to be embedded via DFP usually, which you can understand does not work out well. We made the switch months ago, to this date i am still making advertisers switch to https.

So not only do the ads collect personal data, slow down the web experience and make every non-adblocked site a pain to watch... you go the extra step of not even encrypting transfers because of them?

Adblocking in 2016: using an HTTP firewall, blocking all HTTP requests by default.
I wonder how much you could learn about a person based on the ads they get served?
I don't get point 3. Microservices are for the backend usually, right? You don't want multiple TCP/HTTP[S] connections from the client to all your services - pointless overhead. Worst case scenario, if you need direct client-microservice connectivity, then throw all the services behind nginx and terminate SSL there.
im talking about when microservices expose apis consumed via ajax. then https-to-http connections dont work (the browser blocks them as mixed content).
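To make the failure mode concrete, here is a minimal Python sketch of the check a browser effectively applies to an ajax call. It is a simplification, not the real mixed-content algorithm (browsers also treat things like localhost specially), and the function name and the example.com URLs are made up for illustration.

```python
from urllib.parse import urlsplit


def is_blocked_mixed_content(page_url: str, api_url: str) -> bool:
    """Rough approximation: an HTTPS page calling a plain-HTTP endpoint.

    Hypothetical helper. Real browsers apply more rules than this
    (for example, exceptions for localhost), so treat it as a sketch.
    """
    page = urlsplit(page_url)
    api = urlsplit(api_url)
    # Only the HTTPS-page -> HTTP-endpoint combination is a problem.
    return page.scheme == "https" and api.scheme == "http"


# Hypothetical URLs: the site moved to https, the api subdomain did not.
print(is_blocked_mixed_content("https://www.example.com/", "http://api.example.com/v1/items"))
print(is_blocked_mixed_content("https://www.example.com/", "https://api.example.com/v1/items"))
```

Running it over the list of endpoints your frontend calls is one rough way to see which subdomains still need moving before the main site switches.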
As the parent suggested I would terminate the HTTPS connection in an Nginx in front of all your microservices. No microservice needs to handle HTTPS then.
I thought it was recommended to use https between microservices for all communication? Otherwise the user might think their data is encrypted even though it travels in plain text through networks after the first server, as not all services will run in a separate network.
That's entirely up to the app. With Cloudflare it's not even normal for HTTPS to mean your data got to the server encrypted.

And anyways you can solve it in the same way: nginx LBs to terminate SSL internally.

Maybe it's such a large publisher that they need separate Nginx instances in front of each of the micro-services.
That goes against the basics of load balancing... And it shouldn't even be an issue to have multiple Nginx instances with similar HTTPS configs.
Honestly, #3 can be solved completely by putting your own HAProxy or Nginx reverse proxy in front of the microservice or API. You shouldn't be directly exposing your containers to clients anyway - you want a reverse proxy or load balancer to be able to throttle the traffic and provide basic security.

Also, if you have an ad network that can't serve https, just stick another reverse proxy like HAProxy in place and convert their http content into https for your clients. It works very easily for that type of thing as well - in fact, you can even do path-based routing so that https://www.mysite.com/ad/* goes to http://ad-network.com (while https://www.mysite.com goes to your regular back-end), and HAProxy will do all the rewriting for you.
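
For anyone unsure what path-based routing means here, a rough Python sketch of the mapping a reverse proxy would apply is below. This is not HAProxy configuration, only the lookup logic; `ROUTES`, `pick_upstream` and the `backend.internal` host are hypothetical names, and only the `/ad/` to `http://ad-network.com` mapping comes from the example above.

```python
# Hypothetical routing table: public path prefix -> upstream base URL.
# Ordered most specific first, so "/ad/" wins over the catch-all "/".
ROUTES = [
    ("/ad/", "http://ad-network.com/"),
    ("/", "http://backend.internal/"),  # assumed name for the regular back-end
]


def pick_upstream(path: str) -> str:
    """Return the upstream URL a request path would be forwarded to."""
    for prefix, upstream in ROUTES:
        if path.startswith(prefix):
            # Strip the public prefix and append the rest to the upstream.
            return upstream + path[len(prefix):]
    raise ValueError(f"no route for path: {path!r}")


print(pick_upstream("/ad/banner.js"))  # http://ad-network.com/banner.js
print(pick_upstream("/news/1"))        # http://backend.internal/news/1
```

The client only ever talks https to www.mysite.com; the plain-http hop happens between the proxy and the upstream.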

Ad networks generally won't let you do that. To them it looks like click fraud.

Also, most ads are now html5 with many resources and scripts etc. for interactivity. You can't just host it on another domain - there will likely be lots of internal absolute paths and paths assembled with JavaScript that one can't practically rewrite.

Remember most sites use an ad-exchange, so we aren't talking fixing up one or two ads here - there are probably over 1 billion unique creatives to fix up.

This is really sad and disturbing to read. Especially the ad networks side of things. But I really hope that people like you pressure these advertisers into providing HTTPS. It's not this "hip new web tech".

It's an established standard and now completely free to set up (thanks to Let's Encrypt ;)). I don't see how these two (yes, frankly, two reasons. let's really ignore the third one) can keep anyone from providing HTTPS connections to their dear readers and users.

> now completely free to set up (thanks to Let's Encrypt)

To be fair, I would expect any reasonably large company to obtain an EV (Extended Validation) certificate, which is very much not free. Let's Encrypt only does DV (Domain Validation). EV includes verification that you're actually the incorporated company that you're telling everyone you are.

Are you using proper forwards from your old HTTP to your new HTTPS URLs? Wired also took an SEO hit and this was probably their problem.

If you use permanent forwarding (HTTP 301) then you should be fine.
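
A minimal sketch of what that permanent forward looks like, using Python's standard `http.server`. In practice this would usually live in the web server or load balancer config; the helper names and port 8080 are assumptions for illustration, and the host parsing assumes plain hostnames (no IPv6 literals).

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def https_location(host: str, path: str) -> str:
    """Build the HTTPS URL for a request that arrived over plain HTTP."""
    hostname = host.split(":", 1)[0]  # drop any port such as :80 or :8080
    return f"https://{hostname}{path}"


class RedirectToHttps(BaseHTTPRequestHandler):
    """Answer every GET/HEAD with a permanent redirect to the HTTPS URL."""

    def do_GET(self):
        host = self.headers.get("Host", "localhost")
        self.send_response(301)  # 301 = moved permanently
        self.send_header("Location", https_location(host, self.path))
        self.end_headers()

    do_HEAD = do_GET


def serve(port: int = 8080) -> None:
    """Run the redirector; the port is an arbitrary choice for local testing."""
    HTTPServer(("", port), RedirectToHttps).serve_forever()
```

Calling `serve()` and requesting `http://localhost:8080/some/page?x=1` should return a 301 with `Location: https://localhost/some/page?x=1`, so the path and query string carry over to the new URL.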

yes we have permanent 301 forwards set up since day 0. but still it happened in the short run.
> google says they give better ranking to https sites, thats simply not true so far as i have seen

I believe Google says they use HTTPS as one more signal when ranking your site, not that they will push HTTPS sites up significantly over other signals, like 'quality of content'.

Point 3 makes no sense...