by 1vuio0pswjnm7 1608 days ago
"If you're only hosting 1 site, isn't the privacy leak negligible because there is a 1:1 mapping from ip to domain so an attacker can easily determine it."

What if it is a company serving advertisers, not an "attacker"?

Not sure why this myth of effortless, reliable translation from IP to domain name in "real-time" exists amongst HN commenters. Show us who is doing this for the purposes of advertising and how it is worth the effort and can be relied on. Even if this were possible, it still does not justify sending a plaintext hostname over the wire when it is not necessary to retrieve a page.^2 Not every site requires SNI (yet "modern" browsers send it anyway).

Tell us how to reliably^1 translate any IP to a domain name at the same speed as one can sniff SNI, and with no extra effort. Assume the use of TLS1.3 so one cannot simply examine a plaintext certificate sent over the wire for a CN or SAN. Then show us where and how this is routinely being done by various companies selling online ads or ad services.

1. PTR will not suffice

2. This is like arguing that because it is theoretically possible for a third party examining traffic to discover some private information through a process that requires referencing additional sources, the user should therefore broadcast the information with every request, even though doing so serves no purpose for the user.

Sniffing SNI is common practice.^3 Performing effortless 1:1 mapping from IP address to domain name for the entire www in real-time, for advertising purposes, is not. As long as browsers send SNI with every request, there is no need to do that, even if it were possible.

3. Encrypted Client Hello (ECH) will eventually prevent it. See https://defo.ie

There is no harm in Gemini making SNI optional instead of mandatory.

1 comment

> Show us who is doing this for the purposes of advertising and how it is worth the effort and can be relied on.

Why would an advertiser do this? Advertisers are typically in league with site operators. Site operators just tell them this data (maybe with rare exceptions like Superfish). Advertisers don't do this because they don't need to.

Advertisers are not the adversary TLS is meant to thwart. You don't use the lock on your door to thwart the person who you invited in and opened the door for.

> Not every site requires SNI (yet "modern" browsers send it anyway).

It's difficult to tell if a site needs it or not at the stage where you send it. The current solution seems to be custom DNS records (e.g. what ECH is doing, last time I looked).

> Tell us how to reliably^1 translate any IP to a domain name at the same speed as one can sniff SNI, and with no extra effort.

The traditional answer is to just sniff the DNS traffic.

Otherwise (e.g. if using DoH) just create a db of popular sites you care about. This is not trivial, but still quite easy and the easiest part of the attack being discussed by far.
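A minimal sketch of the "db of popular sites" idea, to make the trade-off concrete. It assumes the observer has already collected (domain, IP) pairs by some means; the names `build_reverse_index`, `guess_domain`, and the sample data are hypothetical, and the addresses are documentation-only ranges. The sketch only commits to a domain when the IP maps to exactly one known name, which is the 1:1 case under discussion.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Set, Tuple


def build_reverse_index(pairs: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Map each IP address to the set of domains seen resolving to it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for domain, ip in pairs:
        index[ip].add(domain.lower())
    return dict(index)


def guess_domain(ip: str, index: Dict[str, Set[str]]) -> Optional[str]:
    """Return a domain only if the IP maps to exactly one known name."""
    candidates = index.get(ip, set())
    if len(candidates) == 1:
        return next(iter(candidates))
    return None  # unknown IP, or shared hosting: the mapping is ambiguous


# Hypothetical observations (assumed data, not from any real crawl).
OBSERVED = [
    ("example.com", "192.0.2.10"),    # single site on its own IP
    ("example.org", "198.51.100.7"),  # two sites sharing one IP
    ("example.net", "198.51.100.7"),
]
index = build_reverse_index(OBSERVED)
```

With this data, `guess_domain("192.0.2.10", index)` yields a name, while the shared address and any address outside the database yield `None`. The database also has to be built and kept current, which is the extra effort being argued about.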

"Why would an advertiser do this?"

Not an advertiser necessarily but any entity or person that can "monetise" the data collected. The collector might use the data itself, it might license, sell or transfer the data, it might provide services that rely on the data, who knows. Some users may not want to voluntarily share this data when they derive no benefit from doing so. We do not have to guess all the possible ways, besides locating the applicable TLS certificate, that the data might be used before we can honor the user's wish that this data not be sent in plaintext where it is not needed for choosing the certificate.

AFAIK, sniffing SNI is already used for the purpose of censorship by some countries. This has been published. It would be ignorant to think that this is the only purpose for which such data might be used, or that any purpose would always be non-commercial and unconnected, directly or indirectly, to web advertising. As the use of DoH increases, sniffing SNI would seem an easy substitute for sniffing DNS.

1. A real-time list of every domain visited by a user.

"You don't use the lock on your door to thwart the person who you invited in and opened the door for."

For some users, advertisers are not an "invited person". What is more, companies like Google have attempted to force the use of TLS for every site, even ones where, in the user's or site operator's opinion, TLS is not needed.

"It is difficult to tell if a site needs it or not at the stage where you send it."

But this is not an argument for sending SNI by default, even where it is not needed.

A TLS proxy can be configured to distinguish sites that need it from sites that do not. This is what I do. The default configuration is to not send SNI. This makes sense because the majority of sites I visit do not require it.
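A rough sketch of such a policy, not the actual proxy configuration described above. It assumes a hand-maintained set of domains known to require SNI (the name `REQUIRES_SNI` and its contents are hypothetical) and defaults to sending nothing. The return value is what a client would place in the server_name extension; `None` means omit the extension.

```python
from typing import Iterable, Optional


def sni_for(host: str, requires_sni: Iterable[str]) -> Optional[str]:
    """Return the name to send as SNI, or None to omit SNI (the default)."""
    name = host.lower().rstrip(".")
    for rule in requires_sni:
        rule = rule.lower().rstrip(".")
        # A rule matches the domain itself and any of its subdomains.
        if name == rule or name.endswith("." + rule):
            return name
    return None  # default policy: do not send SNI


# Hypothetical list of domains that fail without SNI (assumed data).
REQUIRES_SNI = {"cdn.example"}
```

For example, `sni_for("static.cdn.example", REQUIRES_SNI)` returns the hostname, while `sni_for("example.org", REQUIRES_SNI)` returns `None`. In Python's `ssl` module the corresponding knob is the `server_hostname` argument to `SSLContext.wrap_socket`; passing `None` there requires `check_hostname` to be disabled, so certificate name checking would have to be handled separately.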

As such, from where I sit, the solution chosen by modern web browsers is to prioritise websites that use CDNs that depend on SNI. The side effects for users of indiscriminately sending SNI, i.e., sharing every domain the user visits in plaintext on the wire, are not as important as reducing costs for those websites using TLS and CDNs. Arguably, SNI is for the benefit of websites and CDNs at the expense of users. (Hopefully ECH will obviate this tradeoff.)

"Otherwise (e.g. if using DoH) just create a db of popular sites you care about."

According to this answer, 1:1 mapping is not an equally easy alternative to SNI. Sniffing SNI is "trivial" and works for any https site, whereas 1:1 mapping through a database is "non-trivial" and only works for "a selection of popular sites [one] cares about". SNI makes the task of monitoring a user's web use easy. If SNI is not available to sniff, then the task becomes more difficult. This is the point.
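To illustrate how little work sniffing SNI takes, here is a minimal sketch that pulls the server_name out of a single TLS ClientHello record. It assumes the whole ClientHello fits in one record and is not encrypted with ECH. `build_client_hello` is a hypothetical helper that fabricates a synthetic record for demonstration; it is not a real handshake.

```python
import struct
from typing import Optional


def extract_sni(record: bytes) -> Optional[str]:
    """Return the server_name from a TLS ClientHello record, or None."""
    try:
        if record[0] != 0x16:          # record type: handshake
            return None
        pos = 5                        # skip record header (type, version, length)
        if record[pos] != 0x01:        # handshake type: ClientHello
            return None
        pos += 4                       # skip handshake type and 3-byte length
        pos += 2 + 32                  # client_version + random
        pos += 1 + record[pos]         # session_id
        (n,) = struct.unpack_from("!H", record, pos)
        pos += 2 + n                   # cipher_suites
        pos += 1 + record[pos]         # compression_methods
        (ext_total,) = struct.unpack_from("!H", record, pos)
        pos += 2
        end = pos + ext_total
        while pos + 4 <= end:
            ext_type, ext_len = struct.unpack_from("!HH", record, pos)
            pos += 4
            if ext_type == 0:          # server_name extension
                # layout: list length (2), name type (1), name length (2), name
                name_type = record[pos + 2]
                (name_len,) = struct.unpack_from("!H", record, pos + 3)
                if name_type == 0:     # host_name
                    return record[pos + 5 : pos + 5 + name_len].decode("ascii")
            pos += ext_len
        return None
    except (IndexError, struct.error, UnicodeDecodeError):
        return None                    # truncated or malformed input


def build_client_hello(hostname: Optional[str]) -> bytes:
    """Build a minimal synthetic ClientHello record (demonstration only)."""
    exts = b""
    if hostname is not None:
        name = hostname.encode("ascii")
        entry = b"\x00" + struct.pack("!H", len(name)) + name
        data = struct.pack("!H", len(entry)) + entry
        exts += struct.pack("!HH", 0, len(data)) + data
    body = (
        b"\x03\x03"                            # client_version
        + bytes(32)                            # random (all zeros here)
        + b"\x00"                              # empty session_id
        + struct.pack("!H", 2) + b"\x13\x01"   # one cipher suite
        + b"\x01\x00"                          # null compression only
        + struct.pack("!H", len(exts)) + exts
    )
    handshake = b"\x01" + len(body).to_bytes(3, "big") + body
    return b"\x16\x03\x01" + struct.pack("!H", len(handshake)) + handshake
```

Calling `extract_sni(build_client_hello("example.com"))` recovers the hostname, and a record built without the extension gives `None`. No external database or lookup is involved, which is the contrast with the IP-to-domain approach.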

Sniffing SNI is easy. The theoretical 1:1 mapping alternative proposed by HN commenters is more difficult. This is the point: what is easy and reliable for all www sites versus what is more difficult and unreliable for all www sites. The point is not what is possible^2 and what is impossible. That is the red herring that HN commenters defending gratuitous SNI like to use.

2. It is possible to avoid DNS altogether and to only send SNI when it is required. I have been doing this for years. Gather bulk DNS data and load the data into a forward TLS proxy that stores domain-to-IP address mappings in memory and does lookups in real-time as requests are received.
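A minimal sketch of the in-memory lookup part, under stated assumptions. The bulk data format (one "domain ip" pair per line, `#` for comments) and the names `load_mappings`, `resolve`, and `BULK` are hypothetical; the comment does not say what format or proxy is actually used. The addresses are documentation-only ranges.

```python
from typing import Dict, Iterable, Optional


def load_mappings(lines: Iterable[str]) -> Dict[str, str]:
    """Parse 'domain ip' lines into an in-memory lookup table."""
    table: Dict[str, str] = {}
    for line in lines:
        line = line.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        if len(fields) != 2:
            continue                           # skip malformed lines
        name, ip = fields
        table[name.lower().rstrip(".")] = ip
    return table


def resolve(host: str, table: Dict[str, str]) -> Optional[str]:
    """Look up a host without issuing any DNS query; None if unknown."""
    return table.get(host.lower().rstrip("."))


# Assumed sample of pre-gathered DNS data.
BULK = """
# domain ip
example.com 192.0.2.10
Example.ORG. 198.51.100.7
"""
table = load_mappings(BULK.splitlines())
```

A proxy built this way would call something like `resolve(host, table)` for each incoming request and connect to the returned address directly. Hosts missing from the table return `None`, so they would need either a refusal or a fallback chosen by the operator.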