Hacker News
by 1vuio0pswjnm7 1609 days ago
"Why couldn't one simply build on top of existing HTTP infrastructure, throw away all the baggage and instead implement a new Content-Type, which existing browsers then could parse?"

Because escaping the annoyances of the "modern web" may require escaping from the "modern browser". The changes users want to "modern browsers" will never be made. The vendors of these programs do not answer to users. They answer to web developers and advertisers. These are large, complex, insecure programs usually controlled by organisations that seek to profit from online advertising. The advertising focus leads to complex web pages. Not the type of simpler pages that some users want. (NB. Not "most", but "some".) Gemini, because of its limitations, allows users to retrieve resources without the need for one of these "modern browser" programs. If a new Content-Type were added to HTTP, what is the likelihood that other parties outside the "modern web browser" cabal would write small, simple, alternative browsers, e.g., aimed only at this Content-Type? Look at the market share, i.e., available selection, of web browsers. It is not diverse.

Whereas, writing a Gemini client is dead simple. A Gemini browser cabal where users have only a few choices and they are each controlled by corporations is unlikely.
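To give a sense of how little a client needs, here is a minimal sketch in Python. It assumes the basics of the Gemini specification (TLS on port 1965, a request that is just the URL plus CRLF, and a one-line `<STATUS><SPACE><META>` response header). The function names are made up for illustration, and certificate verification is skipped entirely, which a real client would replace with trust-on-first-use pinning or similar.

```python
import socket
import ssl
from urllib.parse import urlsplit

GEMINI_PORT = 1965  # default port in the Gemini specification


def build_request(url: str) -> bytes:
    """A Gemini request is the absolute URL followed by CRLF."""
    return url.encode("utf-8") + b"\r\n"


def parse_header(line: bytes):
    """Split a response header '<STATUS><SPACE><META>' into (status, meta)."""
    text = line.decode("utf-8").rstrip("\r\n")
    status, _, meta = text.partition(" ")
    return int(status), meta


def fetch(url: str, timeout: float = 10.0):
    """Fetch one resource; returns (status, meta, body). Not called here."""
    parts = urlsplit(url)
    host, port = parts.hostname, parts.port or GEMINI_PORT
    # Assumption: no certificate verification at all. Many Gemini servers
    # use self-signed certificates, so real clients usually pin them instead.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    with socket.create_connection((host, port), timeout=timeout) as raw:
        # server_hostname is what puts the SNI value into the ClientHello.
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(build_request(url))
            with tls.makefile("rb") as reply:
                status, meta = parse_header(reply.readline())
                body = reply.read()  # the server closes the connection when done
    return status, meta, body
```

Something like `fetch("gemini://example.org/")` (a placeholder host) would return the status code, the META field (a MIME type on success) and the raw body. Rendering text/gemini is a separate, similarly small job.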

Asking web developers to "please make simpler web pages, thanks", when the "modern browser" allows for complex pages and integration of advertising, is not a successful course of action. Most of these web developers answer to advertisers or to employers who answer to advertisers. They are not going to ditch the user annoyances, they are going to seek profits. Gemini seems to address this problem by making advertising difficult. Without the "modern browser", the possibilities for advertising are limited.

Similarly, asking web users to "please use a text-only browser", e.g., Links, when so many web pages try to use "modern" browser features that enable complex web pages is probably not a successful strategy for many users either. As a long-time Links user, that strategy has worked for me, though.

The only complaint I have about Gemini is the absolute requirement for SNI. Not every IP address will necessarily be hosting multiple Gemini sites. Under the current protocol, even addresses hosting only a single site must require SNI. That makes no sense. It serves no purpose. It should be optional, not mandatory.

2 comments

>A Gemini browser cabal where users have only a few choices and they are each controlled by corporations is unlikely.

This is the key point. It is like the idea that you don't need to be the fastest person to escape from a chasing bear, you only need to be faster than the slowest person.

A user on the conventional web is the slowest person here. And thus the success of a Gemini browser depends on the existence of that slower person, that is, a large number of users available for exploitation on the regular web, so that business will leave the Gemini users (and the like) alone.

HN actually implements this idea by remaining minimal. But even this place has been overrun by shills these days, which means such piecemeal strategies will not work. So it makes perfect sense to go all the way and use a different protocol altogether.

Many 'rules' (like one request -> one file) create limitations in Gemini that prevent it from being capable of web-style corporate behaviour, so even if business isn't content with regular web users and doesn't leave Gemini users alone, their options and the damage they can do is hampered (unless co-opting the entire movement).

I think the protocol being designed to make it difficult for sites to exploit visitors is a core goal that gets lost in the blinding document minimalism that hits you when first encountering Gemini. OP didn't seem to get it.

> if business isn't content with regular web users and doesn't leave Gemini users alone, their options and the damage they can do is hampered

I am afraid you are limiting the scope of "exploitation", as enabled by the Internet, to those things done by ad companies like the Google family.

But I was mostly referring to the use of the Internet for all kinds of exploitation. Think of a seller on Amazon posting fake reviews on the web, or deleting actual negative reviews. They won't bother with the Gemini users because there are enough victims on the real web. But if all users are using Gemini clients, then they will be writing fake reviews in the Gemini domain.

That is, the smaller it remains, the less penetration by big business and selfish interests, and thus the more valuable it is for the person who wants to find real, genuine information.

This also applies to real discussion. People are going to speak freely in Gemini, because there is less chance of encountering an army of shills.

Keep an eye on sites that sell Gemini identities with "reputation". Once those start popping up, you'll know it's time to move on to something different.

> Not the type of simpler pages that some users want.

And yet most of these users don't use Lynx...

> Only complaint I have about Gemini is the absolute requirement for SNI. Not every IP address will necessarily be hosting multiple Gemini sites. Under the current protocol, even addresses hosting only a single site must require SNI. That makes no sense.

If you're only hosting 1 site, isn't the privacy leak negligible because there is a 1:1 mapping from IP to domain, so an attacker can easily determine it?

Besides, in a system playing fast and loose with PKI, it's more like a do-not-disturb sign than an actual lock.

"If you're only hosting 1 site, isn't the privacy leak negligible because there is a 1:1 mapping from ip to domain so an attacker can easily determine it."

What if it is a company serving advertisers, not an "attacker"?

Not sure why this myth of effortless, reliable translation from IP to domain name in "real-time" exists amongst HN commenters. Show us who is doing this for the purposes of advertising and how it is worth the effort and can be relied on. Even if this were possible, it still does not justify sending a plaintext hostname over the wire when it is not necessary to retrieve a page.^2 Not every site requires SNI (yet "modern" browsers send it anyway).

Tell us how to reliably^1 translate any IP to a domain name at the same speed as one can sniff SNI, and with no extra effort. Assume the use of TLS 1.3 so one cannot simply examine a plaintext certificate sent over the wire for a CN or SAN. Then show us where and how this is routinely being done by various companies selling online ads or ad services.

1. PTR will not suffice

2. This is like arguing that because it is theoretically possible for a third party examining traffic to discover some private information through a process that requires referencing additional sources, the user should therefore broadcast the information with every request, even though it serves no purpose for the user to do so.

Sniffing SNI is common practice.^3 Performing effortless 1:1 mapping from IP address to domain name for the entire www in real-time, for advertising purposes, is not. As long as browsers send SNI with every request, there is no need to do that, even if it were possible.

3. Encrypted Client Hello (ECH) will eventually prevent it. See https://defo.ie

There is no harm in Gemini making SNI optional instead of mandatory.
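To make "sniffing SNI is easy" concrete, here is a rough sketch in Python that pulls the hostname out of a captured ClientHello record. The byte layout follows the TLS ClientHello and the server_name extension (type 0). The helper `make_client_hello` is only there to produce sample bytes in memory with the standard `ssl` module; both function names are invented for this illustration, and the parser assumes the whole ClientHello sits in a single TLS record.

```python
import ssl
import struct


def make_client_hello(hostname=None) -> bytes:
    """Generate the first handshake flight in memory (no network involved)."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.check_hostname = False       # permits hostname=None, i.e. no SNI
    ctx.verify_mode = ssl.CERT_NONE
    incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
    tls = ctx.wrap_bio(incoming, outgoing, server_hostname=hostname)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        pass  # expected: ClientHello written, no server reply available
    return outgoing.read()


def extract_sni(record: bytes):
    """Return the plaintext SNI hostname from a ClientHello record, or None."""
    try:
        # Record type 0x16 = handshake, handshake type 0x01 = ClientHello.
        if len(record) < 6 or record[0] != 0x16 or record[5] != 0x01:
            return None
        pos = 5 + 4                    # record header + handshake header
        pos += 2 + 32                  # legacy version + random
        pos += 1 + record[pos]         # session id
        (suites_len,) = struct.unpack_from("!H", record, pos)
        pos += 2 + suites_len          # cipher suites
        pos += 1 + record[pos]         # compression methods
        (ext_total,) = struct.unpack_from("!H", record, pos)
        pos += 2
        end = min(pos + ext_total, len(record))
        while pos + 4 <= end:
            ext_type, ext_len = struct.unpack_from("!HH", record, pos)
            pos += 4
            if ext_type == 0:          # server_name extension
                # list length (2 bytes), name type (1), name length (2), name
                if record[pos + 2] != 0:
                    return None
                (name_len,) = struct.unpack_from("!H", record, pos + 3)
                return record[pos + 5:pos + 5 + name_len].decode("ascii")
            pos += ext_len
    except (IndexError, struct.error, UnicodeDecodeError):
        return None
    return None
```

Feeding it `make_client_hello("example.org")` yields the hostname, while a ClientHello built without a server name yields nothing, which is the difference being argued about here. No lookups or external data are involved.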

> Show us who is doing this for the purposes of advertising and how it is worth the effort and can be relied on.

Why would an advertiser do this? Advertisers are typically in league with site operators. Site operators just tell them this data (maybe with rare exceptions like Superfish). Advertisers don't do this because they don't need to.

Advertisers are not the adversary TLS is meant to thwart. You don't use the lock on your door to thwart the person who you invited in and opened the door for.

> Not every site requires SNI (yet "modern" browsers send it anyway).

It's difficult to tell if a site needs it or not at the stage where you send it. The current solution seems to be custom DNS records (e.g. what ECH is doing, last time I looked).

> Tell us how to reliably^1 translate any IP to a domain name at the same speed as one can sniff SNI, and with no extra effort.

The traditional answer is to just sniff the DNS traffic.

Otherwise (e.g. if using DoH) just create a db of popular sites you care about. This is not trivial, but still quite easy and the easiest part of the attack being discussed by far.
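A rough sketch of what such a db could look like, in Python. The records are invented (reserved example names and documentation IP ranges), and the function names are hypothetical. It assumes the observer has already resolved a list of domains it cares about.

```python
from collections import defaultdict


def build_reverse_index(records):
    """Map each IP to the set of known domains that resolved to it."""
    index = defaultdict(set)
    for domain, ip in records:
        index[ip].add(domain)
    return index


def guess_domain(index, ip):
    """Return a domain only when the IP maps to exactly one known name."""
    names = index.get(ip, set())
    return next(iter(names)) if len(names) == 1 else None


# Invented sample data: one dedicated IP, one IP shared by two sites.
RECORDS = [
    ("a.example", "192.0.2.1"),
    ("b.example", "192.0.2.2"),
    ("c.example", "192.0.2.2"),
]
INDEX = build_reverse_index(RECORDS)
```

Here `guess_domain(INDEX, "192.0.2.1")` gives `a.example`, but the shared address `192.0.2.2` and any address outside the list give no answer. The lookup only covers sites that were collected in advance and that have an address to themselves.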

"Why would an advertiser do this?"

Not an advertiser necessarily but any entity or person that can "monetise" the data collected. The collector might use the data itself, it might license, sell or transfer the data, it might provide services that rely on the data, who knows. Some users may not want to voluntarily share this data when they derive no benefit from doing so. We do not have to guess all the possible ways, besides locating the applicable TLS certificate, that the data might be used before we can honor the user's wish that this data not be sent in plaintext where it is not needed for choosing the certificate.

AFAIK, sniffing SNI is already used for the purpose of censorship by some countries. This has been published. It would be ignorant to think that this is the only purpose for which such data might be used, or that any purpose would always be non-commercial and unconnected, directly or indirectly, to web advertising. As the use of DoH increases, sniffing SNI would seem an easy substitute for sniffing DNS.

1. A real-time list of every domain visited by a user.

"You don't use the lock on your door to thwart the person who you invited in and opened the door for."

For some users, advertisers are not an "invited person". What is more, companies like Google have attempted to force the use of TLS for every site, even ones where, in the user's or site operator's opinion, TLS is not needed.

"It is difficult to tell if a site needs it or not at the stage where you send it."

But this is not an argument for sending SNI by default, even where it is not needed.

A TLS proxy can be configured to distinguish sites that need it from sites that do not. This is what I do. The default configuration is to not send SNI. This makes sense because the majority of sites I visit do not require it.
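As an illustration only (not the actual proxy configuration described above), the "no SNI by default, SNI only where needed" policy can be sketched with Python's standard `ssl` module. The allowlist contents and function names are hypothetical.

```python
import ssl

# Hypothetical allowlist: hosts known to fail without SNI,
# e.g. sites behind a CDN that serves many names from one address.
NEEDS_SNI = {"cdn-hosted.example"}


def sni_for(host, needs_sni=NEEDS_SNI):
    """Return the name to place in the ClientHello, or None to omit SNI."""
    return host if host in needs_sni else None


def make_context():
    """TLS context that tolerates server_hostname=None."""
    ctx = ssl.create_default_context()
    # Python refuses to omit the server name while check_hostname is True,
    # so the certificate name would have to be checked some other way.
    ctx.check_hostname = False
    return ctx


def wrap(raw_sock, host, ctx):
    """Wrap an already connected socket, sending SNI only if the policy says so."""
    return ctx.wrap_socket(raw_sock, server_hostname=sni_for(host))
```

The trade-off in this sketch is that turning off `check_hostname` also turns off the library's automatic certificate name check, so a real setup needs its own way of confirming it reached the intended site.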

As such, from where I sit, the solution chosen by modern web browsers is to prioritise websites that use CDNs that depend on SNI. The side effects for users of indiscriminately sending SNI, i.e., sharing every domain the user visits in plaintext on the wire, are not as important as reducing costs for those websites using TLS and CDNs. Arguably, SNI is for the benefit of websites and CDNs at the expense of users. (Hopefully ECH will obviate this tradeoff.)

"Otherwise (e.g. if using DoH) just create a db of popular sites you care about."

According to this answer, 1:1 mapping is not an equally easy alternative to SNI. Sniffing SNI is "trivial" and works for any https site, whereas 1:1 mapping through a database is "non-trivial" and only works for "a selection of popular sites [one] cares about". SNI makes the task of monitoring a user's web use easy. If SNI is not available to sniff, then the task becomes more difficult. This is the point.

Sniffing SNI is easy. The theoretical 1:1 mapping alternative proposed by HN commenters is more difficult. This is the point. What is easy and reliable for all www sites versus what is more difficult and unreliable for all www sites. The point is not what is possible^2 and what is impossible. That is the red herring diversionary argument tactic that HN commenters defending gratuitous SNI like to use.

2. It is possible to avoid DNS altogether and to only send SNI when it is required. I have been doing this for years. Gather bulk DNS data and load it into a forward TLS proxy that stores domain-to-IP-address mappings in memory and does lookups in real time as requests are received.
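A minimal sketch of the in-memory mapping idea, in Python. The input format (one "domain address" pair per line, `#` comments allowed) is an assumption made for this example, as are the function names. The proxy itself is not shown.

```python
import ipaddress


def load_mappings(lines):
    """Build a domain -> IP table from pre-gathered 'domain address' lines."""
    table = {}
    for line in lines:
        line = line.split("#", 1)[0].strip()   # drop comments and blanks
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        domain, ip = fields[0], fields[1]
        try:
            ipaddress.ip_address(ip)           # skip malformed addresses
        except ValueError:
            continue
        table[domain.lower().rstrip(".")] = ip
    return table


def resolve(table, host):
    """Look up a host in memory; None means it was not in the bulk data."""
    return table.get(host.lower().rstrip("."))


# Invented sample data using reserved example names and addresses.
SAMPLE = [
    "# gathered in bulk ahead of time",
    "example.org 192.0.2.10",
    "Example.NET. 198.51.100.7",
    "broken.example not-an-ip",
]
TABLE = load_mappings(SAMPLE)
```

With this table a request for `example.net` is answered from memory as `198.51.100.7` without any DNS query at request time, and a host missing from the bulk data simply returns nothing.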