by TekMol 2115 days ago
I think their idea is to combine that with signing the bundles, so a page from www.someserver.com can be served by anyone, aka Google. I guess this would mean Google can serve all content on the web.

There seems to be a strong urge in Google to cut the connection between the endpoints of the web and become the central authority. Make all traffic flow through their machines. Let no information arrive at the endpoints.

Right now, requests on the web are kind of p2p. A user requests a website, the publisher serves it any way they see fit. Directly via their servers or via a CDN of their choice.

Google seems to have a strong focus on ending this. Turning the web into Googlebook / AOLoogle.

I wonder why. Do they see their business model threatened on the open web? Or do they see a chance to increase their profit with a closed web?


The article seems to give some clue. The format would allow serving unblockable ads using random urls, or urls that look legitimate. Google’s goal is to have full control over the user experience so that they can serve more ads.

Kind of.

It is already possible today to put ads into a page directly.

But the benefit for Google would be that if they deliver the bundle, they would know that their ads are in there. Heck, they would know everything that is in there. So they would have full information about all ads and everything that is taking place on this new "web".

Of course, they could also use the opportunity to hinder ad blockers further. For example by not allowing plugins to get between reading the bundle and rendering it. They already weakened plugins a lot over the recent years.

> But the benefit for Google would be that if they deliver the bundle, they would know that their ads are in there. Heck, they would know everything that is in there.

Is this a problem for Google currently?

1. They already get loaded/notified for the ads themselves. Do they have a problem with sites claiming they're serving ads and not doing so? (Wouldn't those sites just not get paid?)

2. They'd be serving this in response to web searches. They've already crawled the web page, or at least some version of it. (Yes, it could be a different version, but given the increasing unpopularity of what's now called "server-side rendering" aka the normal thing back in the CGI days, there's no guarantee even with a bundle that the site as seen by a human matches the same site as seen by Googlebot.)

3. If you are running Google Ads or even Google Analytics, you're evaling JavaScript controlled by Google in the context of your web page. They already have access to every detail of what's happening with your site, down to (if they want) where the user's mouse pointer is. What more information would they have access to by seeing the bundle?

> For example by not allowing plugins to get between reading the bundle and rendering it.

Why could they not do this with normal web pages? Define a Content-Security-Policy: no-extension-modifications header and make up some story about protecting high-value sites from buggy extensions....

> They already get loaded/notified for the ads themselves

Because the ads are loaded from their servers. Which makes it easy to block them.

A simple text or image in the website would currently not be tracked by Google. But if they deliver it, they can track it.

In order to do that, you'd need a change on the website's server to load and embed a Google ad into the web bundle. But if you could do that, you could just make a change on the website's server to render a Google ad into a normal web page - it's a less complicated change, and it can be done right now without adding a new feature to every browser.

I also don't really follow the attack that Google is supposedly trying to protect itself from. Is it trying to track whether websites that have signed up for Google ads are actually serving those ads to users? (Why would they care about that? If ads aren't being served, websites aren't making money.) Is it trying to track whether ads are showing up on users' screens and not being blocked by an ad-blocker? (Then it doesn't make a difference if Google views a signed web bundle or injects a script to monitor the page, and again, they're already injecting a script.) If a Google script is blocked, isn't the answer to track it as "zero" - i.e., what's the problem with (potentially) more ads being shown than Google knows about?

You could already use random URLs to serve ads and track; I'm not sure how a bundle changes anything.

Serving ads or doing tracking from random URLs doesn't work with extensions like uBlock Origin.

In Firefox, uBO can block first party domains: https://github.com/uBlockOrigin/uBlock-issues/issues/780

Also, domain or URL blocking isn't the only thing you can do. You can also ban scripts from the page, or monkey patch JavaScript to change behavior that ad networks rely on. And you can also do cosmetic ad blocking, to hide elements from the page, via user stylesheets.
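To make the "random URLs" point concrete, here is a minimal sketch of host-based blocking only. It is not how uBlock Origin is implemented; the blocklist, the hostnames, and the `is_blocked` helper are all made up for illustration. It shows why a host list alone misses an ad served from a random first-party URL, which is where script blocking and cosmetic filtering come in.

```python
from urllib.parse import urlparse

# Hypothetical filter list: hosts the blocker refuses to load from.
BLOCKED_HOSTS = {"ads.example-network.test"}

def is_blocked(url: str) -> bool:
    """True if the request host is a listed host or a subdomain of one."""
    host = urlparse(url).hostname or ""
    return any(host == b or host.endswith("." + b) for b in BLOCKED_HOSTS)

requests = [
    "https://ads.example-network.test/banner.js",  # classic third-party ad request
    "https://news.example.test/a8f3c1/x9.js",      # ad served from a random first-party URL
]

# Map each request URL to whether the host-based rule would stop it.
results = {url: is_blocked(url) for url in requests}
```

Only the first request is caught here; the second needs a different kind of rule.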

It's no wonder that Google is deprecating the APIs uBlock Origin relies on, in Manifest v3 ;-)

Fact of the matter is Google is now engaged in war against ad-blocking.

Again, I don't get it. If a site wants to, they can serve ads inline right now, without making client-side requests to ad servers, and uBlock can't do anything about it. And first-party cookies don't do any good for tracking, so it doesn't matter if they come in the bundle. The ad trackers want to see where you go across multiple sites and there's no simple way to do that w/out 3rd party cookies (although you can use browser fingerprinting).

Browsers like Safari are increasingly enabling privacy features that, among other techniques, block third party cookies by default. So this could be Google’s way of reacting to this change.

I don't see how. Web bundles don't affect whether a cookie is considered third-party or not. Either the content is provided and signed by the actual website, which cannot place cookies for the advertiser's domain, or it's provided and signed by the advertiser, at which point it would be subject to blocking just like normal web traffic from the advertiser. Is this any different with bundles?

The third-party cookie will be blocked, but when both sites are served via the same proxy server, over the same TLS/QUIC connection, the third-party can get similar tracking information they would have had with a cookie, without needing a cookie. It's not exact, but it's good enough for inference.

Assuming that the third party in this scenario is distinct from the party serving the bundle, they wouldn't be involved in the TLS/QUIC connection, right?

So it seems like a third party wouldn't even know that their resource was delivered, unless the party delivering the bundle notified them, or their script makes a separate request to their own server. (And those are options already, so AFAIK bundles wouldn't give third parties any new capabilities.)

How do bundles "help" with third-party cookies?
Normally, if you load a page from Party A which pulls in content from Party B, it's hard for them to correlate who you are because they have separate cookies. That's not a problem if they're served from the same host, probably even on the same QUIC connection.
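
A toy sketch of the claim above, under the assumption that one proxy serves both parties and can see a per-connection identifier. The log format, hostnames, and `origins_per_connection` helper are hypothetical; this only illustrates the inference, not any real server's logging.

```python
from collections import defaultdict

# Hypothetical entries seen by a single proxy: (connection_id, origin, path).
proxy_log = [
    ("conn-1", "party-a.test", "/article"),
    ("conn-1", "party-b.test", "/widget.js"),
    ("conn-2", "party-a.test", "/home"),
]

def origins_per_connection(log):
    """Group the origins requested over each connection, no cookies involved."""
    seen = defaultdict(set)
    for conn_id, origin, _path in log:
        seen[conn_id].add(origin)
    return dict(seen)

# Whoever holds this log can link Party A and Party B traffic per connection.
linked = origins_per_connection(proxy_log)
```

Note that the linking is available to whoever operates the shared host, which is the distinction the earlier reply was asking about.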
Disappointing that the author concludes: "As a user, there is little that can be done in this regard other than to watch how this will all unfold in the future."

As with all other power grabs, the ability to resist it is simply a function of how organised the resistance is.

The apathy shown here directly counteracts any urge to resist.

If iOS Safari does not play ball, we are safe.

Part of what excites me is that it decentralizes where assets have to come from. Yes, it means Google can serve stuff, which will help some small operators, at who knows exactly what cost to privacy.

But what absolutely electrifies me is that I can share content with other people: even in an offline scenario I can give them a webbundle with a site if the site supports it, and the friend's browser can cryptographically check everything out, & trust that the bundle is from the bundler.

> Right now, requests on the web are kind of p2p.

Today's web is decentralized, because there are many domains. But there is little peering among peers: everything is client-server.

This, imo, enables a much more p2p web. It enables a distributed web. Where even if an endpoint is under attack, the web can go on. Where folks who fall over the edge (go offline) can still operate. But yes, seems likely Google intends to be a rather large peer among this newly distributed web.

I recommend the IETF draft of use cases for getting a taste of what WebBundles is for. It hints at this new distributed architecture by describing the characteristics a WebBundled web has:

https://wicg.github.io/webpackage/draft-yasskin-wpack-use-ca...

We might not disagree much. IDK. I think there's something right about what you're saying, but I seek clarification. I ask for your patience in thinking out loud with me (and my schizoposting); I'm someone who isn't as skilled as you are with computers. Despite my ignorance, I am a person deeply concerned with p2p-ness. I'm delighted to see your argument, and I appreciate your perhaps contrarian perspective here.

Google is evil, and if we need to wrestle about that, I will. I'd like to see your red-team skepticism about their intentions and your attempt to consider how this may be a trojan horse or a false-compromise. Google is famous for making moves that look neutral or even good from many angles that are ultimately centralizing power in the hands of capitalists. With good reason, we should doubt why they are doing this. It does appear that the core intuition (if I understand correctly) in WebBundles //can// be used to improve decentralization of information power, but I suggest we should paranoically imagine how it may be exploitable by Google (that is our duty here).

I have some limited experience and a ton of skin in the game on this one. For several years, my wiki has had some of the properties of a prototype of a WebBundle, including an attempt at enabling cryptographic verification (https://philosopher.life/#Cryptographic%20Verification). My goal is to emit one huge all-inclusive html file with the signature wrapped around it (I sign and push/sync up to every minute). This enables me to distribute my wiki across many networks, even sneakernets, without losing one of the fundamental keys to my voice. I'm a second-class citizen on the internet compared to a large corporation, and I have to be able to effortlessly abandon or accept the losses of rented end-points (I really don't own my domain, access-point, or server...they are merely rented: I do own my private key though). In some sense, I have the opportunity to agnostically treat the methods of distribution as a lame middleman pipeline (what we always hoped the internet infrastructure would really be). I give up my ability to control how my wiki is distributed in some sense as I enable anyone to pass around the signed wiki as a proxy. I happily lose the ability to check whether or not I want to send my signed wiki to any individual in many cases, and I lack interactive control of a session; it feels like I become a far more passive participant of the web, being incentivized to provide the read-only information valuable to ML and disincentivized from relying upon dynamic real-time exchanges. I appreciate being able to prevent people from putting words in my mouth while also enabling users of my wiki to acquire and run the site offline, as they see fit, with maximum privacy and anonymity.

There's the context I have. From what I can tell, from a grassroots p2p practice, the reason that the signature "works" is because a user has maintained an old copy of the wiki or even just the public key that they do trust. They've chosen by hand to trust it's me that signed it. I'm not convinced that Google intends to maximize the automation and decentralization value of that kind of verification. It seems an incidental possibility at best (perhaps there's their quasi plausible deniability in seeking a monopoly).
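
The "chosen by hand to trust" step can be sketched as key pinning. This is only a sketch under assumptions: the key bytes are placeholders, `fingerprint` and `key_is_trusted` are hypothetical helpers, and the actual signature check over the wiki file is out of scope (it would need a real signature library). It only shows the part where the user compares a presented public key against the one they decided to trust earlier.

```python
import hashlib
import hmac

def fingerprint(public_key_bytes: bytes) -> str:
    """SHA-256 fingerprint of a public key, as a hex string."""
    return hashlib.sha256(public_key_bytes).hexdigest()

def key_is_trusted(presented_key: bytes, pinned_fingerprint: str) -> bool:
    """Compare the presented key against the fingerprint the user pinned by hand."""
    return hmac.compare_digest(fingerprint(presented_key), pinned_fingerprint)

# Placeholder bytes standing in for a real public key (assumption).
author_key = b"placeholder-public-key-bytes"

# The user stores this once, e.g. from an old copy of the wiki they trust.
pinned = fingerprint(author_key)

# Later, a copy arrives over some untrusted channel with a key attached.
same_author = key_is_trusted(author_key, pinned)
impostor = key_is_trusted(b"some-other-key", pinned)
```

Only after this check passes would it make sense to verify the signature itself with the pinned key.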

They aim to be more than merely a very large peer, and I'm begging you to question that more openly with me. This feels like a disruptive feint only seeking decentrality in name. Perhaps their move weakens the powers of many web infrastructures that would otherwise continue to centralize, but I think they will continue to attempt to take over whatever power vacuums arise in that space (I assume they can see how to make money off this far better than I can too). When I see, for example, Dat become a first-class citizen of Chrome and when I see them empower client-side archiving, search, and moderation to users of their infrastructure (while taking Firefox and web standards off the leash), I'll begin to believe they intend to enable a p2p web. For now, I see them building an AMPed blackhole walled-garden where they aim to be the root server of trust and authority on what is salient while allowing the highest paying bidders to have degrees of access or control over our data, minds, and lives.

> I'm not convinced that Google intends to maximize the automation and decentralization value of that kind of verification.

This seems like the core question/hypothesis you have as to why you might suspect this particular technology. If you have other specific concerns or fears of misuse, please let me know, but I have not identified anything else I can speak directly to. I don't understand Google nor their incentives, but I do understand the IETF drafts for this technology fairly well.

As to this specific question, the validation for Signed HTTP Exchanges (SXG) is the same validation that happens with any web page you would load via https://. This is not a perfect system, but one we have lived with, & SXG introduces no new complexities to it.

Very interesting comment

I'll try and give my perspective as someone who has spent a couple hundred thousand dollars on Google Adwords and also gets a lot of organic traffic from them, and also does a lot of work on Apple Apps and Android Apps

*


OK, so think of Google as

THE STARTING POINT that everyone uses for the Internet

There are 3 critical things required

A) Trust

B) Efficiency

C) No other starting points

Now, Google's problem is that it knows (like all technology companies) that things change very fast in technology

look at Facebook having to buy Instagram and then WhatsApp and then having to GovernmentAttack TikTok and not being able to buy Snapchat

Google, on the other hand, has a very serious issue

A) Its main 'starting point competitors' are not 'buyable' or 'governmentAttackable'

It is Amazon as the starting point for shopping, Facebook as the starting point for 'people who think the Internet is Facebook', and then new competitors like completely different search methodologies and vertical search engines that are not even 'search engines' but take away Google's position

B) It has been following a policy of 'shift everything to google properties'

This creates worsening search results

this leads to a loss of trust and efficiency

Efficiency is really hampered because now a typical search engine user is spending 60% of their time avoiding 2nd quality Google products, to find the remaining 40% and then sort through those to find THE BEST OPTION

C) See, the thing is, that to shift everyone on to Google properties, Google is not just throwing lots of Google results in search, it is also hiding the BEST of BREED services or plain stealing their data (like Yelp and Genius)

D) Trust is further eroded with so much spying and anti privacy

*

So Google is in this very unique situation where it has to do EXTREME measures

such as

try and shift everyone to AMP

try and shift everyone to Web Bundles

try and shift everyone to No Tracking Allowed except by Google

Think of someone who had the biggest trade port between two continents. And they make a TON of money

Then other ports started showing up

So what is their option?

buy up all the ports? what if that is not possible? What if peeping tom Facebook is not willing to sell their port?

then Google knows that sooner or later its port will become one of many and goodbye profits. Then they start pretending - only safe way to cross the ocean is on our ships. So EVERYONE can cross only on our ships

Very similar to FB being scared and starting Internet.org. Best way to eliminate competitors - control the ENTIRE internet and you choose who can be shown

by the way SpaceX with Starlink and Amazon with Kuiper are also in position to do this (and not sure about SpaceX but Amazon definitely would)

turn the Internet into a Pay to Play zoo

*

There are lots of other signs too

1) Google click quality is down

2) amount of click fraud is going up

we see clicks coming from Google servers. When we contact customer service, they admit a certain percentage are fake. Whenever we see fake clicks, they will still charge us and then (on their own) do a token refund

So we see $100 of fake clicks. 12 hours later there is a token $5 refund for fake clicks (they use some other term, will have to check what)

3) amount of organic traffic you get depends on how much you spend

4) If you spend less, then they start showing negative results in organic search to affect sales from people coming to you anyways

Google has already crossed the inflection point. Unless they can magically buy FB and/or Amazon they are basically dead

Just to elaborate on that

They are squeezing every little bit out, even using lots of wrong methods to do that, becoming less and less valuable

In MANY verticals, people have switched COMPLETELY to FB and other advertising

Google is still good for many, many areas. However, they are so saturated and so inefficient at giving you bang for the buck, it's crazy

Meanwhile, FB will let you do anything you want to FB users, provided you pay them enough

So for advertisers who don't mind such a set up, FB is 10 times better

* A lot of Google advertising money is INERTIA

It's unfortunate that TikTok is getting ticktocked. Otherwise it would have eaten massively into Google's earnings

Google also has very high costs to remain the 'default' in the web browsers

They're paying Apple $10 billion a year to be default search

Apple should give them a fitting gift for stealing iPhone ideas and design for Android, and build its own search engine. Google's market cap would halve within a year if Apple did that