| HN Mirror



	by Medicineguy 1976 days ago
	The problem is not the option to place cookies per se. The issue is their misuse, which aims to de-anonymize users (in order to place ads). I don't see how saving the user data somewhere else (in a browser add-on or natively in the browser) helps here. EDIT: The official description [https://github.com/WICG/floc] does a better job of explaining the point. They try to cluster users' interests into "cohorts" and share those with the ad service. This could maybe help to increase transparency and control over your data, as it's saved locally. But I don't see a way to limit access to a user's cohorts (they even say that themselves, see link above). Everybody could access my interests, not just Google and other ad services. And of course, if you have 1000 categories and some meta information (region based on IP address etc.), you will be able to track down individual users with pretty good accuracy.

8 comments

74B5 1976 days ago

From the GitHub page:

> Browsers would need a way to form clusters that are both useful and private

> The browser uses machine learning algorithms to develop a cohort based on the sites that an individual visits.

To me it sounds like just another layer of indirection with Google right in the center of it. Even if this method works well enough from an advertising perspective, I expect there will soon be adversarial models to deanonymize users.

cestith 1976 days ago

Rather than giving the advertiser a list of my interests, it'd be nice if the advertiser gave me a list of keywords for the ads it might show next and my browser requests the ad for me. A default browser could then be configured to learn with a thumbs up / thumbs down / never show me again type of Bayesian training. Or a non-mainstream browser could request random ads.

doytch 1976 days ago

But most people would never up/down the ad, which means the ad would be targeted more randomly, which means it wouldn't be as effective, which means the website/content owner wouldn't get as much money for displaying it.

I don't think that solution works in the current environment, unfortunately.

bluesign 1976 days ago

If I clicked the ad, thumbs up, if not thumbs down. With appropriate weights this can work.

But ad networks will never implement this, because their priority order is:

1) ad network 2) advertiser 3) publisher 4) user

This bumps the user from 4th place to 1st place.
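
A rough sketch of how the "click means thumbs up, no click means thumbs down, with appropriate weights" idea might look if the scores stayed in the browser. Everything here is an illustrative assumption: the class name, the keyword names, the weight values, and the Beta-style counters are hypothetical, not something any browser or ad network is known to implement.

```python
import random
from collections import defaultdict

# Hypothetical weights: a click is strong positive evidence, while an
# ignored impression is only weak negative evidence.
CLICK_WEIGHT = 1.0
IGNORE_WEIGHT = 0.05


class LocalAdPreferences:
    """Per-keyword interest scores kept only on the client (assumed design)."""

    def __init__(self):
        # Uniform prior: one pseudo-like and one pseudo-dislike per keyword.
        self.likes = defaultdict(lambda: 1.0)
        self.dislikes = defaultdict(lambda: 1.0)

    def record_impression(self, keyword, clicked):
        if clicked:
            self.likes[keyword] += CLICK_WEIGHT
        else:
            self.dislikes[keyword] += IGNORE_WEIGHT

    def score(self, keyword):
        # Posterior mean of a Beta(likes, dislikes) distribution.
        like, dislike = self.likes[keyword], self.dislikes[keyword]
        return like / (like + dislike)

    def choose(self, offered_keywords, rng=random):
        # Pick among the advertiser's offered keywords, favoring higher scores.
        weights = [self.score(k) for k in offered_keywords]
        return rng.choices(offered_keywords, weights=weights, k=1)[0]


prefs = LocalAdPreferences()
for _ in range(20):
    prefs.record_impression("bicycles", clicked=False)
prefs.record_impression("synthesizers", clicked=True)

print(prefs.score("bicycles"), prefs.score("synthesizers"))
```

With these made-up weights, twenty ignored impressions count about as much as one click, so a user who never clicks still drifts the scores instead of leaving targeting fully random.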

maccam94 1975 days ago

As of today, some ads pay per click (CPC), but most ad spots pay per 1000 views (CPM). Ads can influence behavior after they are viewed, regardless of whether the user decides to interact with the ad. I'm sure Google has put tons of effort into trying to tie ad views to purchases, both online and offline (I bet GMail and Google Pay are leveraged for this).

I am not claiming this is good or bad, but clicks are not a good enough signal of efficacy for the vast majority of ads shown on the internet.
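
To make the CPC/CPM comparison concrete: a per-click price can be converted into an effective price per 1000 views by multiplying by the click-through rate. The prices and click-through rate below are made-up numbers for illustration only.

```python
def effective_cpm(cpc, click_through_rate):
    """Revenue per 1000 impressions for an ad that pays `cpc` per click."""
    return cpc * click_through_rate * 1000


# Hypothetical figures: $0.50 per click, and 0.2% of viewers click.
cpc_ad = effective_cpm(cpc=0.50, click_through_rate=0.002)

# Hypothetical flat-rate ad paying $2.00 per 1000 views, clicked or not.
cpm_ad = 2.00

print(f"CPC ad earns ${cpc_ad:.2f} per 1000 views; CPM ad earns ${cpm_ad:.2f}")
```

Under these assumed numbers, 998 of every 1000 views produce no click at all, yet the CPM advertiser is still paying for them, presumably because the views themselves are expected to have some effect.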

cestith 1975 days ago

A good ad network's JS should be able to tell how long the ad was in the viewport, or for a video interstitial, how long it played before the viewer skipped the rest. That sort of signal could be useful, especially given the way many sites use horizontal ad banners like horizontal rules in their pages.

mike_d 1975 days ago

They would never implement it because there is no valid signal here. The vast majority of users would just thumbs down every ad they see because they believe that will result in less ads.

Go check out the messaging around Ad Choices and how poorly it ended up working.

olliej 1975 days ago

Ad Choices also intentionally obscured access to it by making the icon small and making it look like ad branding rather than a button.

thaumasiotes 1975 days ago

> The vast majority of users would just thumbs down every ad they see because they believe that will result in less ads.

This... just doesn't apply to the scheme described:

>> If I clicked the ad, thumbs up, if not thumbs down. With appropriate weights this can work.

manigandham 1975 days ago

Ad networks are in business based on ad performance, which is driven by the user. Even though there's quite a lot of terrible UX with online ads (for lots of reasons), the user does matter more than you think.

t0astbread 1975 days ago

If the advertiser is able to try a large number of keywords, they might still be able to infer the client's interest list based on what it requests.

cestith 1975 days ago

That's for sure, and more fringe interests would still be more informative than more mainstream interests, too. It wouldn't be as direct and therefore provides a bit of a barrier.

1vuio0pswjnm7 1975 days ago

There is another issue of potential/actual misuse that few are discussing.

That is, the (mis)use of experiments on browser users. "Field trials." This is enabled by the use of "updates". When users agree to updates they agree to let a corporation silently install and run new code on their computer at will, at any time.

This permits the company to create a situation where person A's browser is not quite the same program as person B's, there will be differences. Thus the corporation can run an "experiment". Both person A and person B might believe "I am using XYZ browser". The two users believe it is the same program. However there are differences. The differences can be added and removed through "automatic updates".

How do users maintain privacy in that situation? The company behind XYZ browser can easily isolate groups of users with similar/different traits by conducting such experiments and observing user behaviour. "Cohorts". While the company may argue it is only testing software, there is an argument that it can also be testing users.

The words in parentheticals above can be defined and redefined any way you like. What is important is what the corporation is actually doing, not the label/name/terminology they assign to it.

iamdamian 1975 days ago

I think what you’re saying is “despite this potential change, A/B testing by user attributes will still exist”. Is that right?

cpeterso 1976 days ago

Plus, if each cohort is a “group of [merely] thousands of people [among the worldwide internet population]”, the advertiser could probably narrow your identity pretty well using passive fingerprinting of cohort(s) + IP address + Chrome version + OS + OS version and maybe HTTP headers for language, locale and time zone, though those are probably strongly correlated with the client IP address.
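
A back-of-the-envelope version of that argument, counted in bits of identifying information. The population and cohort size are assumed round numbers, not measurements, and the sketch assumes all cohorts are the same size.

```python
import math

# Assumed round numbers, not measurements.
INTERNET_USERS = 4_500_000_000   # hypothetical worldwide internet population
COHORT_SIZE = 5_000              # "thousands of people" per cohort


def bits(n):
    """Bits of information needed to single out one item among n."""
    return math.log2(n)


bits_to_identify_anyone = bits(INTERNET_USERS)
bits_revealed_by_cohort = bits(INTERNET_USERS / COHORT_SIZE)
bits_still_needed = bits(COHORT_SIZE)

print(f"needed to single out one user: {bits_to_identify_anyone:.1f} bits")
print(f"revealed by the cohort ID:     {bits_revealed_by_cohort:.1f} bits")
print(f"left for other signals:        {bits_still_needed:.1f} bits")
```

Under these assumptions the cohort ID alone supplies well over half of the bits, and the remaining dozen or so would have to come from signals like IP address, browser version and OS.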

jefftk 1975 days ago

Combining those would definitely be a problem. https://www.chromium.org/Home/chromium-privacy/privacy-sandb... describes removing/limiting those fingerprinting vectors, including IP.

(Disclosure: I work for Google, speaking only for myself.)

fumar 1975 days ago

> Browsers would need a way to form clusters that are both useful and private

> The browser uses machine learning algorithms to develop a cohort based on the sites that an individual visits.

How would FLoC audience targeting work in non-Chrome browsers? DV360 users deliver ads on all browsers, no?

jefftk 1975 days ago

FLoC is a proposal for a web standard, which other browsers could implement.

Today, in browsers where third party cookies were removed without replacement, companies like Google that aren't willing to fingerprint have pretty limited user targeting capabilities.

fumar 1975 days ago

Does that mean advertisers using DV360 will have the option to target using known identifiers or FLoC? Chrome's market share in the US is 50%, so FLoC would cover only 50% of the total US market. Advertisers want all the scale. https://www.statista.com/statistics/276738/worldwide-and-us-...

dennisy 1975 days ago

I think users using the search engine, email, maps etc in other browsers is hardly a "limited" amount of data for ad targeting.

jefftk 1975 days ago

Sorry, you're right, advertising on Google's own properties is mostly unaffected by browsers removing support for third-party cookies. I was thinking about AdManager and AdSense; ads shown on publisher sites.

jahewson 1976 days ago

According to the specs, the requests are made without user agent headers, leaving only IP address. Targeting ads based on IP address isn't particularly valuable to ad networks if they can't correlate it with anything other than the sandboxed cohort data.

mike_d 1975 days ago

If you give me a demographic group (age, sex, income, etc) of a thousand people, and give me the IP address I can uniquely identify the individual within that group using outside data sources like Experian.

jefftk 1975 days ago

> and give me the IP address

The Chrome proposal is that it won't: https://github.com/bslassey/ip-blindness

mike_d 1975 days ago

What insane rambling is this? Every site will be forced to use an approved CDN? Adding forced MitM to every connection is the opposite of what we should be trying to implement.

jefftk 1975 days ago

If you want to prevent fingerprinting, you need to look at where the identifying bits are coming from. (ex: https://coveryourtracks.eff.org/) The IP address provides enough bits to uniquely identify many users, and when combined with just a few more bits, to identify almost anyone.

Tor is one solution here, which you could potentially also describe as "adding forced MitM to every connection". The proposals in https://github.com/bslassey/ip-blindness/blob/master/near_pa... and https://github.com/bslassey/ip-blindness/blob/master/willful... have different tradeoffs than Tor, with the "Tor is painfully slow" problem being a big one.

If you have better ideas, though, I would be very interested in reading them!

jahewson 1976 days ago

> if you have 1000 categories and some meta information (region based on IP address etc.), you will be able to track down individual users with pretty good accuracy.

Looking at the corresponding TURTLEDOVE proposal, it's sending only a handful of the known categories to any given ad network at any given time. FLoC also claims that:

> The collection of cohorts will be analyzed to ensure that cohorts are of sufficient size

btown 1975 days ago

Browser fingerprinting is already pretty good if you can run arbitrary JS on a site. Add access to a FLoC cohort, even a cohort with 10k people, and you're basically at a place that's worse than third-party cookies were, because at least third-party cookies could be blocked. Ad networks are already using fingerprinting, and they will see this as a blessing.
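
A toy illustration of why adding a cohort ID to an existing fingerprint shrinks anonymity sets. The visitor records are entirely made up, and `unique_count` is a hypothetical helper, so this only shows the mechanism, not real-world rates.

```python
from collections import Counter

# Toy, made-up visitors: (browser fingerprint hash, cohort ID).
visitors = [
    ("fp_a", 101), ("fp_a", 101), ("fp_a", 202),
    ("fp_b", 101), ("fp_b", 303),
    ("fp_c", 202),
]


def unique_count(keys):
    """Number of visitors whose key is shared with nobody else."""
    counts = Counter(keys)
    return sum(1 for k in keys if counts[k] == 1)


by_fingerprint = unique_count([fp for fp, _ in visitors])
by_both = unique_count(visitors)

print(f"unique by fingerprint alone:    {by_fingerprint} of {len(visitors)}")
print(f"unique by fingerprint + cohort: {by_both} of {len(visitors)}")
```

In this toy set, visitors who share a fingerprint are split apart as soon as their cohorts differ, which is the effect being described.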

alkonaut 1975 days ago

If browsers blocked some edge-case features, such as rendering to canvas and reading the data back, it would be much more difficult. Browser JS environments just expose way, way too much entropy from the user's system.

dillondoyle 1975 days ago

You'd have to get rid of a ton of modern features and somehow backfill / update all browsers to a set of constants:

- audio waveform generation
- access to GPU/WebGL info
- have to somehow dramatically change or remove ICE/WebRTC
- standardize 'feature flags', e.g. somehow backfill old browsers so they all show support for new JS objects
- access to only a small set of fonts
- somehow make rendering completely the same across browsers, or reduce measurement/rendering to like 5px increments or something, e.g. bounding rect of (747744.888 some two character specific font or some svg/css transform etc)
- testing for a ton of CSS extensions
- supported MIME types
- a bunch of SVG things (I don't think this has been explored much; I have a hunch there are some good targets)
- a bunch of latency hacks

and more...

alkonaut 1975 days ago

Things like string measurement are indeed tricky. Audio generation or reading back raster data simply shouldn’t be possible by default. I’d be happy to enable that on a per-site basis, like pop-ups.

ssss11 1975 days ago

It's a classic battle over intent, with the blame misdirected at the tools. The problem isn’t the tools, it’s the intent.

8bitsrule 1975 days ago

When cookies first appeared, my first response to someone pushing them was: you want to save data for your purposes? Save it on your own damned machine, I don't want it on mine. Of course they're 'abused', that was the whole intent.