by lisper 1831 days ago
Wow, the bias in this article is unbelievably blatant:

"[Amazon is] preventing Google’s tracking system FLoC — or Federated Learning of Cohorts — from gathering valuable data reflecting the products people research in Amazon’s vast e-commerce universe"

Compare with, e.g.:

"Amazon is taking steps to protect its users' privacy by blocking Google's heavy-handed overreach in leveraging its Chrome browser to spy on users' personal shopping habits and sell that information to other retailers".

(Note: I'm not saying my rewrite is unbiased. It's not. It's just biased in a different direction to highlight the contrast.)


Yeahhhh, but Amazon makes a ton off their own ad business and is trying to turn everyone's personal devices into a mesh network they own. They don't give af about user privacy.
> They don't give af about user privacy.

That part seems to be the only universal truth these days.

IMO these two things are compatible. Their mesh network is incredibly gross but it's not a privacy violation, it's bad in other ways.
It's almost guaranteed to be a privacy violation unless you think Amazon can write complicated yet bug-free networking code.
I’m not sure about the privacy part, but they do have very good success with AWS, which I’m sure includes loads and loads of network code.
Amazon has some top notch mesh engineers, I know this personally. I highly doubt their talent is being used on this mesh effort, sadly.
They very much do. When is the last time you heard about any private data leak from Amazon?
So you're suggesting that as long as our personal information is in their hands and is utilised for maximizing profits but hasn't leaked, we shouldn't worry about privacy?
Security != Privacy
That suggests that they are keeping it safe, not that they are not storing or using it.
Isn't FLoC on-device? So 'gathering valuable data' would be users' own devices doing so, right?
It's pretty complicated, and my understanding could be wrong; I'm definitely not an expert. All the stupid CIA-style names that keep changing don't help: Turtledove, FLEDGE, SPARROW, lol.

From what I think I know, that's kind of right technically, but kind of not in terms of actual privacy.

Yes, the actual browsing data (for the basic FLoC cohorts, just which Amazon product pages you visited) is no longer 'sent' to ad networks (that's a pretty big oversimplification of how ad networks track you, but it will do for brevity). That data is processed in your browser to generate a cohort ID for you.
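
To make the "processed in your browser" step concrete, here is a rough sketch of how a list of visited domains could be squashed into a short cohort ID with a SimHash-style locality-sensitive hash. As far as I know the origin trial used something in this family over visited domains, but treat everything here as illustrative: the function name `simhash_cohort`, the 8-bit width, and the example domains are made up, and Chrome's real algorithm differs in its details.

```python
import hashlib


def simhash_cohort(domains, bits=8):
    """Illustrative SimHash: similar domain sets tend to get similar IDs.

    Hypothetical helper, not Chrome's implementation.
    """
    counts = [0] * bits
    for domain in set(domains):  # each distinct domain votes once
        digest = hashlib.sha256(domain.encode("utf-8")).digest()
        value = int.from_bytes(digest[:8], "big")
        for i in range(bits):
            # Each domain votes +1 or -1 on every output bit.
            counts[i] += 1 if (value >> i) & 1 else -1
    cohort = 0
    for i, count in enumerate(counts):
        if count > 0:  # majority vote decides the bit
            cohort |= 1 << i
    return cohort


# Made-up browsing history, for illustration only.
history = ["amazon.com", "news.ycombinator.com", "example.org"]
print(simhash_cohort(history))
```

The raw history never leaves the function; only the small integer does, which is the sense in which the data gathering happens "on-device".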

But this cohort ID is exposed to the world via document.interestCohort(), and that is what's used for targeting and tracking.

To me it seems that the cohorts are so small ("thousands of people") that a cohort ID plus IP or UA is basically the same as a semi-long-lasting UUID.

And if you have even 10 different cohort IDs, even if some of them are 'fake'/'noise', that's probably enough to identify you alone.
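
A back-of-the-envelope way to check the "basically a UUID" intuition is to count bits. The sketch below assumes a made-up browser population of 3 billion and a cohort of 5,000 people (one reading of "thousands"); both numbers and both helper names are assumptions for illustration, not figures from Google.

```python
import math


def bits_to_single_out(population):
    """Bits of information needed to pick one person out of a population."""
    return math.log2(population)


def bits_left_after_cohort(cohort_size):
    """Bits still needed once the cohort is known.

    Knowing the cohort narrows the candidates to cohort_size people.
    """
    return math.log2(cohort_size)


population = 3_000_000_000  # assumed number of browser users
cohort_size = 5_000         # assumed size of one cohort

total = bits_to_single_out(population)
remaining = bits_left_after_cohort(cohort_size)
revealed = total - remaining  # what the cohort ID alone gives away

print(f"needed: {total:.1f} bits")
print(f"cohort ID reveals: {revealed:.1f} bits")
print(f"still needed from IP/UA/etc.: {remaining:.1f} bits")
```

Under those assumptions the cohort ID leaves only around 12 bits to find, so any extra signal worth that much (an IP address, a distinctive UA string) would in principle be enough to single someone out.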

Here's an image from Google's site.

https://web-dev.imgix.net/image/80mq7dk16vVEg8BBhsVe42n6zn82...

It also seems like Chrome/Google might still be defaulting browser settings to give themselves even more data, just like they currently do?

https://github.com/WICG/floc#qualifying-users-for-whom-a-coh...

BUT when you layer on the other proposals (FLEDGE/Turtledove/Dovekey or whatever), which I don't understand that well (maybe someone else can explain), it seems like they basically collect this page/product-level data and make it available to DSPs etc. for tracking/ad serving (again, if not technically 1:1, then basically so in consequence, given the sizes of these groups).

Like one of the proposals talks about a 'trusted' key/value server which doesn't seem that different from what already happens? The original proposal wanted to move the entire ad bid/target/serve process into the browser.

The point of FLoC is that you are only ever part of one cohort. There's no combining of the different cohorts that a user is in, because there is only ever one for each user. Now, there is some legitimate discussion on how to handle changes to cohorts, since simply changing the user's cohort ID in response to a change in their browsing interests leaves the user open to a set intersection attack. Some people have suggested options such as freezing the ID for the lifetime of the site's state to prevent it.
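
For anyone unfamiliar with the set intersection attack mentioned above, here is a toy simulation. It assumes an observer who can see every candidate's cohort each week, and it assigns cohorts uniformly at random, which is a deliberately crude model (real cohorts track interests, so they would change far less from week to week). The function name and all parameters are hypothetical.

```python
import random


def intersection_attack(num_users, num_cohorts, weeks, target=0, seed=1):
    """Return how many candidates remain for `target` after each week.

    Toy model: every user gets a fresh random cohort each week, and the
    observer keeps only the users who shared the target's cohort every time.
    """
    rng = random.Random(seed)
    candidates = set(range(num_users))
    sizes = []
    for _ in range(weeks):
        assignment = [rng.randrange(num_cohorts) for _ in range(num_users)]
        observed = assignment[target]  # cohort ID the target exposes this week
        same_cohort = {u for u in range(num_users) if assignment[u] == observed}
        candidates &= same_cohort      # intersect with earlier observations
        sizes.append(len(candidates))
    return sizes


print(intersection_attack(num_users=10_000, num_cohorts=100, weeks=4))
```

The candidate set can only shrink, and the target is always in it, which is why freezing the ID per site (so there is nothing new to intersect) has been floated as a mitigation.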

FLEDGE/Turtle*/etc. is a different issue. I'm not sure it will be more private than 3rd party cookies since the spec is not very clear and it has so many moving parts. I have heard from some Chrome devs that if it doesn't end up better for privacy than 3rd party cookies, it won't get past the origin trial stage.

Ah that makes a bit more sense thank you for that info.

The docs/images they use make it look like an array, but I just read the origin trial info page and it says document.interestCohort() only returns a cluster ID and an algorithm version ID.

Still, I think the point stands. Even with, say, 1 million people in one cohort ID (they use 'thousands' to describe it), adding IP + UA makes it pretty unique, at least until Apple and others start proxying everything, as recent posts suggest. Add whatever 8 bits (or however many) of privacy allowance entropy and it's probably very unique and trackable over time if you have, say, TTD scale.

Totally! It's very, very confusing, and I don't understand some (OK, maybe a lot, lol) of the RTB/context/retargeting proposals. Multiple RTB stakeholders have submitted their own too, and they all have really stupid, confusing names. But what I gather is that it's basically the same result. It feels like the only way to do similar retargeting and conversion tracking is to have one 'trusted' source that gets all the data.

Does it matter whether the code Google wrote to do it executes on your device or on their servers? In the end they try to group people based on their Amazon browsing behavior, and Amazon doesn't want that. Nor should any sane user want that, and Google knows it; that's why it's opt-out instead of opt-in.

Thank god they figured out it is illegal in Europe to do this without opt-in and didn't roll out FLoC here...