Hacker News
by teamjimmyy 4709 days ago
I don't get it. Why is this different from inspecting your web logs? Sure you lose the first-party cookie aspect, but I bet you can get awful close just looking at the request IPs. There's "tracking" inherent in how everything works, so why does it matter if collection is contracted to a 3rd Party?

Does the poster expect the web server to not write a log line because he sent a DNT header too?

2 comments

IP logs aren't sufficiently unique: my IP changes as I move my laptop around, it is shared with several other persons at work and home, and my IP at each of these locations changes.

Most DNT is concerned with JavaScript, which can be far more intrusive than mere web logs. Analytics services started with web logs but quickly transitioned to JavaScript, because I can track a cookie much better than an IP address, and get more information besides.

It's inherently different when contracted to a 3rd Party.

Third-party vendors are opposed because using one is the equivalent of giving all of the IP logs from a majority of the Internet to a single party (in this case, Google Analytics). Discovering trends on particular users then becomes possible at a scale that simply doesn't exist with 1st Party tracking. The siren's call to monetize this data is ever present, so we seek to not allow the collection in the first place.

I'll say here what I said in the other reply, but briefly.

There's a difference between a 3rd party doing the analytics and a 3rd party cookie. GA can (and should) use a 1st party cookie for this, which would make it impossible for them to correlate between sites. As a bonus, turning off 3rd party cookies also breaks ad retargeting, which makes everything better.

At that point, it's the same as Mozilla doing it themselves, but your concern about JS being potentially more intrusive is valid.

note: i may be wrong about GA using 1st party cookies. if so, that's really not cool.

GA does use 1st party cookies. There is still concern that, with sufficient statistical analysis, Google can track users across multiple sites. "Anonymous" data frequently turns out to be very personally identifying.

In particular, comparing behaviors and IP addresses used in Google products and captured in Google Analytics would be very easy.

Likewise, Google knows a super-majority of site entrances from their search engine, and a correlation is trivial given that most users are logged in for search. To wit: if I perform a search with a unique referrer, and that unique referrer is then captured with my Google Analytics user cookie, then I can be readily identified as a person. Doubleclick and other Google services share this issue.
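To make the referrer correlation concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the record fields (`account`, `results_url`, `cookie_id`, `referrer`), the `.example` domains, and the function name are invented for illustration, and nothing here is a claim about how Google actually stores or joins its data. It only shows that a referrer seen by exactly one logged-in account is enough to tie an analytics cookie to that account.

```python
from collections import defaultdict

# Hypothetical search-side log: which logged-in account saw which results URL.
search_log = [
    {"account": "alice", "results_url": "https://search.example/?q=gluten+free+cake+zip+90210"},
    {"account": "bob", "results_url": "https://search.example/?q=cake"},
    {"account": "carol", "results_url": "https://search.example/?q=cake"},
]

# Hypothetical analytics-side hits: the 1st party cookie plus the referrer it arrived with.
analytics_hits = [
    {"site": "bobscakes.example", "cookie_id": "ga-111",
     "referrer": "https://search.example/?q=gluten+free+cake+zip+90210"},
    {"site": "bobscakes.example", "cookie_id": "ga-222",
     "referrer": "https://search.example/?q=cake"},
]


def link_cookies_to_accounts(search_log, analytics_hits):
    """Link an analytics cookie to an account when its referrer is unique to that account."""
    accounts_by_referrer = defaultdict(set)
    for row in search_log:
        accounts_by_referrer[row["results_url"]].add(row["account"])

    linked = {}
    for hit in analytics_hits:
        accounts = accounts_by_referrer.get(hit["referrer"], set())
        # Only a referrer seen by exactly one account identifies a person.
        if len(accounts) == 1:
            linked[hit["cookie_id"]] = next(iter(accounts))
    return linked


links = link_cookies_to_accounts(search_log, analytics_hits)
print(links)
```

In this toy data the common query "cake" is shared by two accounts and links nothing, while the unusual query pins one cookie to one account. A real join would presumably also need timestamps to narrow things down.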

Others do use Third Party Cookies. Mozilla is threatening to turn off 3rd Party cookies entirely, which has caused no small amount of concern from ad companies. See this post, one in a series of hilariously over the top diatribes from the Interactive Advertising Bureau: http://www.iab.net/iablog/2013/06/mozilla-kangaroo-cookie-co...

Yeah, I saw the bit about turning off all 3rd party cookies, which made me happy as I already do that myself.

As for the ubiquity and potential for data sharing among Google services, I suppose I hadn't thought that entirely through. I know there was one analytics company claiming it could track individuals between devices using some fancy statistics, but I assumed it was snake oil (it was not GA claiming that).

Anyway, I hear ya, and thanks. I can see a case against GA specifically, though I have a hard time swallowing it against all analytics. I suppose it's a question of trade-offs that people are willing to make.

One of the big differences between 1st-party and 3rd-party tracking is that Bob at Bob's Cakes can only see what you're doing on Bob's site (1st-party tracking), but if Bob uses Google Analytics, and so does Jane, and Sarah, then Google Analytics (3rd-party) knows about your activity _across_ Bob's, Jane's, and Sarah's sites, which can potentially be used in worse/more invasive ways.

Also, the JavaScript tracking scripts can capture a lot more information than a simple access log line - they're not directly comparable.

This isn't strictly true, which is why I made the differentiation above between 1st and 3rd party cookies. With the 1st party cookie you'd get a new GA cookie on each site (e.g. mozilla-GA, ycombinator-GA, etc), making those correlations impossible. In the case of 3rd party cookies, yeah, I totally get that they can be used for some seriously evil things.
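A toy model of the cookie scoping, as a rough sketch in Python. The `Browser` class, the `.example` domains, and the ID format are all made up for illustration; real browsers and real analytics scripts are far more involved. The only point is that a cookie is keyed by the domain that owns it, so a 1st party cookie yields a fresh ID per site while a 3rd party cookie yields one ID everywhere.

```python
import itertools


class Browser:
    """Toy cookie jar: one visitor ID per cookie domain (hypothetical model)."""

    def __init__(self):
        self.cookie_jar = {}  # cookie domain -> visitor ID
        self._ids = itertools.count(1)

    def _cookie_for(self, domain):
        # Create the cookie on first use, then keep returning the same value.
        if domain not in self.cookie_jar:
            self.cookie_jar[domain] = f"id-{next(self._ids)}"
        return self.cookie_jar[domain]

    def visit(self, site, tracker_domain, third_party):
        # 1st party: the cookie lives on the visited site's own domain.
        # 3rd party: the cookie lives on the tracker's domain, shared across sites.
        cookie_domain = tracker_domain if third_party else site
        return {"site": site, "visitor_id": self._cookie_for(cookie_domain)}


sites = ["mozilla.example", "ycombinator.example", "bobscakes.example"]

first_party_browser = Browser()
fp_hits = [first_party_browser.visit(s, "tracker.example", third_party=False) for s in sites]

third_party_browser = Browser()
tp_hits = [third_party_browser.visit(s, "tracker.example", third_party=True) for s in sites]

fp_ids = {h["visitor_id"] for h in fp_hits}
tp_ids = {h["visitor_id"] for h in tp_hits}
print(len(fp_ids), "distinct IDs with 1st party cookies")
print(len(tp_ids), "distinct ID with a 3rd party cookie")
```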

It's possible GA could try to correlate IPs or browser fingerprints between 1st party cookies over multiple sites, but proxies and mobile devices would make that difficult. The fact that all the data is together in GA's warehouse doesn't change the fact that the data isn't there to be correlated.
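For what that kind of correlation might look like, here is a minimal Python sketch under heavy assumptions: the hit records, field names, and the choice of (IP, user agent) as the "fingerprint" are all invented for illustration, and this is not a description of anything GA is known to do. It groups per-site cookie IDs that share the same fingerprint.

```python
from collections import defaultdict

# Hypothetical hits, each carrying a different 1st party cookie per site.
hits = [
    {"site": "mozilla.example", "cookie_id": "moz-1", "ip": "203.0.113.7", "user_agent": "UA-A"},
    {"site": "ycombinator.example", "cookie_id": "yc-9", "ip": "203.0.113.7", "user_agent": "UA-A"},
    # Same person later on a phone: different IP and user agent, so nothing links.
    {"site": "bobscakes.example", "cookie_id": "bob-4", "ip": "198.51.100.23", "user_agent": "UA-B"},
]


def group_by_fingerprint(hits):
    """Return sets of cookie IDs that share an (IP, user agent) pair."""
    groups = defaultdict(set)
    for hit in hits:
        groups[(hit["ip"], hit["user_agent"])].add(hit["cookie_id"])
    # A group of one cookie links nothing, so only keep groups of two or more.
    return [ids for ids in groups.values() if len(ids) > 1]


linked_groups = group_by_fingerprint(hits)
print(linked_groups)
```

The same code also shows the weakness: a shared office IP with identical browsers would merge different people, and a phone on a mobile network would split one person into several.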

As for JS being able to be more intrusive, sure, I get that. At that point, I suppose you have to trust the site you're on that they wouldn't use a service that was intrusive. Perhaps this is a bridge too far for some, which is reasonable.

I guess I just don't get wanting to ban the tool entirely when it could be, but is not currently being, used nefariously. (working on the assumption that if GA started fingerprinting browsers someone would've seen the traffic by now. it's not easy to hide.)