by leni536 3820 days ago
Well, I block GoogleAnalytics with uBlock and uMatrix.

Unfortunately people have started putting the tracking server side.
Tracking server-side is fine; it is assumed the server logs contain a record of my visits, and I have no problem with that. I think what most people object to is the third-party tracking that so many sites use.

Company A tracking my visits to Company A's website = OK

Company A using Google Analytics to track my visits (while also enabling Google to track me across multiple sites) = Not OK

EDIT: (replying here as we've reached max comment depth) - I was unaware that it is possible to use Google Analytics server-side only (is this true?), but I hope my original point is still clear: DIY tracking is fine.

I agree with your position. However...

You misunderstand. There have been several commentators here on HN saying that they are moving Google Analytics server side. They seem to think that people are only objecting to the cookie or the presence of the JS rather than objecting to the pervasive cross-site tracking.

In that case, f that and f them. Do not track me.
Are you aware that Google provides a method to do this that doesn't rely on random script blocking? Details here: https://tools.google.com/dlpage/gaoptout
Why would I trust Google with this? I'd rather do it myself.
Just a little HN tip. If you click the time of the comment you can reply to it even if there is no direct reply link.
Yes. GA has server-side APIs available to premium accounts.

You can also just host the ga.js file yourself. Or run a reverse proxy or any of a dozen other methods to collect data and pass it to GA. Using the standard 3rd party tag is just for convenience.

FYI, server-side APIs don't require premium accounts. :)

I'm pretty sure it's the same mechanism used for mobile/app non-browser tracking.

How can GA correlate between sites, then? The server side does not have access to my GA cookie. Browser fingerprinting?
Yes, cookies are outdated and just a fallback. Also, unless you never go to a google-owned domain name, you'll be cookied regardless.
You mean Google Analytics tracking on the server side? Please expand on that; I'm not well versed in all these marketing spy modules. Do you mean that some internet shop (or blog or whatever) makes a request to GA or some similar service to share that I was at their website? If so, what information do they share? My IP, cookies, or what? I always assumed that the very point of GA was outsourcing user tracking to some other service (Google), which could try to guess who I am based on Flash cookies and my appearing on other websites with GA. But how would that work server-side?
Even just your UA string is enough in most cases to make educated guesses. See here: https://www.eff.org/deeplinks/2010/01/tracking-by-user-agent... . The server will get that UA string, and it can make subsequent calls (or serve you content that will automatically make calls, like hidden <img> tags...) to further restrict the search space. You can have middleware that does this transparently.
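To make the "restrict the search space" idea concrete, here is a minimal sketch of how a server could derive a stable identifier from attributes it sees on every request anyway. The choice of headers, the `fingerprint` helper, and the sample values are all assumptions for illustration; this is not how any particular analytics product is known to work.

```python
import hashlib

def fingerprint(headers: dict, ip: str) -> str:
    """Hash a few passively observable request attributes into one ID."""
    # Assumed attribute set; a real tracker may use more or different signals.
    parts = [
        ip,
        headers.get("User-Agent", ""),
        headers.get("Accept-Language", ""),
        headers.get("Accept-Encoding", ""),
    ]
    joined = "\x1f".join(parts)  # unit separator avoids ambiguous concatenation
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()[:16]

# Hypothetical requests; the IP is from a documentation-only range.
visit_a = fingerprint({"User-Agent": "ExampleBrowser/1.0", "Accept-Language": "en-GB"}, "203.0.113.7")
visit_b = fingerprint({"User-Agent": "ExampleBrowser/1.0", "Accept-Language": "en-GB"}, "203.0.113.7")
visit_c = fingerprint({"User-Agent": "ExampleBrowser/1.0", "Accept-Language": "de-DE"}, "203.0.113.7")

print(visit_a == visit_b)  # identical attributes give the same ID
print(visit_a == visit_c)  # one differing attribute gives a different ID
```

No cookie or client-side script is involved here, which is why this could run transparently in middleware.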

I'm not in that particular market, but I know people who are and tbh more often than not I think it's an arms race the individual simply cannot win. Unless there's a conscious effort from browser-makers to actively counter tracking practices, you should assume everything you do on the web is public and can be tracked by multiple parties.

But the question is whether GA and others actually accept this kind of request: remember that someone with such and such UA (or IP, or whatever) has visited that website? And do people actually use it? I still have my doubts that tracking someone by UA is possible (there will be collisions for a large part of the market), but is some analytics service actually doing it? It's easy to track me if Google can "reach" the client side when I visit some website: they can use cookies, all the HTTP request data, maybe even Flash cookies. It's a no-brainer to track an individual with information like this. But guessing who is who just by UA? That doesn't seem trivial, so I wonder if they really do that.
Not just the UA, but there are ways: https://panopticlick.eff.org/
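The collision worry can be put in numbers. A rough sketch, using a made-up sample of eight visitors (not real measurements): the fewer people who share an attribute value, the more bits of identifying information it carries, and combining attributes shrinks the crowd further. The `surprisal_bits` helper is a name invented here.

```python
import math
from collections import Counter

def surprisal_bits(value, observed):
    """Bits of identifying information carried by seeing `value`."""
    counts = Counter(observed)
    p = counts[value] / len(observed)
    return -math.log2(p)

# Hypothetical sample of 8 visitors, purely illustrative.
uas = ["A", "A", "A", "A", "B", "B", "C", "D"]
langs = ["en", "en", "de", "fr", "en", "en", "en", "en"]

# UA alone: half the sample shares "A", so it reveals only 1 bit.
print(surprisal_bits("A", uas))  # 1.0

# UA plus language: ("A", "de") is unique among 8, so 3 bits.
combined = list(zip(uas, langs))
print(surprisal_bits(("A", "de"), combined))  # 3.0
```

So a UA that collides with half the market on its own can still single someone out once a second or third attribute is added.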
Yes. It's all just data in the end. JavaScript can handle collecting all the information apart from setting cookies. But cookies are outdated and just a fallback now, so all you need is for the JavaScript to run.

This can be as simple as hosting a copy of GA.js yourself but there are plenty of options like using the server-side API if you have GA enterprise or just using a reverse-proxy like Nginx with some rewriting logic.

3rd-party only means it's a different domain (with security usually implemented at the browser level) - it's not some magical wall of isolation.

Oh well. Indeed, I skimmed through their server-side tracking API: https://developers.google.com/analytics/devguides/collection...

That's unsettling.
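For a sense of what such a server-to-Google call might look like, here is a sketch that only builds the request body and sends nothing. The endpoint and the parameter names (`v`, `tid`, `cid`, `t`, `dp`, `uip`, `ua`) follow the Universal Analytics Measurement Protocol as best recalled, so treat them as assumptions and check the linked docs; that protocol version has also since been retired. The tracking ID, IP, and UA values are placeholders.

```python
from urllib.parse import urlencode
import uuid

# Assumed endpoint for the (now retired) Universal Analytics Measurement Protocol.
COLLECT_URL = "https://www.google-analytics.com/collect"

def build_pageview_hit(tracking_id, client_id, path, visitor_ip, visitor_ua):
    """Return a form-encoded body describing one pageview, built server-side."""
    params = {
        "v": "1",             # protocol version
        "tid": tracking_id,   # the site's property ID
        "cid": client_id,     # anonymous client ID chosen by the server
        "t": "pageview",      # hit type
        "dp": path,           # page path that was visited
        "uip": visitor_ip,    # visitor's IP, forwarded by the server
        "ua": visitor_ua,     # visitor's User-Agent, forwarded by the server
    }
    return urlencode(params)

body = build_pageview_hit(
    "UA-XXXXX-Y", str(uuid.uuid4()), "/pricing", "203.0.113.7", "ExampleBrowser/1.0"
)
print(body)  # this would be POSTed to COLLECT_URL; not sent here
```

The point for this thread is the last two parameters: the visitor's browser never contacts Google, yet the server can pass along the visitor's IP and UA itself.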

Every time you access a website, a server is serving you files. Apache (and most web servers) keeps logs of this. With Apache defaults you get the IP address, the route accessed, and the User-Agent of the user. This is rudimentary information, but if you have these logs from multiple sites, it's pretty easy to roughly track someone. Tracking images in emails use the same principle: a unique link to krick.png is put in an email sent to you, and if it gets served by the server (shows up in the access logs), it's pretty reasonable to assume that you read the email.

If you want to see a simplified version of what this log looks like, run 'python -m SimpleHTTPServer' (or 'python3 -m http.server' on Python 3) and visit localhost:8000.
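The "logs from multiple sites" point can be sketched as follows, assuming both sites log in Apache's "combined" format (the one that includes the User-Agent). The site names, log lines, and the `visitors_by_site` helper are made up for illustration.

```python
import re
from collections import defaultdict

# Apache "combined" format:
# ip ident user [time] "method path proto" status size "referer" "user-agent"
LINE_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

def visitors_by_site(logs):
    """Map (ip, user-agent) -> set of sites whose logs contain that pair."""
    seen = defaultdict(set)
    for site, lines in logs.items():
        for line in lines:
            m = LINE_RE.match(line)
            if m:  # silently skip lines that do not parse
                seen[(m["ip"], m["ua"])].add(site)
    return seen

# Hypothetical access logs from two unrelated sites.
logs = {
    "shop.example": [
        '203.0.113.7 - - [10/Oct/2015:13:55:36 +0000] "GET /cart HTTP/1.1" 200 512 "-" "ExampleBrowser/1.0"',
    ],
    "blog.example": [
        '203.0.113.7 - - [10/Oct/2015:14:02:11 +0000] "GET /post/1 HTTP/1.1" 200 2048 "-" "ExampleBrowser/1.0"',
        '198.51.100.9 - - [10/Oct/2015:14:03:00 +0000] "GET / HTTP/1.1" 200 1024 "-" "OtherBrowser/2.0"',
    ],
}

# Report every (ip, ua) pair that shows up on more than one site.
for key, sites in visitors_by_site(logs).items():
    if len(sites) > 1:
        print(key, sorted(sites))
```

This only works for whoever holds both sets of logs, which is exactly the cross-site compilation being objected to in this thread.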

Have you even read my message? Or the thread you are replying to, for that matter? The question is not how a website owner knows I visited his website; that much is pretty obvious. The question is whether server-side tracking somehow allows the use of Google Analytics as well (that is, notifying Google from the server side about who has visited the website) and, if so, how exactly it works. Because that's what JupiterMoon seems to be claiming.
Sure that's possible, http://stackoverflow.com/questions/9503329/is-there-any-way-...

Of course people can (and do) sell their server logs to third parties anyway...

Are we agreed that my claim was valid? Thank you for digging up a primary source on it btw!
Yes, fine, I know this. My objection is that this data is starting to be compiled on a cross-site basis.
That's fine by me. I can't control what one does on the server side.