by danhilltech 1574 days ago
Server-side tracking has been around for a while (indeed this article is dated Nov 15, 2020; and of course, you could argue simply parsing your Apache/nginx logs to get visitor stats has existed forever). The article I think conflates several different pieces.

There's probably a few actual use cases marketers may care about for tagging/tracking/analytics:

1. Simplest: I want to know how many people use my site/app, how many come back, how many are real (not bots), which pages are popular, etc. I'd like to see all this in a nice UI where I can cut and filter the data.

2. Same as #1, but I'd like to do it across devices. Still all within my own site/app, but simply connecting a non-logged in session across desktop and mobile web. Google and FB probably have the largest available dataset on this.

3. I'd like to enrich all this information with data from other sources, for example to target ads, serve ads, etc.

Site owners/marketers then try and tackle these in a few ways, the first 3 equally bad:

1. Just dump a bunch of scripts into your site (GA, FB, Segment, whatever). Pros: easy. Cons: very easily blocked, so your data is super biased.

2. Self host some of these scripts, or CNAME them. Pros: maybe a bit better for performance? Cons: still rather easily blocked with content signatures etc. A nightmare to ensure consistency if self-hosting.

3. Run your own JS that sends events to your server, and then your server fans out to whomever. Pros: much harder to block, and likely quite performant. Cons: it's unlikely your self-built lib is going to give all the same 'features' as GA (features meaning device fingerprinting and so on).

4. Just get everything from HTTP logs. Pros: very performant, can't be blocked. Cons: much more limited data to work with.

Personally, I think #4 is the future (and also where we started 20 years ago). What I don't think anyone is doing yet is relaying that data out to all the other parts of the stack: GA, FB, Mixpanel, whatever. If you could solve both - giving users privacy and performance and giving marketers the same tools they're used to - sounds like a win. You might argue "well we'd be missing a bunch of user data", but you're already missing it with adblockers and iOS privacy features.
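
To make #4 concrete, here is a minimal sketch of pulling basic stats out of access logs. It assumes the standard "combined" log format that nginx and Apache both offer; the `summarize` function and the `BOT_HINTS` list are hypothetical names for illustration, and the bot check is deliberately crude (a real setup would need something much better).

```python
import re
from collections import Counter

# Combined Log Format: ip, identd, user, [time], "request", status, bytes,
# "referer", "user-agent". Assumed here; adjust if your log_format differs.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Hypothetical, very rough bot heuristic based on user-agent substrings.
BOT_HINTS = ("bot", "crawler", "spider", "curl")


def summarize(lines):
    """Count page views per path, approximate visitors, and bot hits."""
    page_views = Counter()
    visitors = set()
    bot_hits = 0
    for line in lines:
        m = LOG_LINE.match(line)
        if m is None:
            continue  # skip lines that are not in the expected format
        agent = m["agent"].lower()
        if any(hint in agent for hint in BOT_HINTS):
            bot_hits += 1
            continue
        page_views[m["path"]] += 1
        # IP + user agent is only an approximation of a "visitor".
        visitors.add((m["ip"], m["agent"]))
    return {
        "page_views": page_views,
        "visitors": len(visitors),
        "bot_hits": bot_hits,
    }
```

Note how little there is to work with (IP, path, referer, user agent), which is exactly the "much more limited data" con above.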

4 comments

> 3. Run your own JS that sends events to your server

If your platform is popular enough, those telemetry endpoints will end up on ad-blocker lists.

Then it is up to you, if you want to do an arms race of obfuscation or just accept it.

totally
1) can be done trivially with first party cookies.

2) you can already tell what device someone is using. If you mean “I want to know if the same person is on different devices”, get them to log in; don’t try to effectively spy while also providing Google etc. with the ability to actually spy.

3) you cannot know how to target ads on a per-user basis unless you are spying on your users. You have no justification that supports a claim to such information.
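
For point 1), a rough sketch of what "first party cookies" could mean in practice: the server reads its own cookie, and issues a random ID if none is present. This uses only the Python standard library; the cookie name `visitor_id`, the one-year lifetime, and the `get_or_assign_visitor` helper are assumptions made up for illustration.

```python
import uuid
from http.cookies import SimpleCookie

COOKIE_NAME = "visitor_id"  # hypothetical cookie name
ONE_YEAR = 60 * 60 * 24 * 365  # assumed lifetime, in seconds


def get_or_assign_visitor(cookie_header):
    """Return (visitor_id, set_cookie_value).

    set_cookie_value is None for a returning visitor, otherwise the
    value to send in a Set-Cookie response header.
    """
    jar = SimpleCookie()
    jar.load(cookie_header or "")
    if COOKIE_NAME in jar:
        return jar[COOKIE_NAME].value, None

    # New visitor: random ID, scoped to this site only (first party).
    visitor_id = uuid.uuid4().hex
    out = SimpleCookie()
    out[COOKIE_NAME] = visitor_id
    out[COOKIE_NAME]["path"] = "/"
    out[COOKIE_NAME]["max-age"] = ONE_YEAR
    out[COOKIE_NAME]["httponly"] = True  # not readable from page JS
    out[COOKIE_NAME]["samesite"] = "Lax"  # requires Python 3.8+
    return visitor_id, out[COOKIE_NAME].OutputString()
```

The ID says nothing about who the person is; it only lets the site count returning browsers without involving a third party.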

Yea, I think we're saying the same thing. Ultimately both the best choice (for privacy, performance etc.) and the one that's most likely (given adblockers and an ever-increasing push for privacy from browsers and OSs) is to stop trying to find a way around adblockers, and simply invest in the technologies that work: HTTP, cookies, sessions, logins, and so on.
I think some of the whiplash in the market isn't just the tit-for-tat battle with ad blockers and regulators but the realization that there's so much useless data being collected. The best data we get is first party (i.e. things people click or type into forms on our sites) or qualitative feedback from surveys. GA and GTM are valuable tools for us but Google's network isn't really.
Yea. Though, GA does (at least) two things: it analyzes your own data, and it uses the data they collect from all their other sites to improve your experience via better bot detection, recommendations, and insights. Google's network is useful, like it or not, for a) their cross-device graph (they know, roughly, which mobile devices and which desktop browsers are the same user) and b) from that, building better MTA (multi-touch attribution) models than you can with pure first-party data, especially if most of your traffic isn't logged in.

But I agree, the future is pointing toward a world where privacy and empowerment are more in the hands of the user, and that's a good thing.