by aidanlister 1516 days ago
That’ll give you visits, not uniques.

You could just do "Set-Cookie: visited=true; Max-Age=<interval>". No unique id, but you can still count uniques by checking requests for the lack of that cookie. This cookie is not personal information and cannot be used to identify a person, not even indirectly, and thus needs no consent. This is basically what most of those "cookie banners" do anyway: set a preferences cookie that cannot be linked back to a person, if done properly.
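A minimal sketch of that idea in Python, assuming a one-day interval and an in-memory counter. The function name `handle_request` and the `counter` dict are made up for illustration; a real server would hook this into its own request handling.

```python
from http.cookies import SimpleCookie

INTERVAL_SECONDS = 24 * 60 * 60  # assumed counting interval: one day


def handle_request(cookie_header: str, counter: dict) -> dict:
    """Count a unique when the 'visited' cookie is absent.

    Returns the extra response headers to send (hypothetical helper).
    """
    cookies = SimpleCookie()
    cookies.load(cookie_header or "")
    if "visited" in cookies:
        # Already counted within the interval: no count, no new cookie.
        return {}
    counter["uniques"] = counter.get("uniques", 0) + 1
    # Same value for every visitor, so it carries no identifier.
    return {"Set-Cookie": f"visited=true; Max-Age={INTERVAL_SECONDS}"}
```

The cookie value is identical for everyone, so the server only ever learns "seen in this interval" or "not seen".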

Or if you want to avoid the cookie altogether, you could use some static, cachable resource with a cache expiration date. Basically the good old counting pixel. Almost the same as the non-identifying cookie, except caches are more likely to be automatically evicted by browsers.
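A rough sketch of the cache-based variant, again assuming a one-day interval. `serve_pixel` and `counter` are hypothetical names, and the image body itself is left out. Every request that actually reaches the server is counted, and the cache lifetime keeps the same browser from asking again until it expires (or the entry is evicted).

```python
INTERVAL_SECONDS = 24 * 60 * 60  # assumed counting interval: one day


def serve_pixel(counter: dict) -> dict:
    """Count one hit on the counting pixel and return its response headers.

    Hypothetical helper: the caller sends these headers with a tiny image body.
    """
    counter["uniques"] = counter.get("uniques", 0) + 1
    return {
        "Content-Type": "image/gif",
        # "private" keeps shared caches (proxies, CDNs) from answering
        # on the server's behalf, which would hide other visitors.
        "Cache-Control": f"private, max-age={INTERVAL_SECONDS}",
    }
```

Hard reloads and cleared caches cause recounts, so this is an upper-bound estimate rather than an exact number.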

The only thing that matters about cookies is whether they are necessary, not whether they contain identifying information. Even duration doesn't matter. They should be explained to the user, but consent is not necessary.

Some cookies are even mentioned specifically as allowed. The example given is keeping track of a shopping cart across visits. Do that, and you have your uniques. While hinted at, it does not specifically mention those have to be session cookies: you could have a banner with "accept cookies", then use session cookies whether or not accept is pressed. It even seems to be common practice to hide explanations behind a "more info" button.

https://www.privacypolicies.com/blog/eu-cookies-directive/

I'm pretty sure "uniques" stats don't require you to violate the EU cookie directive.

>The only thing that matters about cookies is whether they are necessary, not whether they contain identifying information.

Incorrect, kinda.

The GDPR concerns personal information, and information that can identify people directly (e.g. location data) or indirectly. Indirect identifiers include an "opaque" unique id, which can potentially be linked back to a person, or an IP address, which can potentially be linked back to a person with the help of a court order compelling an ISP to pass subscriber information to a complainant or law enforcement (and that subscriber may live alone).[0] The GDPR does not concern itself with data that is not personal data and cannot be used to identify a person.

The earlier ePrivacy Directive (better known as the "cookie law", although the section concerning "cookies" is only a small part, and does not even mention cookies explicitly) is a vague thing, on the other hand.

Specifically, it says under "Art 5 - Confidentiality of communications" that

"Member States shall ensure that the storing of information, or the gaining of access to information already stored, in the terminal equipment of a subscriber or user is only allowed on condition that the subscriber or user concerned has given his or her consent, having been provided with clear and comprehensive information, in accordance with Directive 95/46/EC, inter alia, about the purposes of the processing. This shall not prevent any technical storage or access for the sole purpose of carrying out the transmission of a communication over an electronic communications network, or as strictly necessary in order for the provider of an information society service explicitly requested by the subscriber or user to provide the service."

Some people therefore say this rules out all non-"necessary" cookies (unless there is explicit consent). However, this is not the intention of the directive, nor how legal experts evaluated it, nor how courts in particular evaluated it. If you followed that maximal reading of the text, you couldn't legally serve anything to a user (as the user's browser might temporarily or permanently store that information without user intervention), couldn't "make" a browser cache anything, and couldn't even store that a user opted out of tracking cookies. Instead, it has to be seen under the "confidentiality" umbrella of that Article, meaning the "information" mentioned has to be information that concerns the user. Non-identifying (neither directly nor indirectly) cookies do not fit that interpretation, and courts have acknowledged that (and because it's the EU and it's vast, some courts went against it too).

The proposed ePrivacy Regulation (successor to the ePrivacy Directive) is meant to make things less vague and simpler, especially with regard to cookies, and explicitly allows anonymous user counting via cookies, among other things. While the ePR has not passed, courts did take notice, and consider it when they evaluate the intent of the lawmakers as it pertains to the still reigning ePrivacy Directive.

>They should be explained to the user, but consent is not necessary.

Correct. You still have to inform people, even if your cookie use is merely "we do not use cookies to track or identify users".

Maybe surprisingly to some, the aforementioned access logs upthread are likely illegal without user consent, because they usually contain users' IP addresses, while the "visited=true" non-identifying cookie is not (in courts with reasonably knowledgeable judges at least).

[0] https://gdpr.eu/recital-30-online-identifiers-for-profiling-...

Yes, it's not the official website, but also yes, it's the same text of the official directive recitals, except on this unofficial website you can properly link it without fuss.

Yes. One of those things is legal to track without permission. The other isn't.
uniques is such a misnomer; let me switch to my phone ... oops, I'm a second unique visitor.
it all depends on if you are a logged in user with a session or not. you can login to an account from any number of devices but you are still only one user in the metrics.
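For illustration, a tiny sketch of that metric: count distinct account ids rather than distinct devices. The `(user_id, device)` pairs and the function name are made up for the example.

```python
def count_unique_users(sessions):
    """Count distinct logged-in users, ignoring how many devices each used."""
    return len({user_id for user_id, _device in sessions})


# Hypothetical session records: one user on two devices, another on one.
sessions = [("alice", "laptop"), ("alice", "phone"), ("bob", "laptop")]
unique_users = count_unique_users(sessions)  # alice is counted once
```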
And you can still get that metric without third party trackers.
They certainly can provide uniqueness to some degree. GoatCounter [1] does that.

[1]: https://www.goatcounter.com

they do not have a kpi on "how long did the visitor stay on the page"
They don't, no. If optimising for that kind of thing is necessary for a business, then that business is in my opinion one that can go away.

It's like how search results are almost entirely rubbish now, because things are optimised for what Google looks for. So similarly, I have no sympathy for sites that need that kind of analytics.

Not a business. It helps to filter out crawlers and bots. It tells me which content is interesting enough to read and worth updating (my personal wiki).
Are crawler access patterns really that non-uniform across pages and large enough to make this a problem? And for crawlers that are not immediately identifiable from the user agent? Are you sure you are not just counting users without JS / a blocker that interferes with your tracking / intermittent connections as bots?

If you really care what users like on your site why not ask them?

Sadly no, but that's a different KPI than "how many visitors you get on your website".
it's quite trivial to create a breadcrumbs system which tracks in the logs a logged in user/session in an app with services like sentry.io
Services like this make it trivial to land in court, because they nudge their customers to collect data under the pretense of error analysis, a valid business interest not requiring consent, and then use the data for behavior analysis and profiling. If, as a user, you can't turn off the latter without damaging the former, you got sold shit and should take your business elsewhere.
the apps i work on don't collect "behavioural data", so i don't have a strong opinion about it. however i think there are some crucial differences here.

1. sentry.io breadcrumbs are just a nicer interface to one's own log messages, and log messages are useful and necessary to have a well functioning app. where do log messages end and profiling/behaviour data begin? that's a rather fuzzy line.

2. even if one "logs" every breath the user takes (probably covered in the ToS), it's still only limited to one app and one service, while cookies are trivial to abuse for cross-site/cross-app tracking both inside and outside a company.

concerning the fuzzy line: log messages shouldn't include personal data (and in Sentry's defense, they are trying to be helpful when it comes to that). Yet many people prefer to throw everything into the logs, arguing that much helps much and debugging without data is horrible. And suddenly the logs become a rich data swamp, and all that is needed is a nicer interface. So a lot of analysis that would otherwise require specific implementations or even user consent instead becomes data analysis of debug logs. That creates more incentive to throw everything into the swamp. And it makes it easier to forget it's personal data: "If it's in the logs, i am not accessing the database i need permission to access." A lot of questionable personal data processing can be moved to the backroom of the backend, but that doesn't make the processing less questionable, it just makes it easier to hide it from those subject to it, making it more illegal. Which is what i am warning about.

EU privacy regulations focus on the purpose of personal data processing. If a company makes a contract with their users that says they log personal data for the purpose of debugging, and then they use it for web analytics, that is not allowed; it's a violation of the contract. And like you stated, many just write consent into the ToS. But let us look at the privacy-friendly case where the users are asked if they agree to other behavior analytics not related to debugging. And suddenly the log interface isn't so nice anymore.

In a perfect world personal data is labeled with the purposes it can be used for.
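A toy sketch of what such labeling could look like, purely as an illustration: `LabeledValue`, `PurposeError`, `read` and the purpose names are all invented for this example and do not correspond to any real library or to any legal standard.

```python
from dataclasses import dataclass


class PurposeError(Exception):
    """Raised when data is requested for a purpose it was not collected for."""


@dataclass(frozen=True)
class LabeledValue:
    value: str
    purposes: frozenset  # purposes the user agreed to, e.g. {"debugging"}


def read(item: LabeledValue, purpose: str) -> str:
    """Return the value only if the stated purpose is on its label."""
    if purpose not in item.purposes:
        raise PurposeError(f"not collected for purpose: {purpose}")
    return item.value


# An IP address logged for debugging only (documentation-range address).
ip = LabeledValue("203.0.113.7", frozenset({"debugging"}))
read(ip, "debugging")        # allowed
# read(ip, "web_analytics")  # would raise PurposeError
```

With a check like this, reusing debug logs for analytics fails loudly instead of happening quietly in the backend.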

If such issues are not relevant to the company you work for, be grateful, and don't take the warnings personally. But by all that is holy to you, don't tell people the log interface is a great substitute for web analytics.

How can you claim it is limited to one app, if section 6.2 of the ToS of the service you and thousands of other companies use to manage logs says you allow them to create aggregations and summaries and distribute them to third parties?