by abdullahkhalids 2016 days ago
> How can Plausible Analytics count unique visitors without cookies?

> So if you don’t use cookies how do you count the number of website visitors and report on metrics such as the number of unique users?

> Instead of tagging users with cookies, we count the number of unique IP addresses that accessed your website. Counting IP addresses is an old-school method that was used before the modern age of JavaScript snippets and tracking cookies.

> Since IP addresses are considered personal data under GDPR, we anonymize them using a one-way cryptographic hash function. This generates a random string of letters and numbers that is used to calculate unique visitor numbers for the day. Old salts are deleted to avoid the possibility of linking visitor information from one day to the next. We never store IP addresses in our database or logs.

...

> In our testing, using IP addresses to count visitors is remarkably accurate when compared to using a cookie. Total unique visitor counts were within 10% error range with IP-based counting usually showing lower numbers.

From here: https://plausible.io/blog/google-analytics-cookies#can-you-g...
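
For illustration, the scheme quoted above could be sketched roughly as follows. This is a guess at the general shape, not Plausible's actual code: the choice of SHA-256, the 16-byte salt, and the names `new_daily_salt`, `visitor_id` and `count_unique_visitors` are all assumptions made for the example.

```python
import hashlib
import secrets


def new_daily_salt() -> bytes:
    # Assumption: a fresh random salt is generated each day and the old one discarded.
    return secrets.token_bytes(16)


def visitor_id(ip: str, salt: bytes) -> str:
    # One-way hash of salt + IP; the raw IP is never stored.
    return hashlib.sha256(salt + ip.encode("ascii")).hexdigest()


def count_unique_visitors(ips, salt: bytes) -> int:
    # Unique visitors for the day = number of distinct hashed IPs.
    return len({visitor_id(ip, salt) for ip in ips})


# Four requests from three distinct (documentation-range) addresses.
requests = ["203.0.113.7", "203.0.113.7", "198.51.100.23", "203.0.113.99"]
salt_today = new_daily_salt()
unique_today = count_unique_visitors(requests, salt_today)
```

Because the salt changes daily, the same IP produces unrelated identifiers on different days, which is what prevents linking visits from one day to the next.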

2 comments

A one-way hash of an IPv4 address is no more private than the address itself. If you know the hash algorithm, you can build a rainbow table of all the hashes in under a second. Even with a random salt it doesn't take long to build a rainbow table with all possible salts.
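
A minimal sketch of the point about unsalted hashes, assuming SHA-256 and a hypothetical helper `reverse_hash`. It searches only a /24 to keep the example quick; the full IPv4 space is the same loop over about 4.3 billion (2^32) candidates.

```python
import hashlib
import ipaddress


def hash_ip(ip: str) -> str:
    # Unsalted one-way hash of the textual IP address.
    return hashlib.sha256(ip.encode("ascii")).hexdigest()


def reverse_hash(target: str, network: str):
    # Try every address in the network until one hashes to the target.
    for addr in ipaddress.ip_network(network):
        if hash_ip(str(addr)) == target:
            return str(addr)
    return None


leaked = hash_ip("192.0.2.147")
recovered = reverse_hash(leaked, "192.0.2.0/24")
```

The hash hides nothing here because the input space is small enough to enumerate.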
Doesn't that depend on the size of the salt?
To an extent, but there are easy ways to cut the search space. For example, you could make a unique request with garbage on it from a known IP every day, and then all you have to do is build a rainbow table for that one IP to find out what the salt is for each day, and then you can fully reconstruct the logs.
If the salt is a random 64bit number (for example) then "finding out" the salt is not trivial.
And unless I'm missing something, it seems easy to add plenty of bits to the salt until it's no longer practical to reverse.
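
To put rough numbers on the salt-size question, here is a sketch of the known-IP attack described above, assuming the attacker can see the hash recorded for a request from an IP they control. SHA-256, the salt-then-IP layout, the toy 2-byte salt, and the 1e9 hashes per second rate are all assumptions for illustration.

```python
import hashlib


def hash_ip(ip: str, salt: bytes) -> str:
    return hashlib.sha256(salt + ip.encode("ascii")).hexdigest()


def recover_salt(known_ip: str, observed_hash: str, salt_bytes: int):
    # Enumerate every possible salt of the given length.
    for candidate in range(2 ** (8 * salt_bytes)):
        salt = candidate.to_bytes(salt_bytes, "big")
        if hash_ip(known_ip, salt) == observed_hash:
            return salt
    return None


def years_to_search(bits: int, hashes_per_second: float) -> float:
    # Worst-case time to enumerate a salt of the given bit length.
    return 2 ** bits / hashes_per_second / (365 * 24 * 3600)


# A toy 16-bit salt falls immediately.
secret_salt = (0xBEEF).to_bytes(2, "big")
observed = hash_ip("203.0.113.7", secret_salt)
found = recover_salt("203.0.113.7", observed, 2)

# A 64-bit salt at an assumed 1e9 hashes/second takes roughly 585 years.
years_64 = years_to_search(64, 1e9)
```

Each extra bit doubles the search, so a 128-bit salt is far out of reach for this kind of enumeration. None of this helps against whoever already holds the salt.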
@mattlondon: The salt is known to Plausible; that is the only way anyone can compute the hash.
This would be woefully inaccurate for websites with a large amount of mobile traffic (because of CGNAT), university traffic, etc.
Don't universities have a huge number of IPs because they were the first to use the internet?

Mine gives one public IPv4 address to each device that accesses the internet on the network (with some exceptions). Strategies vary, but if you have a lot of addresses, why not use them?

That might be true for some US universities, but it's definitely not true for the rest of the world.
According to Google, IPv6 traffic is up to 30% these days.