
> Unless the requested hashes are saved for every single user,

It's 4 bytes per request. Google keeps my whole history, far more than 4 bytes per request, and does it for advertisers. I have no trouble believing a company partially owned by the government could afford to store 4 bytes per request.

Let's say a billion addresses match each 4-byte prefix. You make 2 requests: how many addresses will be on both lists? Let's be generous and say half! Each further request halves the candidates again, and 2^30 is about a billion, so you would only have to visit 30 unique pages on that site for the domain to be identified. How often do you visit 30 different pages of a website? I feel it's quite regularly.

That assumes a billion matches for each 4-byte prefix. In reality it would be far fewer than that, and the overlap between any two prefixes would certainly be much less than 50%.
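To make the arithmetic concrete, here is a rough sketch of the halving estimate above. The function name and the `keep_fraction` parameter are mine, and the model (every request keeps a fixed fraction of the remaining candidates) is only the simplification used in this comment, not a measurement of any real database.

```python
def requests_needed(candidates: int, keep_fraction: float) -> int:
    """Count distinct prefix requests until at most one candidate remains.

    Assumption: each request keeps only `keep_fraction` of the
    candidates that survived the previous requests.
    """
    remaining = float(candidates)
    requests = 0
    while remaining > 1:
        remaining *= keep_fraction
        requests += 1
    return requests


# A billion candidates, half surviving each request (the generous case).
print(requests_needed(10**9, 0.5))
```

If only a quarter of the candidates survive each request, the same function gives half as many requests, which is why the real number is probably well under 30.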

You can do it in reverse even more easily. Go to a bunch of forums whose users you want to silence. Find the URLs used to post on them. Compute their 4-byte prefixes. Now you have a bunch of timestamped comments with usernames, and those prefixes are most likely pretty unique in the Tencent database. Match the request timestamps against the comment timestamps and you have found which IP belongs to which username.
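A minimal sketch of that reverse lookup, under several assumptions: prefixes are the first 4 bytes of a SHA-256 hash of the URL expression (URL canonicalization is skipped here), and the provider keeps a log of `(timestamp, ip, prefix)` per request. The log format, `hash_prefix`, and `match_requests` are all hypothetical names for illustration, not any provider's actual schema or API.

```python
import hashlib


def hash_prefix(expression: str, length: int = 4) -> bytes:
    """First `length` bytes of the SHA-256 digest of a URL expression."""
    return hashlib.sha256(expression.encode("utf-8")).digest()[:length]


def match_requests(target_urls, request_log):
    """Return (timestamp, ip, url) for log entries whose prefix matches a target.

    `request_log` is a hypothetical iterable of (timestamp, ip, prefix) tuples.
    """
    wanted = {hash_prefix(url): url for url in target_urls}
    return [
        (timestamp, ip, wanted[prefix])
        for timestamp, ip, prefix in request_log
        if prefix in wanted
    ]


# Made-up example data: one request hits the forum's posting URL.
post_url = "forum.example/post"
log = [
    (1000, "203.0.113.5", hash_prefix(post_url)),
    (1001, "203.0.113.9", hash_prefix("other.example/")),
]
print(match_requests([post_url], log))
```

Each hit would then be lined up with the public timestamp of a comment to tie an IP to a username.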