by cousin_it 2427 days ago
Grandparent's statement is pretty absolute, but I find myself in agreement with it. Data collection is the right place to intervene, because once collected, data can be copied and misused at any time in the future.

> when you read this comment you'll have loaded a page on HN. That means HN's server probably has a log of your IP address, browser agent string, etc.

Such logging isn't technically necessary to serve web pages, and ideally shouldn't be done without consent.

> Am I spying on you when I read those pages?

That's not spying, because the user consented to making their comments public. (Not sure about favorites though, there's a small note on the profile page but maybe the favoriting action should make it more explicit.)

> Google Analytics isn't spying on you when it tracks everything you do on 50% of the websites you visit.

It's spying if you didn't consent to it.

3 comments

> Such logging isn't technically necessary to serve web pages, and ideally shouldn't be done without consent.

It's needed as soon as you want to do non-trivial spam protection, context connection for errors/exceptions, DoS mitigation, correlation of issues across browsers, and a few other things.

For most of those you could theoretically hash the IP, because you're interested in matches, not actual values (although matching either the AS or at least the /24 makes things easier). But until we migrate to IPv6, hashing doesn't make sense (and once we move, keeping individual addresses doesn't make sense).

Basically the bigger the site, the more important that information is for operations.
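
To make the hashing idea above concrete, here is a minimal sketch, not anything HN or any particular site is known to run. It truncates an address to its /24 and stores only a keyed hash of that network, so two requests from the same /24 still match. The key matters because the IPv4 space has only 2^32 addresses, so a plain unkeyed hash can be reversed by trying them all, which is one way to read the IPv4 remark. `SECRET_KEY` and `network_token` are hypothetical names made up for this example.

```python
import hashlib
import hmac
import ipaddress

# Hypothetical secret for illustration; a real deployment would generate it
# randomly and rotate it, so old tokens cannot be linked to new ones.
SECRET_KEY = b"example-key-rotate-regularly"


def network_token(ip: str, prefix_len: int = 24) -> str:
    """Return a keyed hash of the network that contains `ip`.

    Only the network (a /24 by default) is hashed, so the token supports
    "same source?" comparisons without storing the address itself.
    """
    # strict=False lets us pass a host address and get its enclosing network.
    network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
    digest = hmac.new(SECRET_KEY, str(network).encode("ascii"), hashlib.sha256)
    return digest.hexdigest()


if __name__ == "__main__":
    a = network_token("203.0.113.7")
    b = network_token("203.0.113.200")
    c = network_token("203.0.114.7")
    print(a == b)  # same /24
    print(a == c)  # different /24
```

Whether this is enough for spam or DoS work depends on how long tokens are kept and how often the key rotates; the sketch only shows that matching does not require the raw address.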

You can do all of those things without logging that information. It’s a cheaper solution to the problem, but that does not mean it’s required.

Which reduces your argument to "collecting this information is significantly more profitable." I think that is generally accepted as true, but it is not necessarily enough to make it acceptable.

How would you match traffic from the same source without keeping the record of that source?
That’s a technique, not a goal. What are you trying to do?
Find when a specific endpoint / AS / country starts sending DoS levels of traffic (or hack attempts), so they can be banned.
Rate limiting prevents a specific IP from causing a successful DoS. You can log higher-level information like country without linking it to a specific user.

In terms of hacking, building a secure site prevents this problem at the source. Banning specific IPs in a world of proxies and public WiFi is almost useless.
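
The rate-limiting claim above can be sketched roughly as follows. This is one possible reading, not a description of any real site's setup. It is a token-bucket limiter that keeps per-IP state only in memory and only while the client is active, and the only thing it records for later is a per-country count of rejected requests. The `RateLimiter` class, its parameters, and the `country` argument are all assumptions for this example; the country code is assumed to come from some lookup done elsewhere.

```python
import time
from collections import Counter


class RateLimiter:
    """Token bucket per client; nothing is written to a log."""

    def __init__(self, rate: float, burst: int, clock=time.monotonic):
        self.rate = rate          # tokens refilled per second
        self.burst = burst        # maximum bucket size
        self.clock = clock        # injectable clock, handy for testing
        self.buckets = {}         # ip -> (tokens, last_seen), memory only
        self.rejected_by_country = Counter()  # aggregate, not per user

    def allow(self, ip: str, country: str = "??") -> bool:
        now = self.clock()
        tokens, last = self.buckets.get(ip, (float(self.burst), now))
        # Refill according to the time elapsed since the last request.
        tokens = min(float(self.burst), tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self.buckets[ip] = (tokens - 1.0, now)
            return True
        self.buckets[ip] = (tokens, now)
        self.rejected_by_country[country] += 1
        return False

    def prune(self) -> None:
        """Forget clients idle long enough for their bucket to refill."""
        now = self.clock()
        idle = self.burst / self.rate
        self.buckets = {
            ip: state for ip, state in self.buckets.items()
            if now - state[1] < idle
        }


if __name__ == "__main__":
    limiter = RateLimiter(rate=1.0, burst=2)
    print([limiter.allow("198.51.100.1", "NL") for _ in range(3)])
    print(dict(limiter.rejected_by_country))
```

Calling `prune()` periodically means an address is held only for about `burst / rate` seconds after its last request, which is the difference between this and a persistent access log. It does not answer the other side's point about correlating traffic across an AS over longer periods.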

It's spying if you didn't consent to it.

There's no explicit consent but the fact you've told your computer to download some code and run it looks a lot like implied consent.

I think that argument proves too much. To a user browsing the web, clicking a link that says "check out this nice article" signifies intention/consent to read that article, not to suffer the effects of all possible JS tripwires including pwning their computer and such.
This is the point I was making about misuse of data. Thinking usage analytics on a website is a tripwire is quite extreme. Thinking that building a complete profile of someone based on their activity on lots of websites is a tripwire is quite reasonable. Hence the difficulty in defining what 'spying' really is.
If by analytics you mean something like a hit counter from the 90s, which doesn't require recording user sessions, then I agree with you. But if it's recording user sessions, I think it's a good idea to require consent for that.
Sure, but all this tracking isn't a product of JS tripwires pwning computers: it's a natural result of downloading an article from a server.
No, it's a result of the article telling the browser to also download and execute analytics scripts. It's abusing the good faith the HTTP protocol was built upon. That's why I consider ad/content blockers OK and desirable. They're a way for users to express that they don't consent to the loading and execution of some resources.
That's true of malware too. Consent is different from actions.
Then it's the misuse itself that needs to be fixed.

It's like saying a knife can be used to kill people, so let's get rid of knives instead.