by kccqzy 755 days ago
An algorithm that processes private user data is by itself not invading anyone's privacy. It's clear to me that invasion of privacy only happens when humans look at private user data directly, or look at user data that's not sufficiently processed by an algorithm.

Otherwise, something as simple as a spell checker would be an invasion of privacy because it literally looks at every word in an email you write. That's absurd.

3 comments

At least in my opinion, it makes a big difference where the data lives and where the checking algorithm runs. I wouldn't consider a spell checker a privacy concern as long as it runs locally on my device.
I don't work on email or at Google, but I see two problems.

1) You need to constantly update the spell checker, so each time you tell it "this is a word" or something like that, the data is most likely sent off, and the problem is that this is part of your data. I assume Google does something similar with mail sent to spam and marked as not spam. That is full email redirection and analysis, not partial like in old word processors.

2) I feel AI makes this even harder: you can't just check patterns as simply as before, and you need to check the whole content constantly.

We've had spell/grammar checkers in word processors that worked totally offline for a long time now. They can definitely be improved with a hosted service, but that's by no means necessary and comes with tradeoffs like added latency and losing offline support.
If an algorithm is looking through private stuff and making a decision based on it or is sending signals where the signal depends on the private stuff, then it's pretty much by definition leaking private information.

An algorithm that leaked no private information would not be useful to a business. It would do a bunch of computation and then throw it away. So realistically anything that looks at private information is privacy-relevant.
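
To make that concrete, here is a minimal sketch in Python. The function names (`ad_category`, `constant_policy`) and the keyword lists are made up for illustration; this is not how any real mail provider works, just the smallest example I could think of where the output signal depends on private text.

```python
# Hypothetical example: all names and keywords are assumptions for illustration.

def ad_category(email_body: str) -> str:
    """Pick an ad category from private text; the output depends on the content."""
    text = email_body.lower()
    if "prescription" in text or "diagnosis" in text:
        return "health"
    if "mortgage" in text or "loan" in text:
        return "finance"
    return "general"


def constant_policy(email_body: str) -> str:
    """Ignore the content entirely: leaks nothing, but is useless for targeting."""
    return "general"


emails = ["Your prescription is ready for pickup", "Lunch at noon?"]

# Anyone who sees these signals learns something about each email
# without ever reading it.
signals = [ad_category(e) for e in emails]

# The constant policy emits the same signal for every input,
# so observing it reveals nothing about the content.
constant_signals = [constant_policy(e) for e in emails]
```

No human ever reads the email here, yet whoever observes `signals` can tell which user is probably dealing with a health issue. The only version that reveals nothing is the one whose output ignores the input.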

That includes even just the email headers. To quote the former head of the NSA: "We Kill People Based on Metadata" https://abcnews.go.com/blogs/headlines/2014/05/ex-nsa-chief-...

You can have debates about how much private information should be leaked and for what purposes. But I don't think having a threshold like "it's all private unless another human reads it" is a good way to think about the issue.

An algorithm that denies service, changes ad behavior, etc. based on user content is definitely invading privacy compared to your spell checker case.

The spell checker would also be a massive privacy invasion if it flagged users based on the content of what they wrote.