| HN Mirror



	by zepolen 757 days ago
	Question, how would you know without invading the user's privacy?

4 comments

kccqzy 757 days ago

An algorithm that processes private user data is by itself not invading anyone's privacy. It's clear to me that invasion of privacy only happens when humans look at private user data directly, or look at user data that's not sufficiently processed by an algorithm.

Otherwise, something as simple as a spell checker would be an invasion of privacy because it literally looks at every word in an email you write. That's absurd.

_heimdall 757 days ago

At least in my opinion, it makes a big difference where the data lives and where the checking algorithm runs. I don't think a spell checker would fall into what I'd consider a privacy concern as long as the spell checker is running locally on my device.

imachine1980_ 757 days ago

I don't work on email or at Google, but I see two problems.

1) You need to constantly update the spell checker, so each time you say "this is a word" or something like that, the data is most likely sent, and the problem is that it is part of your data. I assume Google does something similar with mail that was sent to spam and then marked as not spam. This is a full email redirect and analysis, not a partial one like in old word processors.

2) I feel AI makes this even harder: now you can't check patterns as simply as before, and you need to check the whole content constantly.

_heimdall 757 days ago

We've had spell/grammar checkers in word processors that worked totally offline for a long time now. They definitely can be improved with a hosted service but that's by no means necessary and comes with tradeoffs like latency and offline support.

ants_everywhere 757 days ago

If an algorithm is looking through private stuff and making a decision based on it or is sending signals where the signal depends on the private stuff, then it's pretty much by definition leaking private information.

An algorithm that leaked no private information would not be useful to a business. It would do a bunch of computation and then throw it away. So realistically anything that looks at private information is privacy-relevant.

That includes even just the email headers. To quote the former head of the NSA "We Kill People Based on Metadata" https://abcnews.go.com/blogs/headlines/2014/05/ex-nsa-chief-...

You can have debates about how much private information should be leaked and for what purposes. But I don't think having a threshold like "it's all private unless another human reads it" is a good way to think about the issue.

kortilla 757 days ago

An algorithm that denies service, changes ad behavior, etc. based on user content is definitely invading privacy compared to your spell checker case.

The spell checker would also be a massive privacy invasion if it flagged users based on the content of what they wrote.

liquidgecka 756 days ago

Pre-AI we had a system that watched user patterns and would identify possibly suspect patterns that were outside of the norm. We also had a system that would content-id the images and attachments to see what was being uploaded in a general way. Given enough suspicion, the account would be opened to look for abusive patterns.

There is absolutely no promise on any cloud-hosted service that a human will never see your data. However, at Google it was made very, very, VERY clear that if we had to scan somebody's personal email for any reason, then discussing the contents outside of legally mandated or work-required contexts would lead to immediate termination and a possible lawsuit for any damages to reputation incurred.

While fixing user accounts or dealing with delivery of content, I saw epic piles of personal email. Besides the ones full of CSAM or other abusive material, I couldn't say that I ever remembered the contents 30 minutes later. It's like a checker at a grocery store. They don't care about whatever embarrassing things you're buying and won't remember you 10 minutes later. =)

_trampeltier 757 days ago

I think there was a case where several people logged into the same Gmail account and shared data not by sending mail, just by writing and reading drafts.

bottom999mottob 757 days ago

You might be thinking of the General David Petraeus case, a national security leak that was slightly worse than Snowden's, but with few repercussions :)

liquidgecka 756 days ago

yep.. And it would split uploads across dozens of accounts with parity so that if any account was disabled it could re-create the data from what was in the other accounts. (think BitTorrent using IMAP-uploaded content in Gmail)
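
To make the parity idea concrete, here is a minimal sketch, assuming plain XOR parity with a single parity shard. The comment does not say which scheme was actually used, and the names `split_with_parity` and `recover` are hypothetical. Each shard stands in for what one account would hold; losing any one of them is survivable.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_with_parity(data: bytes, n_accounts: int) -> List[bytes]:
    """Split data into n_accounts - 1 data shards plus one XOR parity shard.

    Assumption: single parity, so only one lost shard can be rebuilt.
    """
    if n_accounts < 2:
        raise ValueError("need at least one data shard and one parity shard")
    n_data = n_accounts - 1
    shard_len = -(-len(data) // n_data)  # ceiling division
    padded = data.ljust(shard_len * n_data, b"\0")  # pad so shards are equal size
    shards = [padded[i * shard_len:(i + 1) * shard_len] for i in range(n_data)]
    parity = reduce(xor_bytes, shards)
    return shards + [parity]


def recover(shards: List[Optional[bytes]], original_len: int) -> bytes:
    """Rebuild the original data when at most one shard (account) is missing."""
    missing = [i for i, s in enumerate(shards) if s is None]
    if len(missing) > 1:
        raise ValueError("single parity can only rebuild one missing shard")
    restored = list(shards)
    if missing:
        present = [s for s in shards if s is not None]
        # XOR of all surviving shards equals the missing one.
        restored[missing[0]] = reduce(xor_bytes, present)
    return b"".join(restored[:-1])[:original_len]


if __name__ == "__main__":
    payload = b"draft folder payload"
    parts = split_with_parity(payload, 4)
    parts[1] = None  # simulate one disabled account
    print(recover(parts, len(payload)) == payload)
```

Surviving more than one disabled account would need a stronger erasure code than this single-parity sketch.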

carom 757 days ago

Companies are legally obligated to scan for CSAM in the US.

toast0 757 days ago

I don't think that's accurate... Do you have a link?

I do think there is an obligation to report if any is found, but I don't think they need to look.

_trampeltier 757 days ago

https://www.theguardian.com/technology/2022/aug/22/google-cs...

j16sdiz 757 days ago

I don't think there's a hard legal requirement to scan. Just some law around what to do once it is known about, and some executive arrangements.
