HN Mirror



	by bork1 2539 days ago
	I'm struggling to figure out how they have a Security and Response team to deal with the fallout of these issues without having enough privacy/security/customer-focused developers/product folks to proactively bring up these concerns. Google _seems_ like the type of company to do at least a little bit of risk modeling before the release of software. If they knew they were going to listen to recordings, how did this concern not get brought up? If it was brought up, did folks just decide it wasn't important enough to protect against?

2 comments

jhayward 2539 days ago

They have the security and response team activated because someone disclosed that they do this, not to investigate the fact that they do it. They're there to plug the leak.


azinman2 2539 days ago

Except the privacy policy always warned about this. Everyone doing speech recognition is doing this — you have to in order to get any kind of QA.

The caveat is that it should be both anonymized and only in response to the wake-up command. It seems to be both, so I don’t see the problem.

Actually I do — the editorializing of these headlines makes it seem nefarious when it’s not.


vokep 2539 days ago

If the privacy policy was written in a language I could read (just because it's English doesn't mean it's readable English) then maybe I would have known that.


gibba999 2539 days ago

It is pretty nefarious. In traditional research and product development protocols, you would have people opt into something like this, and optionally pay them for it.

If Google gave out a hundred thousand Google Home units for free to test subjects, with informed consent, there would be no big deal. It would cost Google $2.5 million, and it'd probably be enough data.

If my web site policy discloses "I may randomly send a thug to your house to shoot your children," and you come, visit, click through the license which warned you, and then I shoot your family, that doesn't mean I'm not doing something super-evil.

Google seems to be doing something super-evil here. Their response -- plugging the leak -- seems equally evil. People have a right to know what's being done with their data, and at least under European law, Google has a legal and ethical obligation to disclose things like this in language people can understand.

GDPR is rather well-written here. It looks like Google is breaking it, and currently trying to shoot the whistle-blower.

Thank you whistle-blower!


mehrdadn 2539 days ago

> If my web site policy discloses "I may randomly send a thug to your house to shoot your children," and you come, visit, click through the license which warned you, and then I shoot your family, that doesn't mean I'm not doing something super-evil.

You kinda had me until you lost me here. Analogies need to make sense. If you have to go this far with your analogy then that says more about your own argument than the other side's.


vharuck 2539 days ago

>If you have to go this far with your analogy then that says more about your own argument than the other side's.

I never got this argument. In mathematical proofs, reductio ad absurdum is an acceptable method of showing an assumption false. It shows that a statement ("Users agreed to TOS, so it's not malign") has an exception. The example is extreme to make sure nobody can argue the statement's still valid.

He's not saying the punishment should be on par with murder. He's just saying there is a line of moral acceptability, but where it lies is up for debate.


saagarjha 2539 days ago

You're missing the point, which is that you can slip anything into a privacy policy or other long agreement, no matter how outrageous it may be, and nobody will read it. Putting anything there does not make it ethical or legally binding.


UncleMeat 2539 days ago

It also doesn't make it unethical. Putting privacy related issues in a privacy policy makes sense to me.


gibba999 2539 days ago

You're confusing an analogy with a counterexample.

Analogies need to be analogous. Counterexamples can be extreme (and it is often helpful if they are; then they're obvious counterexamples).

Please take a minute to reread the discussion.

Coincidentally, I've noticed a pretty consistent pattern of downvotes on anything criticizing Google on Hacker News. Either a lot of readers from Google who drank the Kool-Aid, or astroturf -- I'm not quite sure which.


azinman2 2539 days ago

What’s specifically super evil about humans transcribing random and anonymous commands to the Google Assistant? They’re hired and expected to be professional with their own contractual agreements around their own behavior and ethical standing.

Literally all the major companies in speech rec (aka assistants) do exactly this. The accuracy of the speech models would be extremely poor otherwise.


maccam94 2539 days ago

Come on, I'm sure Google's privacy policy allows them to listen to audio with no metadata in order to improve their service. The team is responding to the public leak of the audio, which is a violation of Google's privacy policy.


jhayward 2539 days ago

How does that contradict what I wrote above?


maccam94 2536 days ago

> the security and response team activated because someone disclosed that they do this

They're not chasing down a whistle blower for notifying the public that human transcription takes place. That information was already in the public domain in Google's privacy policy. The team is investigating the source of the leaked audio files, which was a violation of user privacy.


mda 2539 days ago

What do you mean? All tech companies train their algorithms by annotating real world samples. It was not a secret.


mcbuilder 2539 days ago

I think that the outrage doesn't come from supervised learning, rather that it's contracted out to a third party and seems to be done in an irresponsible way. I think that the fact that most of the public that uses these devices would be surprised by the fact that their voice is being recorded and transcribed is the irresponsible part. Of course an ML engineer is going to want to go the route of human labeling the audio data, and these folks seem to have won. You can blame the public for being uninformed, but it's new technology, and most aren't going to have read up on the methods used, even if it's no secret. Many have suggested other opt-in methods, which would also provide (arguably less real world) data. I think that many would prefer to trade a less accurate service for not having strangers listen to their conversations.


Mystrl 2539 days ago

Honestly what risk is there? Outside of some internet echo chambers no one really cares and all this will be forgotten by tomorrow if it even takes that long.
