| HN Mirror


	by anamexis 1135 days ago
	Why would random 1 second snippets of audio from when photos were being taken be useful to advertisers?

3 comments

alephxyz 1135 days ago

Ultrasonic beacon tracking + being able to tell if someone is indoors, in a vehicle, at a concert, sporting event, watching something on tv, etc

SketchySeaBeast 1135 days ago

I would expect that you could tell if someone was indoors, in a vehicle, at a concert, or sporting event based upon the content of the photo.

bastawhiz 1135 days ago

Most photos have a timestamp and geotag. Whether you're in a vehicle, at a concert or sporting event, or really doing just about anything can be gathered from that information, as well as from whatever the photo is of. One second of audio isn't giving much (additional) useful data.

NotYourLawyer 1135 days ago

Because there are billions of them.

bastawhiz 1135 days ago

All of those individual seconds don't add up to a sum greater than their parts. There are trillions (quadrillions?) of seconds of reality that those same cameras/microphones didn't capture. Capturing a single second of each of a billion people's lives isn't really all that useful, especially for advertisers.

SketchySeaBeast 1134 days ago

"Based upon our spying people spend a large amount of time saying '...eese'. Can someone please find out what 'eese' is?"

TeMPOraL 1134 days ago

Do people really even say that?

SketchySeaBeast 1134 days ago

Once upon a time people did in fact say "Cheese" when taking pictures.

TeMPOraL 1134 days ago

Yes, but I thought that this was long ago, and since then, saying this got... cheesy.

NotYourLawyer 1134 days ago

I’m willing to bet that an AI could learn a lot about a person by listening to a large number of short audio clips, together with the photos themselves.

bastawhiz 1134 days ago

One second isn't even long enough to hear the full pronunciation of most English words. Let's say people take ten photos per day. Let's generously say that captures ten random spoken words. Ten random words per day is hardly enough for a _human_ to learn anything about a person, let alone an AI. AI cannot magically conjure data from noise.

And when you think about what people take pictures of (their parking spot, selfies, nudes, landmarks, birthday cakes, sunsets, cats), what's heard is likely not even relevant to the picture taker's life or interests. If I look at all of the photos I've taken in the last two weeks, I've got:

- Cat (2)
- Building (1)
- Stuff in my home (6)
- Selfie (4)

All thirteen photos were taken in ~silence.

tough 1134 days ago

It takes 3 seconds of you speaking to clone your voice with AI

renewiltord 1134 days ago

You can get that from me by calling and asking if Henry is there. I will answer "No, I'm sorry, but you must have the wrong number". Cheap with Twilio.

bastawhiz 1133 days ago

If they have access to your pictures, they have access to your videos. This matters because people don't think audio is being recorded when they take photos. As far as threat modeling goes, creating a cloned voice is something these apps could have already done.

fauxpause_ 1134 days ago

If you accept that as true then you also have to accept that your voice is hopelessly copyable and defense against that is futile. So it’s not really important.

fauxpause_ 1135 days ago

There are billions of ants

godelski 1134 days ago

Training ML models

hackernewds 1134 days ago

??? is this a meme

godelski 1134 days ago

Why would this be a meme?