by testHNac 1134 days ago
If they didn't till today, they will start now.

Why would random 1 second snippets of audio from when photos were being taken be useful to advertisers?
Ultrasonic beacon tracking, plus being able to tell if someone is indoors, in a vehicle, at a concert, at a sporting event, watching something on TV, etc.
I would expect that you could tell if someone was indoors, in a vehicle, at a concert, or sporting event based upon the content of the photo.
Most photos have a timestamp and geotag. Whether you're in a vehicle, at a concert or sporting event, or really doing just about anything can be gathered from that information, as well as from whatever the photo is of. One second of audio isn't giving much (additional) useful data.
Because there are billions of them.
All of those individual seconds don't add up to a sum greater than their parts. There are trillions (quadrillions?) of seconds of reality that those same cameras/microphones didn't capture. Capturing a single second of each of a billion people's lives isn't really all that useful, especially for advertisers.
"Based upon our spying people spend a large amount of time saying '...eese'. Can someone please find out what 'eese' is?"
Do people really even say that?
Once upon a time people did in fact say "Cheese" when taking pictures.
I’m willing to bet that an AI could learn a lot about a person by listening to a large number of short audio clips, together with the photos themselves.
One second isn't even long enough to hear the full pronunciation of most English words. Let's say people take ten photos per day. Let's generously say that captures ten random spoken words. Ten random words per day is hardly enough for a _human_ to learn anything about a person, let alone an AI. AI cannot magically conjure data from noise.

And when you think about what people take pictures of (their parking spot, selfies, nudes, landmarks, birthday cakes, sunsets, cats), what's heard is likely not even relevant to the picture taker's life or interests. If I look at all of the photos I've taken in the last two weeks, I've got:

- Cat (2)
- Building (1)
- Stuff in my home (6)
- Selfie (4)

All thirteen photos were taken in ~silence.
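
For a rough sense of scale, here is a back-of-the-envelope sketch of the numbers above. The ten photos per day and the one-second clip length come from this comment, and the billion users from the "billion people" figure upthread; the function names are made up purely for illustration, and none of this reflects how any real app behaves.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in a day


def captured_fraction(photos_per_day: int, clip_seconds: float = 1.0) -> float:
    """Fraction of one person's day covered by per-photo audio clips."""
    return photos_per_day * clip_seconds / SECONDS_PER_DAY


def total_captured_seconds(users: int, photos_per_day: int,
                           clip_seconds: float = 1.0) -> float:
    """Total seconds of audio captured per day across all users."""
    return users * photos_per_day * clip_seconds


if __name__ == "__main__":
    # Assumed inputs: 10 photos/day, 1-second clips, one billion users.
    frac = captured_fraction(10)
    print(f"{frac:.4%} of each person's day is captured")  # 0.0116%

    captured = total_captured_seconds(10**9, 10)
    lived = 10**9 * SECONDS_PER_DAY
    print(f"{captured:.2e} seconds captured vs {lived:.2e} seconds lived per day")
```

Under those assumptions the clips cover roughly one ten-thousandth of each person's day, even though the total across a billion users is large in absolute terms.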

It takes 3 seconds of you speaking to clone your voice with AI
There are billions of ants
Training ML models
??? is this a meme
Why would this be a meme?
Because it's what people who have no clue say when they have run out of reasons for their conspiracy theory.

Q: Why would anyone do that?
A: It's to train an ML model, maan.. to train an AI, maaan... to target ads, maaaan..

Short voice clips are used to train ML models. But you might also replace ML with ads in your comment and we could make a similar argument.