| HN Mirror

	by stephentmcm 3561 days ago
	Source? I'd hazard a guess that it's likely only marginally more accurate than lip reading?

1 comments

bradbeattie 3561 days ago

http://people.csail.mit.edu/mrub/VisualMic/

stephentmcm 3561 days ago

Pretty cool. But I'll eat my hat if a regular webcam can pick up enough detail in regular lighting to do this. Plus, 90% of the time the webcam will capture a user's chest and the wall behind them, which is hardly useful for a visual microphone.

sillysaurus3 3561 days ago

Interesting technique. This requires 60 fps, 1280x720 video. People would surely notice the slowdown. I wonder if it's possible to improve on this?

willvarfar 3561 days ago

The saying is "attacks only get better". It's likely that more can be done with fewer pixels but more software.

But hardware can also get better. Surely the next generation of laptops will have depth-sensing cameras too? It's becoming an integral part of game console motion detection, and normal smartphones will have them too, e.g. hype I found by googling: https://3dprint.com/117809/depth-sensing-phone-cameras/

sllabres 3561 days ago

This was an interesting thing to see when it came out, and it keeps people aware of what is possible. Maybe even more is possible using this technique.

Nevertheless, someone activating one of the many microphones around (mobile phones, phones, laptops, "Echo"-like devices, speech-controlled televisions) would concern me much more.

Shorel 3561 days ago

The fundamental frequency of the human voice ranges from about 85 Hz to 255 Hz. Since a signal has to be sampled at twice its highest frequency to be reconstructed, a webcam would need to record at about 500 Hz to capture enough information to reconstruct voice with good quality.

Because webcams record at 60 Hz (max), they can only capture enough data to reconstruct sound up to 30 Hz, way below the human voice range.
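
To make the arithmetic above concrete, here is a minimal sketch of the sampling-rate argument. It assumes one sample per video frame and uses the 255 Hz and 60 Hz figures from the comment; the function and constant names are made up for illustration.

```python
# Sketch of the sampling-rate (Nyquist) argument, assuming one sample per frame.
# All names here are illustrative, not taken from any library.

def required_sample_rate(max_signal_hz: float) -> float:
    """Minimum sampling rate needed to represent a signal up to max_signal_hz."""
    return 2.0 * max_signal_hz


def max_recoverable_hz(sample_rate_hz: float) -> float:
    """Highest frequency that can be represented at the given sampling rate."""
    return sample_rate_hz / 2.0


VOICE_F0_MAX_HZ = 255.0  # upper end of the voice range quoted in the comment
WEBCAM_FPS = 60.0        # frame rate quoted in the comment

needed_rate = required_sample_rate(VOICE_F0_MAX_HZ)  # 510.0, i.e. "about 500 Hz"
frame_limit = max_recoverable_hz(WEBCAM_FPS)         # 30.0

print(f"Needed sampling rate: {needed_rate} Hz")
print(f"Recoverable at {WEBCAM_FPS} fps: up to {frame_limit} Hz")
```

Under the one-sample-per-frame assumption the 30 Hz ceiling follows directly, which is the premise the reply below disputes.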

T-hawk 3560 days ago

Webcams record faster than 60 Hz. Sound perturbs the recorded image every scanline, not just every frame. The techniques that reconstruct audio from video do it by looking not frame by frame, but line by line. 60 Hz times 720 vertical resolution is 43200 Hz and way more than enough data.
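
The line-by-line figure can be sketched the same way. This assumes an idealized sensor that reads its rows one after another, evenly spaced across the whole frame period with no idle gap between frames, which real cameras only approximate. The names are illustrative.

```python
# Sketch of the per-scanline sampling rate, assuming rows are read out
# sequentially and evenly across the full frame period (an idealization).

def line_sample_rate(fps: float, rows: int) -> float:
    """Samples per second if every scanline counts as one sample."""
    return fps * rows


def max_recoverable_hz(sample_rate_hz: float) -> float:
    """Highest frequency that can be represented at the given sampling rate."""
    return sample_rate_hz / 2.0


FPS = 60.0   # frame rate from the comment
ROWS = 720   # vertical resolution from the comment

line_rate = line_sample_rate(FPS, ROWS)     # 43200.0
line_limit = max_recoverable_hz(line_rate)  # 21600.0

print(f"Line rate: {line_rate} Hz, recoverable up to {line_limit} Hz")
```

With those assumptions the ceiling is 21,600 Hz rather than 30 Hz, comfortably above the 85 Hz to 255 Hz range discussed earlier in the thread.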
