by sherbondy 3478 days ago
Hey, 3D vision system amateur here, but very interested to learn more!

Can anybody point me to some literature or reference materials about attempts to combine the inputs from multiple techniques simultaneously?

E.g. a device with conventional stereo cameras plus infrared cameras & emitters, which compares the resulting model from each input source/technique and actively re-adjusts the final depth estimate?

Is "sensor fusion" the right jargon to use in this context?
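
To make the question concrete, here is a minimal sketch of what "re-adjusting the final depth estimate" could look like for a single pixel, using textbook inverse-variance weighting. It assumes each source reports a depth plus a variance for that pixel and that the two errors are independent; the function name `fuse_depth` and the use of `None` for a missing reading are made up for illustration, and a real rig would also need calibration and registration between the cameras.

```python
def fuse_depth(d_stereo, var_stereo, d_ir, var_ir):
    """Fuse two depth readings (metres) for one pixel.

    Hypothetical helper: weights each reading by the inverse of its
    variance, assuming independent errors. A variance of None marks a
    missing reading from that source.
    Returns (fused_depth, fused_variance).
    """
    if var_stereo is None and var_ir is None:
        return None, None          # neither sensor saw this pixel
    if var_stereo is None:
        return d_ir, var_ir        # only the infrared reading exists
    if var_ir is None:
        return d_stereo, var_stereo  # only the stereo reading exists

    w_stereo = 1.0 / var_stereo    # more certain reading gets more weight
    w_ir = 1.0 / var_ir
    fused = (w_stereo * d_stereo + w_ir * d_ir) / (w_stereo + w_ir)
    fused_var = 1.0 / (w_stereo + w_ir)  # never larger than either input
    return fused, fused_var


if __name__ == "__main__":
    # Stereo says 2.0 m (variance 0.04), infrared says 2.2 m (variance 0.01).
    print(fuse_depth(2.0, 0.04, 2.2, 0.01))
```

With those example numbers the fused depth comes out at about 2.16 m, pulled towards the lower-variance infrared reading. The fused variance could also serve as the per-pixel confidence score mentioned below.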

Or, even crazier, a control system which actively jitters the camera's pose to gain more information for points in the depth map with lower confidence scores / conflicting estimates?

But maybe such a setup is overly complex and yields minimal gains in mixed indoor & outdoor scenarios?

1 comment

We have a setup that combines ToF, structured light and multiple colour cameras to reconstruct hands from the elbow down. Short version: it's a massive pain in the ass. In fact the setup really only works because we have a preconceived motion model (particular hand gestures) and have carefully arranged the scene to avoid interference. I'm unaware of a general solution where you can just throw more cameras in and get better scene data.

One neat thing though you might want to look at: if all you have is structured light (i.e. Kinect v1), you can simply attach a vibrating motor to each emitter/receiver to avoid a lot of interference, per [0].

[0] https://wwwx.cs.unc.edu/~maimone/media/kinect_VR_2012.pdf