We have a setup that combines ToF, structured light and multiple colour cameras to reconstruct hands from the elbow down. Short version: it's a massive pain in the ass. In fact the setup really only works because we have a preconceived motion model (particular hand gestures) and have carefully arranged the scene to avoid interference. I'm unaware of a general solution where you can just throw more cameras in and get better scene data.

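One neat thing you might want to look at, though: if all you have is structured light (i.e. Kinect v1), you can simply attach a vibrating motor to each emitter/receiver unit to avoid a lot of the interference, per [0]. Each unit's own pattern moves with its camera and stays sharp, while the patterns from the other units blur out.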

[0] https://wwwx.cs.unc.edu/~maimone/media/kinect_VR_2012.pdf