by kumarvvr 2846 days ago
A semi-off-topic question: is it not possible to get an accurate depth map from a two-camera stereoscopic setup, like human eyes? Perhaps combine it with video processing to isolate objects at different depths.
3 comments

I was watching a talk from Cruise that mentions this. The main problem with cameras is dynamic range. Dealing with different lighting conditions that can change quickly is hard (the sun is really good at washing out colors). Lidar doesn't care about the current lighting conditions.

https://youtu.be/s-8cYj_eh8E?t=22m39s

Also, heavy rain would be a problem for regular cameras. It's not just seeing through the airborne droplets, but also (at a guess, far more significantly) the water directly in contact with the windscreen causing severe random distortions.
I built a hacky prototype, combining:

- FLIR thermal camera

- 3 different small cameras, each manually set to different settings, with models chosen for how well they handle particular light levels.

Those 4 live feeds went into a small Blackmagic Design quad-layout device, which turned them into a single HD feed in hardware, in real time. That went into a hardware capture device, which stacked the quad arrangement, applied some other filters and did hardware compression. Up to that point almost no latency had been introduced, and I had a nice working base video feed. That was fed into the Linux box for processing.

The quad device created a sort of super-HDR video, and the thermal layer took it to the next level. Each camera had drawbacks, but combined, those drawbacks were minimized.
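
For anyone curious what "stacking" differently exposed feeds can look like in software, here is a minimal sketch. It is not the hardware pipeline described above. It swaps in a simple well-exposedness weighting (in the spirit of exposure fusion), and the function names, the `sigma` value and the toy pixel values are all made up for illustration.

```python
import math


def well_exposedness(value, mid=0.5, sigma=0.2):
    """Gaussian weight that peaks for mid-gray pixels and falls off near 0 or 1."""
    return math.exp(-((value - mid) ** 2) / (2 * sigma ** 2))


def fuse_pixel(samples):
    """Weighted average of the same pixel seen by several cameras (values in 0..1)."""
    weights = [well_exposedness(v) for v in samples]
    total = sum(weights)  # always > 0 because each Gaussian weight is positive
    return sum(w * v for w, v in zip(weights, samples)) / total


def fuse_frames(frames):
    """Fuse equally sized grayscale frames (lists of rows) pixel by pixel."""
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [fuse_pixel([f[y][x] for f in frames]) for x in range(width)]
        for y in range(height)
    ]


# Hypothetical 1x2 frames: under-, mid- and over-exposed views of the same scene.
dark = [[0.02, 0.45]]
mid = [[0.30, 0.98]]
bright = [[0.95, 1.00]]

fused = fuse_frames([dark, mid, bright])
```

Each output pixel leans toward whichever camera saw that spot closest to mid-gray, so blown-out or crushed readings contribute very little. A real setup would also need the feeds aligned and colour-matched first, which this sketch ignores.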

But heavy rain and snow are also problems for lidar.
The images need texture and edge detail for the matching to pick up on depth. Any portion of an image that is a solid colour would need to be interpolated in some way, possibly inaccurately. Occlusion is also a bit of a problem, resulting in gaps in the depth map that need to be filled somehow.
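
To make the texture point concrete, here is a toy sketch of window matching along one scanline using sum of absolute differences. It is only an illustration: the pixel values, the 3-pixel shift, and the focal length and baseline in the usage note are invented, and real stereo matchers are far more elaborate.

```python
def sad(left, right, x, d, half):
    """Sum of absolute differences between windows around left[x] and right[x - d]."""
    return sum(abs(left[x + k] - right[x - d + k]) for k in range(-half, half + 1))


def match_costs(left, right, x, max_disp, half=2):
    """SAD cost for every candidate disparity 0..max_disp at pixel x."""
    return [sad(left, right, x, d, half) for d in range(max_disp + 1)]


def best_disparities(costs):
    """All disparities tying for the lowest cost; more than one means ambiguity."""
    lowest = min(costs)
    return [d for d, c in enumerate(costs) if c == lowest]


def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Pinhole stereo relation Z = f * B / d, valid only for d > 0."""
    return focal_px * baseline_m / disparity_px


# Toy scanlines: the right view is the left view shifted by TRUE_SHIFT pixels.
TRUE_SHIFT = 3
textured_left = [10, 80, 30, 200, 60, 150, 20, 90, 240, 50, 110, 70, 180, 40, 130, 25]
textured_right = textured_left[TRUE_SHIFT:] + [0] * TRUE_SHIFT

# A solid-colour region looks identical at every shift.
flat_left = [128] * 16
flat_right = [128] * 16

textured_best = best_disparities(match_costs(textured_left, textured_right, 8, 5))
flat_best = best_disparities(match_costs(flat_left, flat_right, 8, 5))
```

On the textured scanline only the true shift gives a zero cost, so `textured_best` holds a single disparity. On the flat one every candidate ties, so `flat_best` contains all of them and the depth has to be guessed from neighbouring pixels. With an assumed 700 px focal length and 0.12 m baseline, `depth_from_disparity(700, 0.12, 3)` comes out at roughly 28 m.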

Here's an old comparison of algorithms. I imagine the state of the art has improved with deep neural nets recently.

http://vision.middlebury.edu/stereo/eval3/

Edit: surprise! The page appears to be kept up to date with new algorithms and recent techniques, and indeed the top performer is from 2018.

It is, and it works great; see Subaru.