Hacker News
by jhgb 1469 days ago
While all of this may be true, it doesn't explain why stereoscopic vision wouldn't work where a LIDAR would. Both provide identical geometrical information and neither has anything to do with AI. Neither tells you the approximate weights of things, or judges from human experience how things might move in the future depending on their type (tree vs car), or anything like that. And if you swap one system providing geometric information for another that provides identical information, I don't see how this makes the cognition of any AI later in the pipeline magically any better, no matter how good or bad that AI was previously.

However, one benefit that long-baseline stereoscopic vision (for example, with cameras in the corners of the front windscreen) would have over short-baseline stereoscopic vision (a human) or a point measurement (LIDAR), and that could be relevant for safety, is the ability to peek somewhat around the vehicle in front of you from either side. Admittedly, this may be a small-ish benefit overall relative to a LIDAR, but it does provide strictly, if only slightly, more information than a LIDAR would.

Well, LIDAR uses very well understood physics to give you precise measurements of distance from the world around you, without any need for object recognition. It is not enough on its own, but it is an excellent safety technology. It's basically impossible to run into an object that's moving slow enough to avoid based on LIDAR input.

Stereoscopic vision first relies on object recognition of the elements of the pictures taken by each camera, then identifying the objects that are the same between the pictures, and only THEN do you get to do the simple physical calculation to compute distance. If your object recognition algorithm fails to recognize an object in one of the images; or if the higher-level AI fails to recognize that something is the same object in the two pictures, then the stereoscopy buys you nothing and you end up running into a bicycle rider crossing the street unsafely.

LIDAR does have limitations of its own (for example, it can't work in snowy conditions, since it will detect the snowflakes; not sure if the same applies to rain), but the regimes under which it is guaranteed to work are well understood, and the safety promises it can make in those regimes don't rely on ML methods.

> Well, LIDAR uses very well understood physics to give you precise measurements of distance from the world around you, without any need for object recognition. It is not enough on its own, but it is an excellent safety technology. It's basically impossible to run into an object that's moving slow enough to avoid based on LIDAR input.

Again, claiming that LIDARs make things magically safer sounds like a lot of snake oil to me. Both LIDARs and stereoscopic systems use well-understood physics. Stereoscopic rangefinders were used in both World Wars for gun-laying, and you wouldn't say that you don't need precise measurements for gun-laying.
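
To make the "well-understood physics" part concrete, here is a rough sketch of the triangulation that a rectified stereo pair performs. It assumes an idealized pinhole model; the function names and every number in the usage note are made up for illustration, not taken from any real system.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Distance to a point seen by both cameras of a rectified stereo pair.

    focal_px:     focal length expressed in pixels (assumed known from calibration)
    baseline_m:   distance between the two camera centres, in metres
    disparity_px: horizontal shift of the point between the two images, in pixels
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the rig")
    return focal_px * baseline_m / disparity_px


def depth_error(focal_px, baseline_m, depth_m, disparity_err_px=0.25):
    """First-order depth uncertainty caused by a small disparity error.

    Follows from differentiating Z = f * B / d with respect to d.
    The 0.25 px default is an assumed matching accuracy, not a measured one.
    """
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_err_px
```

With an assumed 1000 px focal length, a 1.2 m baseline and a 24 px disparity this gives 50 m. At that range the first-order error for a quarter-pixel mismatch is roughly half a metre, versus nearly ten metres for a 0.065 m (roughly human) baseline, which is why a long baseline matters for precision.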

> Stereoscopic vision first relies on object recognition of the elements of the pictures taken by each camera, then identifying the objects that are the same between the pictures, and only THEN do you get to do the simple physical calculation to compute distance. If your object recognition algorithm fails to recognize an object in one of the images; or if the higher-level AI fails to recognize that something is the same object in the two pictures, then the stereoscopy buys you nothing

As for whether stereoscopic vision relies on object recognition, that seems like a mild stretch to me. Generally it, like SfM (of which it is a special case), seems to rely on local textures and features for individual data points. And in the simple stereoscopic case, where the search for a match is one-dimensional, your set of possible solutions is extremely limited, so matching features from SIFT or SURF is way simpler than even in the general SfM case. Those individual data points do not in any way require individual objects to be recognized and separated. I have NOT seen in my life an SfM solution that would fail to give you a point cloud because it failed to separate objects; in fact, SfM software doesn't even try to identify objects when generating a point cloud, because it doesn't operate at such a high level. Note that this provides exactly the same information as a LIDAR would, namely a point cloud with no insight into how the points are related to each other.
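
A toy sketch of what I mean by the one-dimensional case: match small intensity windows along a single scanline and pick the shift with the lowest sum of absolute differences. This is plain block matching rather than SIFT or SURF, the names are mine, and it assumes the images are already rectified so that matches lie on the same row. Nothing in it knows what an object is.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized windows."""
    return sum(abs(x - y) for x, y in zip(a, b))


def match_scanline(left, right, window=3, max_disp=8):
    """Per-pixel disparity along one rectified scanline, via block matching.

    left, right: lists of pixel intensities for the same image row.
    Returns a list where entry x is the best disparity for left[x],
    or None near the borders where no full window fits.
    """
    half = window // 2
    disparities = [None] * len(left)
    for x in range(half, len(left) - half):
        patch = left[x - half : x + half + 1]
        best_d, best_cost = None, None
        # The search is 1-D: a point at column x in the left image
        # can only appear at column x - d in the right image.
        for d in range(max_disp + 1):
            xr = x - d
            if xr - half < 0:
                break
            cost = sad(patch, right[xr - half : xr + half + 1])
            if best_cost is None or cost < best_cost:
                best_d, best_cost = d, cost
        disparities[x] = best_d
    return disparities
```

Feeding it a textured row and the same row shifted by three pixels recovers a disparity of 3 wherever a full window of real texture is available, purely from local intensity patterns.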

Pretty much the only situation where stereoscopic vision or SfM fails to provide depth information is with a surface of highly uniform color completely devoid of textures. Whether this could or couldn't be solved with structured light is an interesting problem.
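
The failure mode is easy to show with a toy example. The sketch below (hypothetical helper, same rectified-scanline assumption, sum of absolute differences as the cost) lists every disparity that ties for the best match at one pixel. On a textured row there is a single winner; on a uniform row every candidate ties, so there is no depth information to extract.

```python
def tied_best_disparities(left, right, x, window=3, max_disp=6):
    """Return all disparities that share the minimum matching cost at column x.

    left, right: lists of pixel intensities for one rectified image row.
    More than one entry in the result means the match is ambiguous.
    """
    half = window // 2
    patch = left[x - half : x + half + 1]
    costs = {}
    for d in range(max_disp + 1):
        xr = x - d
        if xr - half < 0:
            break
        candidate = right[xr - half : xr + half + 1]
        costs[d] = sum(abs(p - q) for p, q in zip(patch, candidate))
    best = min(costs.values())
    return [d for d, c in costs.items() if c == best]
```

Projecting a known pattern onto the surface (structured light) would in effect add the missing texture, which is why it looks like a plausible fix.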