|
Self-driving isn't a sensor problem, it's a software problem. From how humans drive, it's pretty clear that our brains hold some latent-space representation of our immediate surroundings that doesn't require a lot of data. If you had a sim driving wheel and 4 monitors, one for each direction, plus 3 smaller ones for the rear-view mirrors, all connected to a real-world car with sufficiently high-definition cameras, you could probably drive the car remotely as well as you could in real life, because the images would map to the same latent space. The advantage humans have is an innate understanding of basic physics, built from experience interacting with the world, which we can apply to something as simple as a 2D representation, and that understanding is a big part of that latent space. You wouldn't be able to drive a car if you didn't have some "understanding" of things like velocity, acceleration, object collision, etc. So my bet is that, just like with LLMs, research will be published at some point on a model that, given certain frames in a video, can extrapolate the physical interactions that will occur, including things like collisions, relative distances, and so on. Once that is in place, self-driving systems will get MASSIVELY better. |
Self-driving is still a robotics problem, and robots are probabilistic operators with many component dependencies. If you have three 99%-reliable systems strung together running 24 hours a day, the chain as a whole will be unreliable for about 43 minutes a day ((1 - .99^3)*1440). Multi-modality lets your systems provide redundancy for one another and reduces the accumulation of correlated errors.
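
To make that arithmetic concrete, here's a rough Python sketch. It assumes the stages fail independently of one another, which is the same simplification behind the 43-minute figure; real sensor failures are often correlated, so treat the redundancy number as optimistic. The function names are made up for illustration.

```python
MINUTES_PER_DAY = 24 * 60  # 1440


def series_availability(reliabilities):
    """Chance that every stage in a chain works at once.

    Assumes the stages fail independently (a simplification).
    """
    result = 1.0
    for r in reliabilities:
        result *= r
    return result


def redundant_availability(reliabilities):
    """Chance that at least one of several parallel components works.

    Also assumes independent failures, which correlated errors would violate.
    """
    all_fail = 1.0
    for r in reliabilities:
        all_fail *= 1.0 - r
    return 1.0 - all_fail


def downtime_minutes(availability, minutes=MINUTES_PER_DAY):
    """Expected unreliable minutes per day for a given availability."""
    return (1.0 - availability) * minutes


# Three 99%-reliable systems in series, as in the comment above.
chain = series_availability([0.99, 0.99, 0.99])
print(round(downtime_minutes(chain), 1))  # 42.8

# Hypothetical variant: each stage backed by a second, independent 99% component.
stage = redundant_availability([0.99, 0.99])
redundant_chain = series_availability([stage, stage, stage])
print(round(downtime_minutes(redundant_chain), 2))  # 0.43
```

Under the independence assumption, doubling up each stage takes the chain from roughly 43 unreliable minutes a day to under half a minute. That gap shrinks quickly if the redundant components tend to fail under the same conditions.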