by tsimionescu 1473 days ago
You're assuming that FSD is a software problem, and that there is any realistic chance that the feature will be available in the lifetime of any of the cars they sold including it.

Both of these beliefs seem unwarranted: FSD as described in the advertising is most likely much more than 5 years away, and will almost certainly require LIDAR to achieve any kind of safety.

People pay for a pre-order based on the promise that the item will be delivered. If I pre-order a game and it is later cancelled, or even delayed for many years, I will get my money back; I won't just be told "well, you knew it wasn't ready at the time".

2 comments

I don't think I said anything about hardware or whether FSD will ever be delivered in the lifetime of the current fleet.

Myself, I am skeptical of Elon's timelines. But I also understand they are not his promises, they are just his (foolish, imho) expectations.

He admitted he vastly underestimated the problem… but what he might not admit is continuing to do so. I think he continues to underestimate it.

On the other hand, the power of compounded returns of improvements over time is counterintuitive, and he probably understands that better than most of us. Maybe he used that understanding to get overconfident, or maybe it's still beyond reach. We really don't know. It's still possible that at some point, his team might just crack it. Not just with vision, though. They will need a world model for things like predicting the behavior of a group of children occluded by a bus near a crosswalk.

The money back thing is another question. I hope Tesla offers money back to ease the experience of those who are bitter, but I probably won't take it back myself, because I don't mind supporting the effort even though it seems like the results are far away. I don't think money back was an option previously. As a matter of company survival (which in Elon's mind equates to humanity's survival, take it or leave it, but suffice it to say he doesn't treat it as a normal throwaway company) Tesla just didn't have the money. Now, they probably do.

> I don't think I said anything about hardware or whether FSD will ever be delivered in the lifetime of the current fleet.

You said "Yes the actual finished FSD software release does not exist and Tesla does not claim it exists." (emphasis mine).

Even if you didn't say it, the whole false advertising investigation revolves around the difference between FSD being a software or a hardware problem. Tesla marketing and Musk personally have stated clearly (at least in the past) that all cars sold with the FSD option are FSD ready on the hardware side, and that FSD will be delivered as an over-the-air software upgrade to all of them once it's ready.

If they can indeed enable (working) FSD without a hardware upgrade, then they have not lied (even if the timelines they suggested were wildly optimistic). If they in fact need hardware upgrades to support FSD on the cars sold with this option, then they have lied in their advertising, and people who bought this are entitled either to a refund or to a free upgrade when the feature is available.

> and will almost certainly require LIDAR to achieve any kind of safety.

Is there a physical reason for that? We know that humans do just fine with just ~8cm of stereoscopic separation, and cars, for example, have the potential for significantly greater stereoscopic separation.
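To make the baseline point concrete, here is a minimal sketch of the standard pinhole stereo relation Z = f * B / d and its first-order error term. The focal length (1000 px), the 1.2 m windscreen baseline, and the 0.25 px matching error are illustrative assumptions, not figures from this thread or from any real vehicle.

```python
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Rectified pinhole stereo: depth Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def depth_error(focal_px: float, baseline_m: float, depth_m: float,
                disparity_err_px: float = 0.25) -> float:
    """First-order depth uncertainty: dZ = Z^2 / (f * B) * dd."""
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_err_px


FOCAL_PX = 1000.0  # assumed focal length in pixels

# Compare a human-like baseline with a hypothetical wide camera pair.
for label, baseline_m in [("0.08 m (human-like)", 0.08), ("1.2 m (assumed car)", 1.2)]:
    err = depth_error(FOCAL_PX, baseline_m, depth_m=50.0)
    print(f"baseline {label}: about +/- {err:.2f} m at 50 m")
```

Under these assumptions the depth error at a fixed range shrinks in proportion to the baseline, which is the sense in which a wider camera pair has more headroom than human eyes.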

Not a physical reason, no, but an AI one.

Humans and most other animals don't rely solely on stereoscopic vision to navigate the world; we rely on a model of the world where we recognize objects in the image we perceive, know their real size from experience, and use that as well as stereoscopic hints to approximate distances and speeds. We additionally use our understanding of basic physics to assist: we distinguish between an object and its shadow, we can tell the approximate weight of something by the way it moves in the wind (to know if we need to avoid an obstacle on the road), and there are other hints we take into account.

We also take into account our knowledge of the likely behavior of these objects to judge relative speeds (e.g. the car is moving away, it's not the tree coming closer).

Without this crucial aspect of object recognition and experience about the world, our vision is actually very bad at navigation. If you put us in an artificial environment with, say, pure geometric shapes at various distances, no/fake shadows, objects with non-realistic proportions and so on, we will have much more trouble navigating and not bumping into things even at walking speeds. And this is the level the AI is currently operating at, more or less.

And if you don't believe me, note that humans with one eye, while having impaired depth perception, are still perfectly able to drive safely, with ~0 physical mechanisms for measuring distance (I believe the spherical shape of the iris may still give some very subtle hints about distance as you move your eye around, but that is minimal compared to stereoscopic vision). A LOT of our depth perception is just 2D image + object recognition + knowledge about those objects.
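The "2D image + known object size" idea can be sketched with the pinhole relation Z = f * H / h. This is only an illustration of the principle, not a claim about how the human visual system or any driving stack computes it. The focal length and the object heights below are made-up values.

```python
def distance_from_known_size(focal_px: float, real_height_m: float,
                             image_height_px: float) -> float:
    """Monocular range from a size prior: Z = f * H / h."""
    if image_height_px <= 0:
        raise ValueError("image height must be positive")
    return focal_px * real_height_m / image_height_px


FOCAL_PX = 1000.0        # assumed focal length in pixels
IMAGE_HEIGHT_PX = 30.0   # the object spans 30 pixels in the image

# Prior says "this is a car, roughly 1.5 m tall".
as_car_m = distance_from_known_size(FOCAL_PX, 1.5, IMAGE_HEIGHT_PX)

# Same pixels, but the object is really a 0.15 m toy car.
as_toy_m = distance_from_known_size(FOCAL_PX, 0.15, IMAGE_HEIGHT_PX)

print(f"interpreted as a car: {as_car_m:.1f} m away")
print(f"interpreted as a toy: {as_toy_m:.1f} m away")
```

The estimate is only as good as the size prior: the same 30 pixels map to very different distances depending on what the object is believed to be, which is why unrealistic proportions break this cue.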

While all of this may be true, this doesn't explain why stereoscopic vision wouldn't work where a LIDAR would. Both provide identical geometrical information and neither has anything to do with AI. Neither tells you approximate weights of things, or judges based on human experience how things might move in the future depending on their type (tree vs car), or anything like that. And if you swap one system providing geometric information for another one that provides identical information, I don't see how this makes the cognition of any AI later in the pipeline magically any better, no matter how good or bad that AI was previously.

However, one benefit that long-baseline stereoscopic vision (for example with cameras in corners of the front windscreen) would have compared to short-baseline stereoscopic vision (a human) or a point measurement (LIDAR) that could be relevant for safety would be the ability to somewhat peek around the vehicle in front of you from either side. Admittedly, this may overall be a small-ish benefit relative to a LIDAR but it does provide strictly more information (slightly) than a LIDAR would.

Well, LIDAR uses very well understood physics to give you precise measurements of distance from the world around you, without any need for object recognition. It is not enough on its own, but it is an excellent safety technology. It's basically impossible to run into an object that's moving slow enough to avoid based on LIDAR input.
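As a rough sketch of the physics being referred to: a time-of-flight LIDAR return gives range as c * t / 2, and a simple kinematic check can then decide whether the car can stop in time. The reaction latency (0.2 s) and braking deceleration (7 m/s^2) are assumed values for illustration, and the function names are hypothetical, not taken from any real driving stack.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0


def range_from_round_trip(round_trip_s: float) -> float:
    """Time of flight: the pulse travels out and back, so range = c * t / 2."""
    return SPEED_OF_LIGHT_M_S * round_trip_s / 2.0


def stopping_distance(speed_m_s: float, latency_s: float = 0.2,
                      decel_m_s2: float = 7.0) -> float:
    """Distance covered during the reaction latency plus constant braking."""
    return speed_m_s * latency_s + speed_m_s ** 2 / (2.0 * decel_m_s2)


def can_stop_before(obstacle_range_m: float, speed_m_s: float) -> bool:
    """True if the assumed braking model stops short of the obstacle."""
    return stopping_distance(speed_m_s) < obstacle_range_m


# A return arriving about 333.6 ns after the pulse left is roughly 50 m away.
obstacle_m = range_from_round_trip(333.6e-9)
print(f"obstacle at {obstacle_m:.1f} m")
for speed in (20.0, 30.0):
    print(f"at {speed:.0f} m/s, can stop in time: {can_stop_before(obstacle_m, speed)}")
```

Nothing in this check needs to know what the obstacle is, which is the property being claimed for LIDAR here. It also shows the "slow enough to avoid" caveat: with these assumed numbers the car stops in time from 20 m/s but not from 30 m/s.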

Stereoscopic vision first relies on object recognition of the elements of the pictures taken by each camera, then identifying the objects that are the same between the pictures, and only THEN do you get to do the simple physical calculation to compute distance. If your object recognition algorithm fails to recognize an object in one of the images; or if the higher-level AI fails to recognize that something is the same object in the two pictures, then the stereoscopy buys you nothing and you end up running into a bicycle rider crossing the street unsafely.

LIDAR does have limitations of its own (for example, it can't work in snowy conditions, since it will detect the snow flakes; not sure if the same applies to rain), but the regimes under which it is guaranteed to work are well understood, and the safety promises it can make in those regimes don't rely on ML methods.

> Well, LIDAR uses very well understood physics to give you precise measurements of distance from the world around you, without any need for object recognition. It is not enough on its own, but it is an excellent safety technology. It's basically impossible to run into an object that's moving slow enough to avoid based on LIDAR input.

Again, claiming that LIDARs make things magically safer sounds like a lot of snake oil to me. Both LIDARs and stereoscopic systems use well-understood physics. Stereoscopic rangefinders were being used in both World Wars for gun-laying and you wouldn't say that you don't need precise measurements for gun-laying.

> Stereoscopic vision first relies on object recognition of the elements of the pictures taken by each camera, then identifying the objects that are the same between the pictures, and only THEN do you get to do the simple physical calculation to compute distance. If your object recognition algorithm fails to recognize an object in one of the images; or if the higher-level AI fails to recognize that something is the same object in the two pictures, then the stereoscopy buys you nothing

As for whether stereoscopic vision relies on object recognition, that seems like a mild stretch to me. Generally it, like for example SfM (of which it is a special case), seems to rely on local textures and features for individual data points -- and in a simple single-dimensional stereoscopic vision case, your set of possible solutions is extremely limited, so matching features from SIFT or SURF in stereoscopic vision is way simpler than even the general SfM case. Those individual data points do not require in any way for individual objects to be recognized and separated. I have NOT seen in my life an SfM solution that would not give you a point cloud if it failed to separate objects -- in fact, SfM software doesn't even try to identify objects when generating a point cloud because it doesn't even operate at such a high level. Note that this actually provides the exact same information as a LIDAR would, namely a point cloud with no insight into how the points are related to each other.

Pretty much the only situation where stereoscopic vision or SfM fails to provide depth information is with a surface of highly uniform color completely devoid of textures. Whether this could or couldn't be solved with structured light is an interesting problem.
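A toy version of that argument, as a sketch only: the code below does one-dimensional block matching along a single rectified scanline using a sum of absolute differences (SAD) cost. SAD block matching is swapped in here as a simpler stand-in for the SIFT/SURF feature matching mentioned above, and the scanlines are synthetic. It recovers disparity from local texture with no notion of objects, and it becomes ambiguous on a uniform surface.

```python
def matching_costs(left_row, right_row, x, window=5, max_disp=16):
    """SAD cost of matching the patch around left_row[x] against the patch
    around right_row[x - d], for every candidate disparity d in 0..max_disp."""
    half = window // 2
    if x - max_disp - half < 0 or x + half >= len(left_row):
        raise ValueError("patch would fall outside the scanline")
    costs = []
    for d in range(max_disp + 1):
        xr = x - d
        cost = sum(abs(left_row[x + k] - right_row[xr + k])
                   for k in range(-half, half + 1))
        costs.append(cost)
    return costs


TRUE_DISPARITY = 5

# Synthetic textured scanline: 64 distinct intensity values, no repeats.
right_row = [(37 * i) % 101 for i in range(64)]
# The left camera sees the same texture shifted right by the true disparity.
left_row = [0] * TRUE_DISPARITY + right_row[:-TRUE_DISPARITY]

textured = matching_costs(left_row, right_row, x=30)
best_disparity = textured.index(min(textured))
print("textured surface, best disparity:", best_disparity)

# Uniform surface: every candidate disparity matches equally well.
flat_row = [128] * 64
flat = matching_costs(flat_row, flat_row, x=30)
ties = flat.count(min(flat))
print("uniform surface, equally good candidates:", ties)
```

On the textured rows the cost has a single minimum at the true disparity. On the flat rows all 17 candidates tie, so no depth can be recovered without extra cues such as projected structured light.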

Human stereoscopic vision could also be fooled by specifically designed optical illusions in science museums. We just avoid them when designing roads.