by Closi 1753 days ago
> Nearly every team we heard technical details from has a “tracking subsystem” which integrates observations across time and sensor modalities.

That’s not object permanence; that’s associating detections of the same object across multiple frames so that it is treated as a single object. Below is part of the abstract of a paper released this year, “Learning to Track with Object Permanence”, which describes the difference between current tracking and the concept of object permanence:

> Tracking by detection, the dominant approach for online multi-object tracking, alternates between localization and re-identification steps. As a result, it strongly depends on the quality of instantaneous observations, often failing when objects are not fully visible. In contrast, tracking in humans is underlined by the notion of object permanence: once an object is recognized, we are aware of its physical existence and can approximately localize it even under full occlusions.

Not sure if Tesla has it or not, but there is a difference between object permanence and tracking objects across frames.

2 comments

I don't know how the various car teams implement this currently, but I find it hard to believe that this is some novel or SOTA approach for them. Kalman filtering and state space models are explicit object permanence models for tracking objects (they rely on the physical location/velocity model to keep estimating object locations when observations are missing). They have been used since the 1960s, and that's pretty standard material in engineering courses on navigation.
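
To make that concrete, here is a minimal sketch of a 1D constant-velocity Kalman filter that keeps predicting while the object is occluded. This is not how any particular car team does it; `Track1D`, `run`, and the noise values `q` and `r` are illustrative assumptions, and occluded frames are represented as `None`.

```python
from dataclasses import dataclass, field


@dataclass
class Track1D:
    """Constant-velocity Kalman filter over state [position, velocity]."""
    pos: float
    vel: float
    # 2x2 state covariance, stored as nested lists (assumed initial value).
    P: list = field(default_factory=lambda: [[1.0, 0.0], [0.0, 1.0]])

    def predict(self, dt, q=0.01):
        """Propagate the motion model; uncertainty grows."""
        self.pos += self.vel * dt
        p00, p01 = self.P[0]
        p10, p11 = self.P[1]
        # P <- F P F^T + q*I, with F = [[1, dt], [0, 1]]
        self.P = [
            [p00 + dt * (p01 + p10) + dt * dt * p11 + q, p01 + dt * p11],
            [p10 + dt * p11, p11 + q],
        ]

    def update(self, z, r=0.25):
        """Fuse a position measurement z with variance r."""
        p00, p01 = self.P[0]
        p10, p11 = self.P[1]
        s = p00 + r                   # innovation variance
        k0, k1 = p00 / s, p10 / s     # Kalman gain for H = [1, 0]
        y = z - self.pos              # innovation
        self.pos += k0 * y
        self.vel += k1 * y
        # P <- (I - K H) P
        self.P = [
            [(1 - k0) * p00, (1 - k0) * p01],
            [p10 - k1 * p00, p11 - k1 * p01],
        ]


def run(measurements, dt=1.0):
    """Return (position estimate, position variance) after each frame.

    A measurement of None means the object was not observed in that frame,
    so only the prediction step runs.
    """
    track = Track1D(pos=measurements[0], vel=0.0)
    history = []
    for z in measurements[1:]:
        track.predict(dt)
        if z is not None:
            track.update(z)
        history.append((track.pos, track.P[0][0]))
    return history


if __name__ == "__main__":
    # Object moving at about 1 unit/frame, then hidden for three frames.
    for pos, var in run([0.0, 1.0, 2.0, 3.0, 4.0, None, None, None]):
        print(f"pos={pos:.2f} var={var:.2f}")
```

During the `None` frames the position estimate keeps advancing at the learned velocity while the variance grows, which is the filter's way of saying "it is still there, I am just less sure where".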
I hear that definition, but in terms of computer science this is not a categorical difference but one on a scale.

Clearly every tracker has to be able to re-localise objects across frames, otherwise it is not tracking. If you want robust tracking, you aim for it not to lose track even if the object is lost or obscured for a few seconds. These are all tuning parameters and questions of scenario. If you have a strong track of a vehicle with lots of evidence, and you have strong priors about the road layout, then the vehicle can disappear behind a bus for hours and you will still maintain the information that it is there. If you have a fleeting, noisy observation of one pedestrian, and you don't really have a strong motion model for them (because, oh horror, pedestrians sometimes enter unmapped buildings and don't follow strict lanes!), then you might delete their track within seconds after they disappear.

So tracking creates information about objects, and how permanent they are is a tuning parameter. Some companies, under some circumstances, can choose to make the objects very permanent; other companies, or the same company under different circumstances, can have very fleeting objects.
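
As a rough sketch of what "permanence as a tuning parameter" can look like, here is a toy track manager where each object class gets its own budget of missed frames before its track is deleted. The names `Track`, `MAX_MISSED`, and `step`, and the budget values, are made up for illustration; real systems would typically also weigh track confidence and map priors.

```python
from dataclasses import dataclass


@dataclass
class Track:
    track_id: int
    label: str
    missed: int = 0  # consecutive frames without a matching detection


# Hypothetical budgets: missed frames a track of each class may accumulate.
MAX_MISSED = {"vehicle": 300, "pedestrian": 30}


def step(tracks, matched_ids, max_missed=MAX_MISSED):
    """Advance one frame: reset matched tracks, age the rest, drop expired ones."""
    survivors = []
    for t in tracks:
        t.missed = 0 if t.track_id in matched_ids else t.missed + 1
        if t.missed <= max_missed[t.label]:
            survivors.append(t)
    return survivors


if __name__ == "__main__":
    tracks = [Track(1, "vehicle"), Track(2, "pedestrian")]
    for _ in range(31):           # both objects stay hidden for 31 frames
        tracks = step(tracks, matched_ids=set())
    print([t.label for t in tracks])
```

Raising or lowering the numbers in `MAX_MISSED` is all it takes to make the same tracker look like it has strong or weak object permanence.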

Given this, how much information would you need about the state of every single self driving system to write down a sentence like this confidently: "For a self-driving car, a bicycle that is momentarily hidden by a passing van is a bicycle that has ceased to exist."

I would be cautious writing such a sweeping generalisation even about bakeries and bread making, and that's a technology which has been practiced for thousands of years.

Here is what the author of the article could have done: pick a specific failure of a specific self-driving project and say "sometimes self-driving cars struggle with object permanence." It's not like you have to go far for an example. One of the root causes of the Elaine Herzberg accident was the car's inability to maintain a consistent track of her across observations.