Hacker News
by Initial_BP 1341 days ago
Not sure about the current state of the actual ML, but compared to other self-driving companies, Tesla has a treasure trove of data because they have so many vehicles on the road at all hours of every day. The edge cases are the hard part to identify and solve, so having all that drive-time data to find them would seem to give Tesla a big advantage.
2 comments

Most self-driving is about avoiding collisions and signalling intent, especially when streets are narrow and there's merging or shared use. The physics of cars, people, bikes and kids around roads is well understood (acceleration, velocity). This can be simulated, and a game engine can generate data to train on through virtual sensors. There's no reason to require time on the road.
But you'll never be able to come up with all of the possible scenarios to simulate. What Tesla has demonstrated is creating virtual scenarios where they can dynamically adjust all factors (light, weather, traffic, etc.) and base them on real-world situations they've encountered where their model failed.
Maybe not manually, but surely you could develop an adversarial ML model that quickly and concurrently tests scenarios.
What data is that model based on? Tesla has the data from real-world failures to build that model. Does anyone else?
You can't discount all the data Waymo has collected over nearly a decade or the scenarios they've manually created. They also have the world's most complete map and spatial dataset, which could easily be extended to create a model that generates tricky roadways. Simulating obstructions or hardware failures doesn't require very much data at all.

If you are modeling scenarios the way a game engine would, a "discriminator" model isn't necessary: you just check whether the simulation ends in a crash.
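To make that concrete, here's a toy sketch of the idea, not anything Tesla or Waymo actually runs. It assumes a made-up one-dimensional car-following scenario (a lead car brakes, the follower reacts after a delay), and every name in it (`Scenario`, `crashes`, `find_failures`), the parameter ranges, and the 6 m/s^2 follower braking rate are hypothetical. The pass/fail check is just "did the gap close to zero", and plain random sampling stands in for a smarter adversarial search.

```python
from __future__ import annotations

import random
from dataclasses import dataclass


@dataclass
class Scenario:
    gap_m: float        # initial gap to the lead vehicle (m)
    speed_mps: float    # initial speed of both vehicles (m/s)
    lead_brake: float   # lead vehicle deceleration (m/s^2)
    reaction_s: float   # delay before the follower starts braking (s)


def crashes(s: Scenario, ego_brake: float = 6.0, dt: float = 0.01) -> bool:
    """Step simple 1-D kinematics; True if the follower reaches the lead car."""
    lead_x, ego_x = s.gap_m, 0.0
    lead_v = ego_v = s.speed_mps
    t = 0.0
    # Once the follower has stopped, no later collision is possible,
    # because the lead car never moves backwards.
    while ego_v > 0.0:
        lead_v = max(0.0, lead_v - s.lead_brake * dt)
        if t >= s.reaction_s:
            ego_v = max(0.0, ego_v - ego_brake * dt)
        lead_x += lead_v * dt
        ego_x += ego_v * dt
        if ego_x >= lead_x:
            return True
        t += dt
    return False


def find_failures(n: int, seed: int = 0) -> list[Scenario]:
    """Sample n random scenarios and keep the ones that end in a crash."""
    rng = random.Random(seed)
    failures = []
    for _ in range(n):
        s = Scenario(
            gap_m=rng.uniform(5, 60),
            speed_mps=rng.uniform(5, 35),
            lead_brake=rng.uniform(1, 9),
            reaction_s=rng.uniform(0.1, 1.5),
        )
        if crashes(s):
            failures.append(s)
    return failures


if __name__ == "__main__":
    bad = find_failures(1000)
    print(f"{len(bad)} of 1000 sampled scenarios ended in a crash")
```

The failing scenarios are the interesting output: in a real pipeline they would be the cases you replay against the driving model, and the sampler is where an adversarial search or real-world failure data would plug in.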

I'm not discounting their data, I just think Tesla has so much more. Even if you look at just the cars opted into the FSD Beta, you have a larger fleet actively running the model, with feedback loops capturing every failure. But cars without FSD are still running the model and capturing data as well.
If the data doesn't have the details required to build accurate models, then the data is just costing Tesla money. Since Teslas are camera-only, with telemetry, they can replay scenarios on existing roads, but what happens when someone cones off half of the road?
I imagine they have a way to replay incidents from their cars in a simulation, and even if it's not super accurate, they could likely look at camera data and rebuild a similar situation (in sim or in real life) to identify and test the edge case.

I work at another autonomous car company (as a security engineer, not in ML-related work), and I know we have a lot of simulated situations that we run the ML against, and we add more from situations collected during actual driving.