by aimkey
1823 days ago
It seems as if the people gobbling up the "Tesla has the data! Autopilot will keep getting better!" line have never trained a neural network in their lives. Models converge. Loss stops decreasing, regardless of more incoming data. Extensive manual data cleaning becomes necessary to prevent overfitting. Model architecture has to change and hyperparameters have to be tweaked. Then you're back at square one as far as testing goes if you change any of those things. The notion that Tesla's model HAS to keep improving simply because they will be able to pile on more (unlabeled!) data is laughably false. And, in fact, quite insulting to the intelligence of even the most casual ML engineers.
Exactly, casual ML engineers. Plateauing tends to occur because there is no more novelty to be had in the data. What mega-experiments like GPT and similar have shown us is that you can actually keep adding novel data and keep improving the model. Kinda inelegant, yet effective. The problem is that most institutions can't add more novelty beyond a certain scale, since that usually means shoveling more money at data storage and compute, on top of the cost of collecting the novel data.
Tesla merely has to open the money tap to get more of both compute and storage, and let the real-time data flow in.