by jorleif
4595 days ago
I agree, but would like to add the aspect of uncertainty. When reasoning from analogy, you usually have good statistical knowledge about the properties of the problem. For example, the targeted product category may already exist and customer behavior is known, so predicting what would happen in some nearby configuration is usually somewhat accurate. In the first-principles case, on the other hand, you need a very accurate theory, because you are "far away" from what currently exists. If the theory in question concerns physics, as for Musk, then this can work well. But if it concerns the social sciences and you need to predict customer behavior from some kind of first-principles model (e.g. rational agent models), then you are very likely to mispredict, since human behavior is complicated and the theories are inaccurate and overly general.

The good thing about gradient descent is that you do NOT need a model. You just focus on a few parameters and work out which direction gives the best improvement from your current, relatively good point, where the billions of other parameters are already accounted for and assumed to be independent of the direction you are moving in.
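
To make that concrete, here is a rough sketch in Python of model-free local improvement. Everything in it is made up for illustration: `black_box` stands in for whatever outcome you can measure but cannot explain, and `local_improve`, `tune`, `step` and `eps` are hypothetical names. The only thing the procedure does is nudge a few chosen parameters, measure whether the score got better, and move accordingly.

```python
def local_improve(score, point, tune, step=0.1, eps=1e-4, iters=200):
    """Lower `score` by probing only the coordinates listed in `tune`.

    No model of `score` is used: each slope is estimated by nudging one
    coordinate and measuring the change. All other coordinates are left
    exactly where they are.
    """
    x = list(point)
    for _ in range(iters):
        for i in tune:
            bumped = x[:]
            bumped[i] += eps
            # Finite-difference estimate of the slope along coordinate i.
            slope = (score(bumped) - score(x)) / eps
            # Step against the slope, since lower score is better here.
            x[i] -= step * slope
    return x


def black_box(x):
    # Hypothetical stand-in for an outcome we can measure but not model.
    return (x[0] - 3.0) ** 2 + (x[1] + 1.0) ** 2 + 0.5 * x[2] ** 2


start = [2.5, -0.5, 4.0]  # an existing, "relatively good" configuration
result = local_improve(black_box, start, tune=[0, 1])
```

Only the first two parameters are tuned; the third is never touched, which is the "already accounted for" part. Note that this only finds the nearest local improvement, so it says nothing about configurations far from the starting point.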