Ironic, since the relatively recently discovered double descent makes it clear that the bias-variance tradeoff as we know it from statistical learning theory simply doesn't apply to "overparameterized" deep models.
Much of the old theory is barely applicable and people are, understandably, bewildered and in denial.
If someone is inclined toward theory, I'd just recommend reading papers that don't try to oversimplify the domain:
I don't believe it's oversimplifying the domain. For instance, the reference I pointed to has a section dedicated to double descent (sec 11.2). You may also be surprised that such a phenomenon can be observed on toy convex examples from "old theory" (sec 11.2.3), as you call it.
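For what it's worth, here is a minimal sketch of one such toy convex example. It is not taken from the reference: the setup (minimum-norm least squares on the first p of D isotropic Gaussian features) and every name and parameter value below are my own assumptions for illustration. The test risk should peak near the interpolation threshold p = n and come back down as p keeps growing.

```python
import numpy as np


def min_norm_risk(n_train, n_used, n_total, noise_std, rng):
    """Excess test risk of min-norm least squares fit on the first n_used features."""
    # Hypothetical data model: unit-norm true coefficients spread over all features.
    beta = rng.standard_normal(n_total)
    beta /= np.linalg.norm(beta)
    X = rng.standard_normal((n_train, n_total))
    y = X @ beta + noise_std * rng.standard_normal(n_train)

    # lstsq returns the minimum-norm solution when the system is underdetermined.
    coef, *_ = np.linalg.lstsq(X[:, :n_used], y, rcond=None)

    # Unused features get a zero coefficient.
    beta_hat = np.zeros(n_total)
    beta_hat[:n_used] = coef

    # For isotropic Gaussian inputs, the excess risk is the squared parameter error.
    return float(np.sum((beta_hat - beta) ** 2))


def risk_curve(n_train=40, n_total=400, widths=(10, 20, 40, 80, 400),
               noise_std=0.5, trials=50, seed=0):
    """Average excess risk over several random draws, for each model width p."""
    rng = np.random.default_rng(seed)
    return {
        p: float(np.mean([min_norm_risk(n_train, p, n_total, noise_std, rng)
                          for _ in range(trials)]))
        for p in widths
    }


if __name__ == "__main__":
    for p, risk in risk_curve().items():
        print(f"p = {p:3d}  average excess risk = {risk:.2f}")
```

Every fit here is an ordinary convex least-squares problem, so nothing deep-learning-specific is involved; the bump at p = n comes from the variance blowing up when the design matrix is square and nearly singular.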
Anyways, I still believe that learning foundational stuff such as the bias-variance tradeoff is useful before diving into more advanced stuff. I even think that tackling recent research questions with old tools is insightful too. But that's only my opinion, and perhaps I'm in denial :)