> I'd be careful of over applying the "bias-variance tradeoff." How to define the variance of a model is not a simple task. I wouldn't say it is immediately obvious how bias-variance relates to small data scenarios.

It's very important for anyone studying ML to understand how bias-variance relates to sample size, so I'd encourage finding more resources if this note didn't help clarify! Here's another shot at a summary: for a fixed training sample size, you must trade off between sensitivity to randomness in the sample (variance) and reliance on assumptions that bias the model you train (bias).
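
If it helps to see that tradeoff numerically, here's a rough sketch rather than anything definitive. Everything in it is my own assumption for illustration: the target function (a sine wave), the noise level, the sample size of 15, and the helper name `bias_variance`. It refits polynomials of a given degree on many freshly drawn small training samples, then measures how far the average fit is from the truth (squared bias) and how much individual fits scatter around that average (variance).

```python
import numpy as np

def bias_variance(degree, n_train=15, n_trials=300, noise=0.5, seed=0):
    """Estimate squared bias and variance of a polynomial fit by simulation.

    All settings here (sine target, noise level, sample size) are
    illustrative assumptions, not taken from the discussion above.
    """
    rng = np.random.default_rng(seed)
    x_test = np.linspace(-1.0, 1.0, 50)

    def true_f(x):
        return np.sin(np.pi * x)

    preds = np.empty((n_trials, x_test.size))
    for t in range(n_trials):
        # Draw a fresh small training sample each trial.
        x = rng.uniform(-1.0, 1.0, n_train)
        y = true_f(x) + rng.normal(0.0, noise, n_train)
        coefs = np.polyfit(x, y, degree)
        preds[t] = np.polyval(coefs, x_test)

    mean_pred = preds.mean(axis=0)
    # Squared bias: gap between the average fit and the true function.
    bias_sq = float(np.mean((mean_pred - true_f(x_test)) ** 2))
    # Variance: spread of individual fits around the average fit.
    variance = float(np.mean(preds.var(axis=0)))
    return bias_sq, variance

if __name__ == "__main__":
    for degree in (0, 3, 9):
        b, v = bias_variance(degree)
        print(f"degree={degree}  bias^2={b:.3f}  variance={v:.3f}")
```

With the sample size held fixed, the constant model (degree 0) bakes in a strong assumption and should show high bias but low variance, while the degree 9 model assumes much less and should swing heavily from sample to sample.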

It's true that quantifying variance and bias can be hard, and you need systems like PAC learning to go further and actually estimate sufficient sample sizes for a task. But you can still reason usefully about any system that involves using data to select (train) among a class of potential output models!
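
As one concrete instance of the kind of estimate PAC learning gives, here's a minimal sketch of the textbook bound for a finite hypothesis class in the realizable case: with at least (ln|H| + ln(1/δ)) / ε samples, any hypothesis consistent with the training data has true error at most ε with probability at least 1 - δ. The function name `pac_sample_size` and the example numbers are mine, and real model classes are rarely finite or realizable, so treat this as an illustration of the reasoning and not a practical recipe.

```python
import math

def pac_sample_size(hypothesis_count, epsilon, delta):
    """Sufficient sample size for a finite hypothesis class (realizable case).

    Uses the standard bound m >= (ln|H| + ln(1/delta)) / epsilon.
    The function name and arguments are hypothetical, for illustration only.
    """
    if hypothesis_count < 1 or not (0 < epsilon < 1) or not (0 < delta < 1):
        raise ValueError("need |H| >= 1 and epsilon, delta in (0, 1)")
    bound = (math.log(hypothesis_count) + math.log(1.0 / delta)) / epsilon
    return math.ceil(bound)

if __name__ == "__main__":
    # A richer (less biased) hypothesis class needs more data for the same guarantee.
    for size in (10**3, 10**6, 10**9):
        print(size, pac_sample_size(size, epsilon=0.1, delta=0.05))
```

The point to take away is the direction of the dependence: growing the class of candidate models (fewer built-in assumptions) pushes the required sample size up.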

For example, the statement about meta-learning is incorrect, at least as far as I've seen the term used. Meta-learning involves learning hyperparameters (including functions) that are then used to train a model. The extra stage makes these models less biased, but they actually require more data. (Of course, in some meta-learning systems, hyperparameters are learned with the help of external data, which is a form of transfer learning.)