Hacker News
by pona-a 478 days ago
In-context learning is a very different problem from regular prediction. Fitting a stationary solution to noisy data is quite simple: it's just a matter of tuning some parameters with fairly even gradients. In-context learning implies you're essentially learning a mesa-optimizer for the class of problems you're facing, which for transformers essentially means fitting something not that far from a differentiable Turing machine with no inductive biases.
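
To make the "stationary solution" half concrete, here's a rough sketch of what that easy case can look like: plain gradient descent fitting a line to noisy samples. The target (slope 2.0, intercept 0.5), the noise level, the learning rate, and the name `fit_line` are all made up for illustration. It says nothing about how in-context learning itself is trained; it only shows the simple baseline being contrasted against.

```python
import random


def fit_line(xs, ys, lr=0.05, steps=2000):
    """Gradient descent on mean squared error for y ~ w*x + b."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical data: one fixed (stationary) linear rule plus Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(-1.0, 1.0) for _ in range(200)]
ys = [2.0 * x + 0.5 + rng.gauss(0.0, 0.1) for x in xs]

w, b = fit_line(xs, ys)
print(f"w={w:.2f}, b={b:.2f}")
```

The loss here is a smooth quadratic bowl in two parameters, so the fit lands close to the true slope and intercept without any tricks. The in-context case is harder because the model has to learn the fitting procedure itself, not one fixed `w` and `b`.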