by agnosticmantis
596 days ago
These papers don’t explain how pretrained LLMs learn in-context, because the simplified models in these papers are either pretrained on the same task that’s tested in-context, or have weights handpicked by humans to do GD at inference time. See this video for a good discussion:
https://youtu.be/-yo2672UikU
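
To make the "handpicked weights" case concrete, here is a minimal sketch of the kind of construction being described. It is an illustration under stated assumptions, not code from the papers or the video: a single softmax-free (linear) attention readout whose projections are set by hand to the identity reproduces the prediction of one gradient-descent step from zero weights on in-context linear regression. The function names, the learning rate `eta`, and the toy data are all hypothetical.

```python
import numpy as np


def gd_prediction(X, y, x_q, eta):
    """Predict for x_q after one GD step from w = 0 on
    L(w) = 1/(2n) * sum_i (w . x_i - y_i)^2."""
    n = X.shape[0]
    # Gradient at w = 0 is -(1/n) X^T y, so one step gives w = (eta/n) X^T y.
    w = (eta / n) * (X.T @ y)
    return w @ x_q


def linear_attention_prediction(X, y, x_q, eta):
    """Same prediction from a linear attention readout with hand-set weights.

    Keys are the context inputs x_i, values are the labels y_i, and the
    query is x_q. W_q and W_k are chosen by hand (identity), not learned.
    """
    n, d = X.shape
    W_q = np.eye(d)  # handpicked query projection
    W_k = np.eye(d)  # handpicked key projection
    scores = (X @ W_k.T) @ (W_q @ x_q)  # unnormalized dot products, shape (n,)
    return (eta / n) * (y @ scores)     # weighted sum of values, no softmax


# Toy in-context regression task (all values hypothetical).
rng = np.random.default_rng(0)
n, d, eta = 32, 4, 0.1
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d)
x_q = rng.normal(size=d)

print(np.isclose(gd_prediction(X, y, x_q, eta),
                 linear_attention_prediction(X, y, x_q, eta)))
```

The two functions agree because both compute (eta/n) * sum_i y_i (x_i . x_q). The match comes entirely from weights that were set by hand, which says nothing about whether ordinary language-model pretraining arrives at such weights.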