| HN Mirror



	by agnosticmantis 596 days ago
	These papers don’t explain how pretrained LLMs learn in-context, because the simplified models they study are either pretrained on the same task that is then tested in-context, or have weights handpicked by humans to perform gradient descent (GD) at inference time. See this video for a good discussion: https://youtu.be/-yo2672UikU
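
To make the "handpicked weights" case concrete, here is a rough toy sketch of my own, not the construction from any particular paper. It assumes a linear regression task and a softmax-free attention layer; the function names, the identity projections, and the `lr / n` scale are all hypothetical choices made for illustration. With those weights fixed by hand, the attention output on a query equals the prediction after one GD step from zero weights, and nothing was learned to get there.

```python
import numpy as np

def gd_one_step_prediction(X, y, x_query, lr):
    # One gradient-descent step on L(w) = 1/(2n) * sum_i (w.x_i - y_i)^2,
    # starting from w = 0, followed by a prediction on the query point.
    n = X.shape[0]
    w0 = np.zeros(X.shape[1])
    grad = X.T @ (X @ w0 - y) / n
    w1 = w0 - lr * grad
    return w1 @ x_query

def linear_attention_prediction(X, y, x_query, lr):
    # Softmax-free attention with hand-set weights (an assumption of this sketch):
    # keys = x_i, values = y_i, query = x_query, identity projections, scale lr / n.
    n = X.shape[0]
    scores = X @ x_query  # one key-query dot product per context example
    return (lr / n) * (scores @ y)

# Synthetic in-context examples for a noiseless linear task.
rng = np.random.default_rng(0)
X = rng.normal(size=(16, 4))
w_true = rng.normal(size=4)
y = X @ w_true
x_query = rng.normal(size=4)

pred_gd = gd_one_step_prediction(X, y, x_query, lr=0.1)
pred_attn = linear_attention_prediction(X, y, x_query, lr=0.1)
print(np.allclose(pred_gd, pred_attn))
```

The match is an algebraic identity that holds because the weights were chosen to produce it, which is why a result like this says little about what a pretrained LLM actually does.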