HN Mirror



	by wodow 989 days ago
	Anthropic at https://x.com/anthropicai/status/1709986949711200722:
	> The fact that most individual neurons are uninterpretable presents a serious roadblock to a mechanistic understanding of language models. We demonstrate a method for decomposing groups of neurons into interpretable features with the potential to move past that roadblock.