by camelmel
20 days ago
LLM-written article. It's also not accurate; the fact that language models have human-interpretable representations and neurons has been known since BERT. Circuits research also does not come from Anthropic. Mech interp is a huge field in academia and most of the core circuit analysis papers were from OpenAI/GDM/academia. However, Anthropic tends to produce a lot of blog posts where they draw poorly supported but hype-able analogies between LLMs and biological intelligence. It's wild. For a better understanding of mech interp and circuits, including what we actually do know about LLM internals, I would recommend reading this paper: https://arxiv.org/pdf/2501.16496
> the fact that language models have human-interpretable representations and neurons has been known since BERT... Circuits research also does not come from Anthropic...

The article does not claim Anthropic invented the field, only that they have made important contributions to it. It is intended as an overview of a specific set of ideas that are working in mechanistic interpretability, not a formal literature review.