by tikkun
1120 days ago
> Are there papers that peruse what kind of concepts the model is actually building/learning in those heads and layers?

> There are large teams who spend months tuning those models. Do those teams have access to those internal concepts that the model built up and organized? Is any of this work public?

See: https://openai.com/research/language-models-can-explain-neur...

My understanding: generally, the models are compressing their understanding of all the text they were fed during pre-training, and in doing so they learn higher-order concepts that make that compression better: more compressed, and with less loss.
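
To make the compression/loss link concrete, here is a toy sketch (my own illustration, not something from the linked research). It assumes the standard information-theory result that a token predicted with probability p costs about -log2(p) bits to encode, so a lower average loss means a shorter encoding. The names `bits_to_encode`, `uniform_model`, and `alternating_model` are made up for this example, and the "concept" the second model knows (letters tend to alternate) is a stand-in for whatever a real model learns.

```python
import math

VOCAB = "ab"  # hypothetical two-symbol vocabulary for the toy example


def bits_to_encode(tokens, predict):
    """Ideal code length, in bits, of `tokens` under a predictive model.

    `predict(context)` returns a dict mapping every possible next token
    to its probability. Summing -log2(p) over the sequence is the same
    quantity as the cross-entropy loss (in bits), just not averaged.
    """
    total = 0.0
    for i, tok in enumerate(tokens):
        p = predict(tokens[:i])[tok]
        total += -math.log2(p)
    return total


def uniform_model(context):
    # Knows nothing about the text: every symbol is equally likely.
    return {t: 1 / len(VOCAB) for t in VOCAB}


def alternating_model(context):
    # Has one toy "concept": the next letter usually differs from the last.
    if not context:
        return {t: 1 / len(VOCAB) for t in VOCAB}
    last = context[-1]
    return {t: (0.1 if t == last else 0.9) for t in VOCAB}


text = "abababababababab"
uniform_bits = bits_to_encode(text, uniform_model)
alternating_bits = bits_to_encode(text, alternating_model)

print(f"uniform model:     {uniform_bits:.2f} bits")
print(f"alternating model: {alternating_bits:.2f} bits")
```

The model that has picked up the pattern needs far fewer bits for the same text, which is the sense in which learning a useful concept and compressing better are the same thing.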
Are these higher order concepts accessible to us? E.g. can we list those learned concepts?
(Re-reading the paper you linked now...)