Hacker News
by ydj, 30 days ago
How do you determine that the model was reasoning in Chinese in layer X? I would think the middle layers do not map into any tokens.
s314, 29 days ago
Using a logit lens (prior art: https://www.lesswrong.com/posts/AcKRB8wDpdaN6v6ru/interpreti...)
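For anyone unfamiliar with the idea, here is a rough sketch of what a logit lens does, as I understand it: take the hidden state at an intermediate layer, push it through the model's final normalization and unembedding matrix as if it were the last layer, and see which token comes out on top. Everything below is a toy assumption for illustration (the three-word vocabulary, the hand-picked hidden states, the 4-dimensional `unembed` matrix, and a layer norm with no learned scale or bias); it is not the setup from the linked post or from any particular model.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Normalize the last axis to zero mean and unit variance.
    Assumption: no learned scale/bias, unlike a real model's final norm."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def logit_lens(hidden_states, unembed, vocab):
    """Decode every layer's hidden state as if it were the final one.

    hidden_states: array of shape (n_layers, d_model)
    unembed:       array of shape (d_model, vocab_size)
    Returns a list of (top_token, probability), one entry per layer.
    """
    logits = layer_norm(hidden_states) @ unembed
    # Numerically stable softmax over the vocabulary axis.
    logits = logits - logits.max(axis=-1, keepdims=True)
    probs = np.exp(logits)
    probs /= probs.sum(axis=-1, keepdims=True)
    top = probs.argmax(axis=-1)
    return [(vocab[i], float(p[i])) for i, p in zip(top, probs)]

# Toy data (hypothetical): 3 layers, d_model = 4, vocabulary of 3 tokens.
vocab = ["cat", "猫", "chat"]
unembed = 4.0 * np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0],
])
hidden_states = np.array([
    [0.2, 0.1, 0.0, 0.3],  # early layer
    [0.0, 2.0, 0.5, 0.1],  # middle layer
    [3.0, 0.5, 0.2, 0.1],  # final layer
])

for layer, (token, prob) in enumerate(logit_lens(hidden_states, unembed, vocab)):
    print(f"layer {layer}: {token} (p={prob:.2f})")
```

With a real model you would swap the toy arrays for the per-layer residual stream activations and the model's own final norm and unembedding weights. The middle layers do not emit tokens themselves, but this projection gives a reading of which tokens they are closest to.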