| HN Mirror



	by ydj 30 days ago
	How do you determine that the model was reasoning in Chinese in layer X? I would think the middle layers do not map into any tokens.

1 comment

s314 29 days ago

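Using a logit lens: you push the intermediate layer's hidden state through the model's final layer norm and unembedding matrix, as if it were the last layer, and read off which tokens come out on top (prior art: https://www.lesswrong.com/posts/AcKRB8wDpdaN6v6ru/interpreti...)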
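
To make the idea concrete, here is a rough sketch of a logit lens on a toy model. Everything in it is an assumption for illustration: the three-token vocabulary, the identity unembedding matrix, the hand-picked hidden states, and the parameter-free layer norm are all made up, and the helper names (`layer_norm`, `softmax`, `logit_lens`) are hypothetical. It is not how any particular model or library does it. The point is only that a middle-layer vector can be decoded into tokens by reusing the output head.

```python
import math

def layer_norm(vec, eps=1e-5):
    # Parameter-free layer norm (assumption: a real model also has learned scale/bias).
    mean = sum(vec) / len(vec)
    var = sum((x - mean) ** 2 for x in vec) / len(vec)
    return [(x - mean) / math.sqrt(var + eps) for x in vec]

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def logit_lens(hidden_states, unembed, vocab):
    # For each layer's hidden state: normalize, project with the unembedding
    # matrix (one row per vocabulary token), and keep the most likely token.
    results = []
    for h in hidden_states:
        normed = layer_norm(h)
        logits = [sum(w * x for w, x in zip(row, normed)) for row in unembed]
        probs = softmax(logits)
        best = max(range(len(vocab)), key=probs.__getitem__)
        results.append((vocab[best], probs[best]))
    return results

# Toy data, entirely made up for this sketch.
VOCAB = ["cat", "猫", "dog"]
UNEMBED = [
    [1.0, 0.0, 0.0],  # direction for "cat"
    [0.0, 1.0, 0.0],  # direction for "猫"
    [0.0, 0.0, 1.0],  # direction for "dog"
]
HIDDEN = [
    [2.0, 0.5, 0.1],  # early layer: closest to "cat"
    [0.5, 2.0, 0.1],  # middle layer: closest to "猫"
    [3.0, 0.2, 0.1],  # final layer: back to "cat"
]

lens = logit_lens(HIDDEN, UNEMBED, VOCAB)
top_tokens = [tok for tok, _ in lens]
for layer, (tok, p) in enumerate(lens):
    print(f"layer {layer}: top token {tok!r} (p={p:.2f})")
```

In this toy, the middle layer's top token is the Chinese one even though the final output is English. With a real model you would take the per-layer hidden states from a forward pass and use the model's own final norm and unembedding weights in place of the made-up ones here.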