It doesn't have to lag, though. You could ask GPT-2 to explain GPT-2. The weights are just input data. The reason this wasn't done on GPT-3 or GPT-4 is that a) they're much bigger, and b) they're deeper, so the roles of individual neurons are more attenuated.