by pu_pe
79 days ago
Fascinating stuff. Any chance of using a sparse autoencoder or some other method to try to grasp what the model is actually doing in those middle layers? It would be quite cool to get a better sense of what type of input it is getting the first time it goes through the reasoning circuit compared to the second or third time.