by lappa
359 days ago
This isn't suggesting that no one understands how these models are architected, nor is anyone saying that SDPA (scaled dot-product attention) or matrix multiplication isn't understood by the people who build these systems. What's being said is that the result of training, and the way information is processed in latent space, is opaque. There are strategies for dissecting a model's inner workings, but this is an active and still incomplete field of research.
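
To illustrate how well understood the mechanics are, here is a minimal pure-Python sketch of SDPA. It assumes a single head with no masking or batching, and the names `softmax` and `sdpa` are my own, not from any library. The arithmetic fits in a few lines; what stays opaque is what the trained weights producing Q, K and V end up encoding.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def sdpa(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.

    Q, K, V are lists of rows (lists of floats). K and V must have the
    same number of rows; Q and K must have the same row width d_k.
    """
    d_k = len(K[0])
    scale = 1.0 / math.sqrt(d_k)
    out = []
    for q in Q:
        # Similarity of this query to every key, scaled by 1/sqrt(d_k).
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in K]
        # Attention weights are non-negative and sum to 1.
        weights = softmax(scores)
        # Output row is the weighted average of the value rows.
        out.append([
            sum(w * v[j] for w, v in zip(weights, V))
            for j in range(len(V[0]))
        ])
    return out
```

Every step here is ordinary linear algebra plus a softmax. Interpretability research asks a different question: why the learned projections make particular queries attend to particular keys.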