by jacob019
392 days ago
I don't think that's accurate. The logits are themselves high-dimensional (one entry per vocabulary token), and they are intermediate outputs used to sample tokens. The latent representations carry contextual information and are also high-dimensional, but they serve a different role: they feed into the logits.

Reasoning is definitely not happening in the linear projection to logits, if that's what you mean.
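
To make the distinction concrete, here is a rough toy sketch of that last step, not any particular model's code. The sizes, the random weights, and the names `project_to_logits`, `softmax`, and `sample_token` are all made up for illustration. The hidden vector stands in for the latent representation, and the projection to logits is nothing more than one dot product per vocabulary entry.

```python
import math
import random


def project_to_logits(hidden, unembed):
    """Linear projection: one dot product per vocabulary entry."""
    return [sum(w * h for w, h in zip(row, hidden)) for row in unembed]


def softmax(logits):
    """Turn logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(probs, rng):
    """Draw one token id according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


rng = random.Random(0)
d_model, vocab_size = 8, 20  # toy sizes, assumed for illustration

# Stand-in for the final latent representation (normally produced by the layers).
hidden = [rng.gauss(0, 1) for _ in range(d_model)]
# Stand-in for the output projection matrix: vocab_size rows of length d_model.
unembed = [[rng.gauss(0, 1) for _ in range(d_model)] for _ in range(vocab_size)]

logits = project_to_logits(hidden, unembed)  # length vocab_size
probs = softmax(logits)
token_id = sample_token(probs, rng)
```

Everything contextual has to already be in `hidden` by the time this runs; the projection itself is a single matrix multiply with no nonlinearity.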