Hacker News
by valine, 385 days ago
You’re thinking about this as if the final layer of the model were all that exists. It’s highly likely that reasoning happens at a lower layer, in a different latent space that can’t natively be projected into logits.
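
To make the point concrete, here is a toy sketch, not a real model. Everything in it is hypothetical: the 2-dimensional hidden state, the `FINAL_LAYER` and `UNEMBED` weights, and the helper names are made up for illustration. It only shows that pushing an intermediate hidden state straight through the unembedding can give a different answer than the final layer does, even though the final answer is fully determined by that intermediate state.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


# Hypothetical toy weights, chosen by hand for illustration only.
FINAL_LAYER = [[0.0, 1.0],
               [1.0, 0.0]]  # swaps the two hidden coordinates
UNEMBED = [[1.0, 0.0],
           [0.0, 1.0]]      # one row per token; maps hidden state to logits


def logits_from(hidden):
    """Project a hidden state into logits with the unembedding matrix."""
    return matvec(UNEMBED, hidden)


# Hidden state at a lower layer, in that layer's own basis.
intermediate = [2.0, 0.5]

# Hidden state after the final layer, in the basis the unembedding expects.
final = matvec(FINAL_LAYER, intermediate)

# Reading the lower layer directly through the unembedding picks one token...
early_guess = argmax(logits_from(intermediate))
# ...while the model's actual output picks the other.
final_guess = argmax(logits_from(final))

print("token read off the lower layer:", early_guess)
print("token the model actually emits:", final_guess)
```

The information needed for the final answer is already present in `intermediate`; it is just not laid out in the coordinates the unembedding reads. In a real transformer the mapping between layers is nonlinear and far less tidy than a coordinate swap, so treat this as an analogy only.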