by tmnvdb
496 days ago
Interesting stuff. As the authors note, latent reasoning seems to be a way to sink more compute into the model and get better performance without increasing the model size. Good news for those on a steady diet of 'scale pills'.