by pyentropy
4 days ago
The number of tokens you predict at a time (multi-token or not) has nothing to do with whether the model wants to emit no, some, or a lot of reasoning tokens inside the reasoning tag, much like how branch prediction doesn't really change a for loop's iteration count.
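A toy sketch of the point, not a real decoder: `next_token`, the `</think>` tag, and the count of 7 reasoning tokens are all made up for illustration. It assumes every predicted token is exactly what the model would have emitted one at a time, so only the number of decoding steps changes.

```python
# Hypothetical toy "model": the token it wants next depends only on the prefix.
def next_token(prefix):
    # Wants 7 reasoning tokens, then closes the (assumed) reasoning tag.
    return "</think>" if len(prefix) >= 7 else f"r{len(prefix)}"

def decode(tokens_per_step):
    """Emit up to `tokens_per_step` tokens per step until the tag closes."""
    out, steps = [], 0
    while not out or out[-1] != "</think>":
        steps += 1
        for _ in range(tokens_per_step):
            out.append(next_token(out))
            if out[-1] == "</think>":
                break
    return out, steps

single, single_steps = decode(1)  # one token per step
multi, multi_steps = decode(4)    # up to four tokens per step

print(single == multi)            # same reasoning tokens either way
print(single_steps, multi_steps)  # only the step count differs
```

How many tokens come out is decided by `next_token`; `tokens_per_step` only affects how many passes it takes to get there.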