Hacker News new | ask | show | jobs
by joshuaisaact 68 days ago
This was a really interesting paper but there's a massive gap in what they didn't try, which is inference-time temperature changes based on the fork/lock distinction.

Maybe I'll try that myself, because it feels like it could be a great source of improvements. It would be really useful to see adaptive per-token sampling as an additional decode-only baseline.

1 comments

Is this some kind of calibration then? I'd expect that the probabilities automatically adjust during training, such that in "lock" mode, for example, syntax-breaking tokens have a very low probability and would not be picked even wich higher temperature.