by little_name
762 days ago
There is a thread below discussing temperature tuning and why it isn't tied to creativity. I think the point is that temperature and the other sampling parameters have no inherent mechanism for controlling how closely the output adheres to the training patterns as a whole (the top-X candidates all come from the training data anyway). Deviating from the maximum-probability next token has nothing to do with thinking outside the box. It does yield more varied generations, some of which may get lucky and hit something, but I doubt that is cost-effective.
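To make the mechanism concrete, here is a minimal sketch, assuming a toy five-token vocabulary with made-up logits. The numbers and the function names `softmax_with_temperature` and `sample_top_k` are illustrative, not taken from any real model or library. It shows that temperature only flattens or sharpens the probabilities the model already assigned: the ranking of candidates stays the same, and top-k sampling can only ever return one of those pre-scored candidates.

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, rescaled by temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_top_k(logits, temperature, k, rng):
    """Sample one token index from the k highest-scoring candidates."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    probs = softmax_with_temperature([logits[i] for i in top], temperature)
    return rng.choices(top, weights=probs, k=1)[0]

# Hypothetical next-token scores for a 5-token vocabulary (assumed values).
logits = [4.0, 3.0, 2.0, 0.5, -1.0]
rng = random.Random(0)

for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    # Candidate order by probability; this does not change with temperature.
    ranking = sorted(range(len(logits)), key=lambda i: probs[i], reverse=True)
    draws = [sample_top_k(logits, t, 3, rng) for _ in range(10)]
    print(t, [round(p, 3) for p in probs], ranking, draws)
```

Raising the temperature makes lower-ranked tokens more likely to be drawn, but every draw still comes from the same scored set, which is the sense in which sampling parameters alone do not move the output away from what the model learned.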