


	by daemonologist 475 days ago
	It says "wait" (as in "wait, no, I should do X") so much while reasoning it's almost comical. I also ran into the "catastrophic forgetting" issue that others have reported - it sometimes loses the plot after producing a lot of reasoning tokens. Overall though quite impressive if you're not in a hurry.

2 comments

huseyinkeles 475 days ago

I read somewhere (I can't find it now) that the -reasoning- models were trained heavily to keep saying "wait" so they keep reasoning and don't return early.


rahimnathwani 475 days ago

Is the model using budget forcing?


Szpadel 475 days ago

I don't understand why you would force "wait" when the model wants to output </think>.

Why not just decrease </think> probability? If the model really wants to finish, maybe it could overpower the penalty in cases where it's a really simple question. And it would definitely allow the model to express its next thought more freely.
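To make the "decrease the probability" idea concrete, here is a rough sketch of a soft penalty: subtract a constant from the </think> logit before the softmax. Everything in it is made up for illustration (the three-token vocabulary, the logit values, the `PENALTY` constant, the helper names); it is not how any particular model or library does it.

```python
import math

def softmax(logits):
    """Turn a {token: logit} dict into a {token: probability} dict."""
    m = max(logits.values())
    exps = {t: math.exp(v - m) for t, v in logits.items()}
    total = sum(exps.values())
    return {t: e / total for t, e in exps.items()}

def penalize(logits, token, penalty):
    """Return a copy of logits with `penalty` subtracted from one token."""
    out = dict(logits)
    out[token] = out[token] - penalty
    return out

END = "</think>"
PENALTY = 2.0  # hypothetical value, would need tuning

# Toy logits: one step where the model is on the fence,
# one where it strongly wants to stop thinking.
unsure = {"wait": 1.0, "so": 0.5, END: 1.5}
confident = {"wait": 0.0, "so": 0.0, END: 6.0}

for name, logits in [("unsure", unsure), ("confident", confident)]:
    before = softmax(logits)[END]
    after = softmax(penalize(logits, END, PENALTY))[END]
    print(f"{name}: P({END}) {before:.2f} -> {after:.2f}")
```

With these toy numbers the penalty flips the "unsure" step away from </think>, while the "confident" step still picks it, which is the overpowering behavior described above.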


rahimnathwani 474 days ago

  why not just decrease </think> probability?

Huggingface's transformers library supports something similar to this. You set a minimum length, and until that length is reached, the end of sequence token has no chance of being output.

https://github.com/huggingface/transformers/blob/51ed61e2f05...

S1 does something similar to put a lower limit on its reasoning output. End of thinking is represented with the <|im_start|> token, followed by the word 'answer'. IIRC the code dynamically adds/removes <|im_start|> to the list of suppressed tokens.

Both of these approaches set the probability to zero, not something small like you were suggesting.
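For comparison, a minimal sketch of the hard version: the blocked token's logit is set to negative infinity until a minimum length is reached, so its probability is exactly zero. This is plain Python written for illustration, not code from transformers or s1; the token names, the `suppress_until` and `generate` helpers, and the toy logits are all assumptions.

```python
import math

def suppress_until(logits, generated_len, min_len, blocked):
    """Copy logits, masking blocked tokens to -inf while below min_len."""
    if generated_len >= min_len:
        return dict(logits)
    return {t: (-math.inf if t in blocked else v) for t, v in logits.items()}

def greedy(logits):
    """Pick the highest-scoring token."""
    return max(logits, key=logits.get)

def generate(step_logits, min_len, blocked):
    """Greedy decode over precomputed per-step logits (a stand-in for a model)."""
    out = []
    for logits in step_logits:
        tok = greedy(suppress_until(logits, len(out), min_len, blocked))
        out.append(tok)
        if tok in blocked:
            break  # the end marker got through, stop
    return out

END = "<end_of_thinking>"  # placeholder name for the end-of-thinking marker
# The toy model wants to stop at every step.
steps = [
    {"a": 0.1, END: 5.0},
    {"b": 0.2, END: 5.0},
    {"c": 0.3, END: 5.0},
    {"d": 0.0, END: 5.0},
]
tokens = generate(steps, min_len=3, blocked={END})
print(tokens)
```

Unlike a soft penalty, no logit value can get the blocked token through before `min_len`, however confident the model is.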


rosspackard 475 days ago

I have a suspicion it does use budget forcing. The word "alternatively" also frequently shows up, and it happens where it seems that a </think> tag could logically have been placed.
