by spion 1063 days ago
It's greedy and random :) Instead of a paper, I would recommend reading the sampling code of most LLM implementations (rwkv.cpp has a relatively clean implementation in Python: https://github.com/saharNooby/rwkv.cpp/blob/master/rwkv/samp...)

I guess I need to sit down and study this stuff in more detail, but do I understand correctly that the code you shared makes the decisions for each position independently? I am just astonished that this produces any coherent output. Also it is not clear to me how the length of the output sequence is determined.
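Generation ends once the stop token is picked: with greedy decoding that is when it is the likeliest token, with random sampling when it happens to be drawn.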
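
To make both points concrete, here is a toy sketch of the usual generation loop. It is not the rwkv.cpp code: `toy_logits`, `STOP`, the four-token vocabulary and the scores are all made up for illustration. The sampler only looks at one position, but each chosen token is appended to the context before the model is asked again, so the decisions are not independent. The output length is simply however many steps pass before the stop token is picked (or a hard cap is hit).

```python
import math
import random

STOP = 0  # hypothetical id of the stop token in a 4-token toy vocabulary


def toy_logits(context):
    """Made-up stand-in for a language model: one score per token.

    The score of STOP grows with the context length, so generation
    eventually ends. A real model would compute these from the context.
    """
    return [0.5 * len(context) - 2.25, 1.0, 0.5, 0.0]


def sample(logits, temperature=1.0, rng=random):
    """Pick one token id: greedy if temperature == 0, else a random draw."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    top = max(scaled)
    # Unnormalised softmax weights; random.choices normalises them itself.
    weights = [math.exp(s - top) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def generate(prompt, max_tokens=50, temperature=1.0, rng=random):
    """Autoregressive loop: every pick is fed back in before the next one."""
    context = list(prompt)
    for _ in range(max_tokens):
        token = sample(toy_logits(context), temperature, rng)
        if token == STOP:
            break  # this is what determines the output length
        context.append(token)  # the next step is conditioned on this choice
    return context[len(prompt):]


if __name__ == "__main__":
    print(generate([1], temperature=0))   # greedy decoding
    print(generate([1], temperature=1.0)) # random sampling, varies per run
```

With `temperature=0` the loop stops at the first step where the stop token has the highest score; with a positive temperature it can stop earlier or later, depending on the draw.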