Hacker News
by delusional 1202 days ago
I've been playing around with the 30B version all day. The biggest improvements I've seen have come from changing the way I prompt (strike a more in medias res style; the model really likes continuing text and gets confused if you give it a blank slate) and from implementing top_k sampling (also discard the top_p=0 nonsense; you want top_p>1.0 to turn it off). It's important to note that the llama.cpp project does NOT implement top_k, even if you set that command-line parameter.
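For anyone unsure what the two knobs do, here is a rough sketch of top_k and top_p filtering in plain Python. It is not llama.cpp's code. The function names (`softmax`, `filter_top_k_top_p`, `sample`) and the default values are made up for illustration, and it assumes the usual definitions: top_k keeps the k most likely tokens, and top_p keeps the smallest set of tokens whose cumulative probability reaches p.

```python
import math
import random


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def filter_top_k_top_p(logits, top_k=40, top_p=1.0):
    """Return {token_id: prob} after top_k, then top_p, filtering.

    Assumed conventions for this sketch (hypothetical, not llama.cpp's):
    top_k <= 0 disables the top_k cut; top_p >= 1.0 disables the top_p cut.
    """
    probs = softmax(logits)
    # Rank token ids from most to least likely.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # top_k: keep only the k most likely tokens.
    if top_k > 0:
        ranked = ranked[:top_k]

    # top_p: keep the smallest prefix whose cumulative probability reaches top_p.
    if top_p < 1.0:
        kept, cumulative = [], 0.0
        for i in ranked:
            kept.append(i)
            cumulative += probs[i]
            if cumulative >= top_p:
                break
        ranked = kept

    # Renormalize over the surviving tokens.
    total = sum(probs[i] for i in ranked)
    return {i: probs[i] / total for i in ranked}


def sample(logits, top_k=40, top_p=1.0, rng=random):
    """Draw one token id from the filtered distribution."""
    dist = filter_top_k_top_p(logits, top_k, top_p)
    ids = list(dist)
    return rng.choices(ids, weights=[dist[i] for i in ids], k=1)[0]
```

In this sketch, top_p=0 keeps only the single most likely token, so sampling collapses to greedy decoding, while any top_p at or above 1.0 skips the cut entirely and leaves top_k as the only filter.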
1 comment

top_k is now implemented