by ggerganov
519 days ago
There are 4 stopping criteria atm:

- Generation time exceeded (configurable in the plugin config)
- Number of tokens exceeded (not the case since you increased it)
- Indentation: stops generating if the next line has a shorter indent than the first line
- Small probability of the sampled token

Most likely you are hitting the last criterion. It's something that should be improved in some way, but I am not very sure how. Currently, it is using a very basic token sampling strategy with custom threshold logic that stops generating when the token probability is too low. Likely this logic is too conservative.
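For illustration only, the four criteria listed above could be combined roughly as in the Python sketch below. This is not the plugin's actual code: `StopConfig`, `stop_reason`, the field names, and the default thresholds are all hypothetical, and the indentation rule is approximated by comparing the latest generated line against the first generated line.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StopConfig:
    # Hypothetical names and defaults, not the plugin's real settings.
    max_time_ms: float = 1000.0   # generation time budget
    max_tokens: int = 128         # token budget
    min_token_prob: float = 0.1   # probability threshold for the sampled token


def indent_of(line: str) -> int:
    """Count leading spaces and tabs of a line."""
    return len(line) - len(line.lstrip(" \t"))


def stop_reason(text: str, n_tokens: int, last_token_prob: float,
                elapsed_ms: float, cfg: StopConfig) -> Optional[str]:
    """Return the name of the first criterion that fires, or None to continue."""
    # 1. Generation time exceeded.
    if elapsed_ms > cfg.max_time_ms:
        return "time"
    # 2. Number of tokens exceeded.
    if n_tokens > cfg.max_tokens:
        return "tokens"
    # 3. Indentation: the newest non-empty line is indented less than the first.
    lines = text.split("\n")
    if len(lines) > 1 and lines[-1].strip():
        if indent_of(lines[-1]) < indent_of(lines[0]):
            return "indent"
    # 4. The sampled token's probability fell below the threshold.
    if last_token_prob < cfg.min_token_prob:
        return "low_prob"
    return None
```

In a sketch like this, raising the time and token budgets leaves only the indentation and low-probability checks able to end generation early, which is why a conservative probability threshold would cut completions short regardless of the other settings.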
|
I hadn't caught T_max_predict_ms, so I upped that to 5000 ms for fun. It doesn't seem to make a difference, so I'm guessing you are right.