by Aerroon
1042 days ago
I don't know the real answer to your question, but on local models there is a parameter you set that controls how many tokens to generate. The model doesn't always follow it: it can end early, but sometimes it just keeps going. Usually, though, I can set it to generate 700 tokens and it will generate about 700 words. I wonder if the online chat models have a similar value somewhere.

If you want the AI to remember something, you will unfortunately have to keep reminding it in the prompt, either explicitly or by referring to the previously generated text if it still fits into the context. However, in local models the context can be limited (e.g. 2000 tokens). If the conversation goes above those 2000 tokens, the model will discard the earliest material. There are models with larger context sizes, but lengthy prompts cause the same issue there too.

The way things like SillyTavern role-playing work is that the model is constantly reminded, in the prompt, of some important attributes of the character it is role-playing (but this is done for you).
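
To make the "keep reminding it" idea concrete, here is a rough sketch of how a front end might assemble each prompt: the pinned reminder always goes in, and the oldest turns are dropped once the budget runs out. Everything here is hypothetical (`count_tokens`, `build_prompt`, and the limits are made-up names and numbers, not SillyTavern's actual code), and counting whitespace-separated words is only a crude stand-in for a real tokenizer.

```python
def count_tokens(text):
    # Assumption: whitespace-separated words as a crude stand-in
    # for the model's real tokenizer.
    return len(text.split())


def build_prompt(pinned, history, new_message,
                 context_limit=2000, reply_budget=700):
    """Assemble a prompt that always contains the pinned reminder.

    pinned        -- text the model must never forget (e.g. character traits)
    history       -- earlier turns, oldest first
    new_message   -- the latest user message
    context_limit -- hypothetical context size in tokens
    reply_budget  -- tokens reserved for the model's reply
    """
    # What is left for history after the fixed parts are accounted for.
    budget = (context_limit - reply_budget
              - count_tokens(pinned) - count_tokens(new_message))

    kept = []
    # Walk from newest to oldest so the oldest turns are dropped first.
    for turn in reversed(history):
        cost = count_tokens(turn)
        if cost > budget:
            break
        kept.append(turn)
        budget -= cost

    kept.reverse()  # restore chronological order
    return "\n".join([pinned, *kept, new_message])


if __name__ == "__main__":
    prompt = build_prompt(
        pinned="You are Mira",
        history=["a b c d e f", "g h i j", "k l m"],
        new_message="Hello there",
        context_limit=20,
        reply_budget=5,
    )
    print(prompt)
```

With the tiny limits in the example, the oldest turn no longer fits and is dropped, while the pinned line and the two most recent turns survive. That is the same trade-off described above: anything not pinned eventually falls out of the context.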
It'd be cool if LLM APIs also allowed for structured state, like lists.