|
|
|
|
|
by dingnuts
267 days ago
|
|
that's because a next token predictor can't "forget" context. That's just not how it works. You load the thing up with relevant context and pray that it guides the generation path to the part of the model that represents the information you want and pray that the path of tokens through the model outputs what you want That's why they have a tendency to go ahead and do things you tell them not to do.. also IDK about you but I hate how much praying has become part of the state of the art here. I didn't get into this career to be a fucking tech priest for the machine god. I will never like these models until they are predictable, which means I will never like them. |
|