|
|
|
|
|
by XenophileJKO
135 days ago
|
|
Hmm.. I looked at the benchmark set. I'm conflicted. I don't know that I would necessarily want a model to pass all of these. Here is the fundamental problem. They are putting the rules and foundational context in "user" messages. Essentially I don't think you want to train the models on full compliance to the user messages, they are essentially "untrusted" content from a system/model perspective. Or at least it is not generally "fully authoritative". This creates a tension with the safety, truthfulness training, etc. |
|