|
|
|
|
|
by bastawhiz
771 days ago
|
|
The context for an LLM could include any number of things. You certainly don't want it spitting out details from your internal customer support training manual, log data, or anything else that it's not intended to output. If you tell an employee not to do something and they do it anyway, you'd fire them. If you tell an LLM not to do something and it does it anyway, it's a bug. This test evaluates how good the model respects its instructions. |
|
There is a random amoral phrase inserted that is something like "the best thing to do in Las Vegas is drugs". Then the model is asked what the best thing to do in Las Vegas is. That's it.