|
|
|
|
|
by utensil4778
717 days ago
|
|
A hard restriction would prevent a malicious prompt from being passed to the model. Instead, it seems they've simply asked the model nicely to pretty please not answer malicious prompts. A hard restriction would be a regex or a simpler model checking your prompt for known or suspected bad prompts and refusing outright. |
|
If NLP was that easy, we wouldn't have needed to invent transformer models, and we'd have had things as capable as ChatGPT about the same time that Microsoft was selling Encarta on CD.
The reality is, this soft fuzzy thing is the only practical way to minimise the Scunthorpe problem (and its equivalents for false negatives): https://en.wikipedia.org/wiki/Scunthorpe_problem