by mrtksn
946 days ago
Maybe every response can be reviewed by a much simpler and specialised baby-sitter LLM? Some kind of LLM that is very good at detecting sensitive information and nothing else. When it suspects something fishy, it will just go back to the smart LLM and ask for a review. LLMs seem to be surprisingly good at picking up mistakes when you ask them to elaborate.
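A rough sketch of what that loop might look like, just to make the idea concrete. Everything here is an assumption for illustration: `guard_flags`, `answer_with_review` and `max_reviews` are made-up names, `smart_llm` is any callable that maps a prompt to text, and the regex is only a stand-in for the specialised baby-sitter model.

```python
import re
from typing import Callable

# Hypothetical stand-in for the baby-sitter model: a regex that flags text
# shaped like an API key ("sk-...") or a US SSN. A real guard would be a
# small specialised classifier or LLM, not a pattern match.
SENSITIVE = re.compile(r"\b(?:sk-[A-Za-z0-9]{8,}|\d{3}-\d{2}-\d{4})\b")


def guard_flags(text: str) -> bool:
    """Return True if the guard thinks the text contains sensitive data."""
    return SENSITIVE.search(text) is not None


def answer_with_review(
    prompt: str,
    smart_llm: Callable[[str], str],
    max_reviews: int = 2,
) -> str:
    """Ask the smart model, then let the guard send flagged answers back."""
    response = smart_llm(prompt)
    for _ in range(max_reviews):
        if not guard_flags(response):
            return response
        # The guard is suspicious: ask the smart model to review its answer.
        response = smart_llm(
            "Review your previous answer and remove any sensitive "
            "information:\n" + response
        )
    # Still flagged after the allowed reviews: refuse instead of leaking.
    return response if not guard_flags(response) else "[withheld by guard]"
```

In this sketch the guard only ever sees the response, not the prompt, and the number of review rounds is capped so a stubborn model can't loop forever.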
|
This doesn't really work in practice because you can just craft a prompt that fools both.