by dkarl
89 days ago
These interactions really don't get the testing they need. When they aren't designed, how do you know how to test? Over the weekend, I was directed to file a police report with a chatbot and could not complete it because it was asking for information that did not exist and did not apply to my case. (I'm sure somebody is going to say that this can be solved by having LLMs role-play as victims and having an LLM observe and decide what's a failing test case and what isn't.)