Hacker News
by mumblemumble 606 days ago
Perhaps only if you can also be very certain that the output is correct whenever the logprobs don't trigger the filter.

If that's not the case, then it might just trigger bad risk-compensation behavior in the model's human operators.