by blah2244
490 days ago
I've heard of a startup that claims a near-0% false positive rate: https://www.pangram.com/our-model/how-it-works They appear to basically RLHF a model on a bunch of paired human and AI outputs for the same prompt. Not sure how well it works, but I'm guessing Mozilla is doing something similar here.
https://en.m.wikipedia.org/wiki/Precision_and_recall
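
To spell out why the link matters: a near-0% false positive rate by itself doesn't tell you how many AI texts slip through, or how trustworthy a positive flag is. A rough sketch with made-up counts (these are hypothetical numbers for illustration, not Pangram's or Mozilla's results):

```python
def rates(tp, fp, tn, fn):
    """Return (false_positive_rate, precision, recall) from confusion-matrix counts."""
    fpr = fp / (fp + tn) if (fp + tn) else 0.0          # human texts wrongly flagged as AI
    precision = tp / (tp + fp) if (tp + fp) else 0.0    # flagged texts that really are AI
    recall = tp / (tp + fn) if (tp + fn) else 0.0       # AI texts that actually get caught
    return fpr, precision, recall


# Hypothetical evaluation set: 10,000 human texts and 1,000 AI texts.
fpr, precision, recall = rates(tp=400, fp=10, tn=9990, fn=600)
print(f"FPR={fpr:.3%} precision={precision:.1%} recall={recall:.1%}")
```

With these invented counts the detector flags only 10 of 10,000 human texts yet still misses 600 of the 1,000 AI texts, so a low false positive rate needs to be read next to recall.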