by xilinx_guy
1233 days ago
We obviously need a new test. The new benchmark for large language models should be "Truth", with a numeric score defined as -log(Percentage_of_Lies_Told). This way, a perfectly truthful model will have a numeric score of +infinity.
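For anyone who wants to play with the numbers, here is a rough sketch of that score in Python. The function name `truth_score`, the base-10 logarithm, and the idea of counting lies over a fixed set of statements are all my assumptions; the formula itself only says -log of the percentage of lies.

```python
import math


def truth_score(lies_told: int, total_statements: int) -> float:
    """Return -log10 of the percentage of statements that were lies.

    Assumptions (not specified by the formula): base-10 log, and the
    percentage is computed as 100 * lies_told / total_statements.
    """
    if total_statements <= 0:
        raise ValueError("total_statements must be positive")
    if not 0 <= lies_told <= total_statements:
        raise ValueError("lies_told must be between 0 and total_statements")

    percentage_of_lies = 100.0 * lies_told / total_statements
    if percentage_of_lies == 0:
        # log(0) is undefined; the limit of -log(x) as x -> 0+ is +infinity.
        return math.inf
    return -math.log10(percentage_of_lies)


if __name__ == "__main__":
    for lies in (0, 1, 10, 100):
        print(lies, "lies out of 100 ->", truth_score(lies, 100))
```

One thing the arithmetic makes visible: because the input is a percentage rather than a fraction, a model that lies exactly 1% of the time scores 0, and anything worse than that goes negative.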