| HN Mirror


	by xilinx_guy 1233 days ago
	We obviously need a new test. The new benchmark for large language models should be "Truth", with a numeric score defined as -log(Percentage_of_Lies_Told). That way, a perfectly truthful model, one that tells 0% lies, gets a score of +infinity.
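
	For anyone who wants to play with the idea, here is a rough Python sketch of that score. The comment does not say which log base to use or how lies would be counted, so the base-10 log and the names `truth_score`, `lies_told`, and `statements` are assumptions made purely for illustration.

```python
import math


def truth_score(lies_told: int, statements: int) -> float:
    """Hypothetical "Truth" score: -log10(percentage of lies told).

    Assumptions (not from the original comment): base-10 logarithm,
    and the percentage is computed as 100 * lies_told / statements.
    """
    if statements <= 0:
        raise ValueError("statements must be positive")
    if not 0 <= lies_told <= statements:
        raise ValueError("lies_told must be between 0 and statements")

    percentage_of_lies = 100.0 * lies_told / statements
    if percentage_of_lies == 0.0:
        # -log(0) diverges, so a perfectly truthful model scores +infinity.
        return math.inf
    return -math.log10(percentage_of_lies)
```

	Under these assumptions the score is only positive below 1% lies: a model that lies 10% of the time scores -1, and one that lies every time scores -2. Using the fraction of lies instead of the percentage would shift every finite score up by 2.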