| HN Mirror



	by blamestross 282 days ago
	This seems less accurate than `return 1.0`. Using unboundedly unreliable systems to evaluate reliability is just a bad premise.


lock1 282 days ago

Can't wait for (((LLM) Hallucination Risk Calculator) Risk Calculator) Risk Calculator to propagate & magnify the error even further! /j


cowboylowrez 282 days ago

Have multiple LLMs and a voting quorum, sort of like how we elect politicians. It'll work just as well, I guarantee it!


wongarsu 282 days ago

Back in the GPT-2 days I did use that technique, along with running the model multiple times with slightly different prompts and choosing the most common response. It doesn't cure all problems, but it does lead to better results. It isn't very good for your wallet, though.
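
For anyone curious what "choosing the most common response" could look like in practice, here is a rough sketch. Everything in it is an assumption for illustration: `majority_vote`, `vote_over_prompts`, and `fake_model` are made-up names, and `fake_model` is a stand-in for whatever model call you would actually make. It is not the setup described above, only one plausible way to do it.

```python
from collections import Counter
from typing import Callable, Sequence, Tuple


def majority_vote(answers: Sequence[str]) -> Tuple[str, float]:
    """Return the most common answer and the share of votes it received."""
    if not answers:
        raise ValueError("need at least one answer")
    # Light normalization so case/whitespace differences don't split the vote.
    counts = Counter(a.strip().lower() for a in answers)
    winner, votes = counts.most_common(1)[0]
    return winner, votes / len(answers)


def vote_over_prompts(
    ask_model: Callable[[str], str], prompts: Sequence[str]
) -> Tuple[str, float]:
    """Query the model once per prompt variant, then take the majority answer."""
    return majority_vote([ask_model(p) for p in prompts])


def fake_model(prompt: str) -> str:
    # Hypothetical stand-in for a real model call: one phrasing trips it up.
    return "Lyon" if "briefly" in prompt else "Paris"


prompts = [
    "What is the capital of France?",
    "Name the capital city of France.",
    "Answer briefly: France's capital is?",
]
answer, agreement = vote_over_prompts(fake_model, prompts)
print(answer, round(agreement, 2))
```

The wallet problem is visible right in `vote_over_prompts`: one model call per prompt variant, so cost grows linearly with the number of votes. The agreement share also only says how often the answers matched each other, not whether the winning answer is correct.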
