| HN Mirror


	by maeil 561 days ago
	> "An AI assistant was tasked with {task}. The relevant information for their task was {context}. Their answer is {answer}. The correct answer should be something like {ground truth}. Is their answer correct?"

	If you have a ground truth, what was the purpose of asking the AI assistant for an answer in the first place?

3 comments

When you're writing a test, you usually know the correct answer for that specific combination of input parameters.

Looking back at it, I must have been very tired when I wrote that!

Or maybe I was thinking about cases where the ground truth is difficult to establish.

"The cat is dead." "The cat is no longer alive." These are usually equivalent enough, but they fail a string comparison.

Or, say, if you did AI voice calling and the goal was XYZ: did the conversation get, in so many words, to XYZ?
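
To make the dead-cat point concrete, here's a rough Python sketch. It only shows the exact-match check failing and fills in the judge prompt quoted at the top of the thread; it doesn't call any model. The function names, the normalization, and the example task/context strings are my own assumptions for illustration, and I renamed `{ground truth}` to `{ground_truth}` so it's a plain keyword for `str.format`.

```python
# Judge prompt from the quoted comment; the field was "{ground truth}" there,
# renamed to "{ground_truth}" here (assumption) so it works as a keyword.
PROMPT_TEMPLATE = (
    "An AI assistant was tasked with {task}. "
    "The relevant information for their task was {context}. "
    "Their answer is {answer}. "
    "The correct answer should be something like {ground_truth}. "
    "Is their answer correct?"
)


def exact_match(answer: str, ground_truth: str) -> bool:
    """Naive test assertion: case- and whitespace-insensitive string equality."""
    return answer.strip().lower() == ground_truth.strip().lower()


def build_judge_prompt(task: str, context: str, answer: str, ground_truth: str) -> str:
    """Fill in the template; sending it to a model is left out on purpose."""
    return PROMPT_TEMPLATE.format(
        task=task, context=context, answer=answer, ground_truth=ground_truth
    )


# Hypothetical test case: the two sentences mean the same thing.
answer = "The cat is no longer alive."
ground_truth = "The cat is dead."

matches = exact_match(answer, ground_truth)  # False, although the answer is fine
prompt = build_judge_prompt(
    task="reporting the state of the cat",
    context="a note about the cat",
    answer=answer,
    ground_truth=ground_truth,
)
print(matches)
print(prompt)
```

So the ground truth is something you have for the test case, not in production, and the judge prompt is just a fuzzier assertion than `==`.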