| HN Mirror



	by neversupervised 62 days ago
	This is not how people use LLMs. If you asked one of these questions, you’d get a longer answer, often grounded in internet sources. I speculate that, given a smart human operator interpreting the results, the interpretations across vendors would converge more often than this report makes it seem.

1 comment

Even then, there can often be substantive disagreements depending on context. Hence the need for even a "mostly true" or "mostly false" bucket.