by yelmahallawy
107 days ago
I think this is actually a common problem: figuring out what to measure and how to measure it isn't black and white. What I do is score outputs against a few dimensions (this may or may not fit your use case): relevance, instruction following, clarity, hallucination rate, etc. But even then, some of them, like 'clarity', are hard to measure.
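To make that concrete, here is a rough sketch of how per-dimension scoring could be tallied. Everything in it is an assumption for illustration, not a fixed method: the 1-5 scale, the `summarize` and `spread` helper names, the choice to score hallucination so that higher means fewer hallucinations, and the made-up ratings from two hypothetical reviewers.

```python
from statistics import mean

# Dimensions from the comment above; names and the 1-5 scale are assumptions.
DIMENSIONS = ("relevance", "instruction_following", "clarity", "hallucination")


def summarize(ratings):
    """Average each dimension across raters (each rating is a dict of 1-5 scores)."""
    return {dim: mean(r[dim] for r in ratings) for dim in DIMENSIONS}


def spread(ratings):
    """Max minus min per dimension; a large gap suggests raters read it differently."""
    return {
        dim: max(r[dim] for r in ratings) - min(r[dim] for r in ratings)
        for dim in DIMENSIONS
    }


# Illustrative, made-up ratings for one model output from two reviewers.
ratings = [
    {"relevance": 5, "instruction_following": 4, "clarity": 2, "hallucination": 5},
    {"relevance": 5, "instruction_following": 4, "clarity": 5, "hallucination": 5},
]

print(summarize(ratings))
print(spread(ratings))
```

In this toy data the reviewers agree on everything except clarity, which is the kind of disagreement that makes a fuzzy dimension hard to pin down. Tracking the spread alongside the average is one way to notice when a dimension needs a tighter definition.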