Ask HN: How do you do Evals on your Conversational ChatBots

	2 points by deepakthakur 691 days ago
	There are a lot of eval frameworks that work on the premise of <Question, Answer, Context> and give a score to the LLM response. They work pretty well in cases where we expect a single response to a query and a context (ground truth) is provided via RAG. How can I use this paradigm for chatbot evaluation? The problem is that conversational bots also have a chat history in addition to the last question in the chat, which doesn't seem to fit well into the <Question, Answer, Context> format. Or are there other ways to do evals? What has your experience been with testing/evaluating?
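
	To make the mismatch concrete, here is a minimal sketch of one possible way to squeeze a conversation into that format: emit one <Question, Answer, Context> sample per bot reply and fold the preceding chat history into the question field. The names `EvalSample` and `conversation_to_samples` are hypothetical and not taken from any particular eval framework, and whether a given scorer copes well with a history-laden question is an open assumption.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class EvalSample:
    # Hypothetical record mirroring the <Question, Answer, Context> triple.
    question: str        # chat history up to and including the latest user message
    answer: str          # the bot reply being scored
    context: List[str]   # passages retrieved for this particular turn


def conversation_to_samples(
    turns: List[Tuple[str, str]],
    contexts: List[List[str]],
) -> List[EvalSample]:
    """Turn one conversation into one sample per assistant reply.

    turns:    (role, text) pairs in order, with role "user" or "assistant".
    contexts: retrieved passages, one list per assistant reply, in order.
    """
    samples: List[EvalSample] = []
    history: List[Tuple[str, str]] = []
    reply_index = 0
    for role, text in turns:
        if role == "assistant":
            # Assumption: the history is flattened into plain "role: text" lines.
            question = "\n".join(f"{r}: {t}" for r, t in history)
            samples.append(EvalSample(question, text, contexts[reply_index]))
            reply_index += 1
        history.append((role, text))
    return samples


if __name__ == "__main__":
    chat = [
        ("user", "What is RAG?"),
        ("assistant", "Retrieval-augmented generation."),
        ("user", "Does it need a vector DB?"),
        ("assistant", "Not necessarily."),
    ]
    retrieved = [["doc about RAG"], ["doc about retrievers"]]
    for sample in conversation_to_samples(chat, retrieved):
        print(sample)
```

	Each resulting sample could then be passed to a single-turn scorer unchanged. This only scores replies one at a time; it does not capture conversation-level properties such as consistency across turns.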