Ask HN: How do you do evals on your conversational chatbots?
2 points by deepakthakur 691 days ago
There are a lot of eval frameworks that work on the premise of <Question, Answer, Context> and assign a score to the LLM response. They work pretty well when we expect a single response to a query and the context (ground truth) is supplied via RAG. How can I use this paradigm for chatbot evaluation? Conversational bots also have a chat history in addition to the last question in the chat, and that doesn't seem to fit well into the <Question, Answer, Context> format.
Or are there other ways to do evals? What has your experience been with testing and evaluating chatbots?
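To make the mismatch concrete, here is a minimal sketch of one way the chat history could be folded into the <Question, Answer, Context> triple: earlier turns are flattened into a transcript and prepended to the last user message, while the retrieved passages stay in the context slot. The names `Turn`, `EvalSample`, and `to_eval_sample` are hypothetical and not taken from any eval framework; whether a given scorer handles such a long "question" well is an open assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Turn:
    role: str  # "user" or "assistant"
    text: str


@dataclass
class EvalSample:
    # Hypothetical container mirroring the <Question, Answer, Context> triple.
    question: str
    answer: str
    context: List[str]


def to_eval_sample(history: List[Turn], answer: str, retrieved: List[str]) -> EvalSample:
    """Fold a multi-turn chat into a single-turn eval sample.

    `history` must end with the user turn that `answer` responds to.
    """
    if not history or history[-1].role != "user":
        raise ValueError("history must end with a user turn")

    *earlier, last = history
    if not earlier:
        # Single-turn chat: the triple format already fits as-is.
        question = last.text
    else:
        # Flatten earlier turns so the scorer sees what the bot saw.
        transcript = "\n".join(f"{t.role}: {t.text}" for t in earlier)
        question = (
            f"Conversation so far:\n{transcript}\n\n"
            f"Latest user message: {last.text}"
        )
    return EvalSample(question=question, answer=answer, context=list(retrieved))
```

A chat with N bot replies can be turned into N samples by calling `to_eval_sample` once per reply with the history truncated at the preceding user turn. An alternative under the same assumptions would be to put the transcript into `context` instead of `question`, but that mixes conversation state with ground truth.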