by agautsc
968 days ago
If you build a dataset of questions with responses to test your RAG app with this metrics package, how do you know whether the distribution of those questions matches in any way the distribution of questions you'll get from the app in production? Using a hand-made dataset of questions and responses could introduce a lot of bias into your RAG app.
|
tvalmetrics introduces 6 RAG metrics: answer similarity, retrieval precision, augmentation precision, augmentation accuracy, answer consistency, and retrieval k-recall. Of these 6 metrics, only answer similarity requires reference answers, so you can use the other metrics to measure the performance of your RAG system when you have a test dataset of questions without reference "correct" answers.
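
To make the "no reference answers needed" point concrete, here is a rough sketch of one reference-free metric. It assumes retrieval precision means the fraction of retrieved context chunks judged relevant to the question. It does not use the tvalmetrics API: `retrieval_precision` and `keyword_judge` are hypothetical names, and the keyword-overlap judge is only a toy stand-in for whatever relevance judge (typically an LLM) you would actually use.

```python
from typing import Callable, Sequence


def retrieval_precision(
    question: str,
    retrieved_contexts: Sequence[str],
    is_relevant: Callable[[str, str], bool],
) -> float:
    """Fraction of retrieved chunks judged relevant to the question.

    Only the question and the retrieved chunks are needed; no reference
    "correct" answer is involved. (Assumed definition, for illustration.)
    """
    if not retrieved_contexts:
        return 0.0
    relevant = sum(1 for chunk in retrieved_contexts if is_relevant(question, chunk))
    return relevant / len(retrieved_contexts)


def _keywords(text: str) -> set[str]:
    # Lowercase, strip trailing punctuation, and drop very short words.
    words = (w.strip("?.,!").lower() for w in text.split())
    return {w for w in words if len(w) > 3}


def keyword_judge(question: str, context: str) -> bool:
    # Hypothetical toy judge: relevant if the question and chunk share a keyword.
    return bool(_keywords(question) & _keywords(context))


score = retrieval_precision(
    "What is the capital of France?",
    ["Paris is the capital of France.", "Bananas are yellow."],
    keyword_judge,
)
print(score)
```

Because the judge only looks at the question and the retrieved chunks, you could run the same function over questions logged from production, which is one way to sidestep a hand-made question set.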