by shahules
1045 days ago
Evals is not suitable for evaluating LLM applications such as RAG, because you have to evaluate on your own data, where no golden test data exists, and the techniques used correlate poorly with human judgement.
We have built the RAGAS framework for this: https://github.com/explodinggradients/ragas