by pantsforbirds
485 days ago
Does DeepEval allow you to set up custom metrics without an LLM-as-a-judge base? If I want my result to be a JSON output, and I want to weight the keys based on some specific importance weighting, can I write a Python function/class to calculate and average those weighted scores as a metric for DeepEval? I do have some annoyances with DSPy, but I think their approach to defining evals is decent.
If you're using DSPy, you can also include it directly in this custom metric from the link above. I can't say for certain whether doing this within DeepEval has advantages, but 8 times out of 10, running evals in our ecosystem brings you more benefits than drawbacks. Let me know if you have trouble setting up!
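For the weighted-JSON case in the question, the scoring part needs no LLM at all. Here is a minimal sketch in plain Python, assuming the model output is a JSON object and that a key counts as correct only on an exact value match. The function name `weighted_key_score` and the example keys and weights are hypothetical, not part of DeepEval. As far as I know, DeepEval's custom metrics are written by subclassing its `BaseMetric` class, so a function like this could be called from that class's scoring method, but check the linked docs for the exact interface.

```python
import json


def weighted_key_score(actual_output: str, expected: dict, weights: dict) -> float:
    """Return the weighted fraction of keys whose values match `expected`.

    Hypothetical helper: exact-match comparison per key, normalized by the
    sum of all weights so the result falls between 0.0 and 1.0.
    """
    try:
        actual = json.loads(actual_output)
    except json.JSONDecodeError:
        return 0.0  # output that is not valid JSON scores zero
    if not isinstance(actual, dict):
        return 0.0  # valid JSON, but not an object with keys

    total = sum(weights.values())
    if total == 0:
        return 0.0  # avoid dividing by zero when no weights are given

    matched = sum(
        weight
        for key, weight in weights.items()
        if key in actual and key in expected and actual[key] == expected[key]
    )
    return matched / total


# Example: "name" matters three times as much as "city".
weights = {"name": 3, "city": 1}
expected = {"name": "Ada", "city": "London"}
score = weighted_key_score('{"name": "Ada", "city": "Paris"}', expected, weights)
print(score)  # 0.75
```

Exact matching is the simplest choice here. If some keys need fuzzy comparison (numeric tolerance, string similarity), a per-key comparison function could replace the equality check.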