Hacker News
Show HN: Open Evaluation (openevaluation.ai)
3 points by cjcenizal 400 days ago
Hey HN, this will likely interest you if you're a) into dense data visualization or b) trying to figure out how to measure the quality of a retrieval-augmented generation (RAG) system.

There's an OSS tool called Open RAG Eval that analyzes RAG-based query-and-answer sets to generate a dense set of metrics in an "evaluation report". This report is in CSV format, and there's so much data in it that it's basically impossible for a human to read.

I built Open Evaluation to enable folks to load in a report and visualize the evaluation metrics in a more human-readable way. The challenge was the sheer amount of information to visualize. I went with a collapsible table with sticky headers to present the info, so you can compare metrics across reports and questions. I also tried to make everything clickable, so if you want to understand the meaning behind a metric you can just click it to open an info panel and learn more about it.

The site has built-in sample evaluation reports, so you can try it out without needing to generate your own. If you give it a shot, please share your feedback. I'd love to find ways to make this more usable.

Full disclosure: I did this for work and my coworkers also made Open RAG Eval.