Hacker News
Track and visualize LLM model performance over time (github.com)
1 point by anjneymidha 323 days ago
1 comment

This is a really neat project: "an automated, daily evaluation suite to track model performance over time, monitor for regression during peak load periods, and detect quality changes across flagship LLM APIs."