Hacker News
by curo, 1065 days ago
OpenAI open-sourced their evals framework. You can use it to evaluate not only different models but also your entire prompt chain setup.
https://github.com/openai/evals
It also comes with a built-in registry of evals.
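
To make the "model vs. whole prompt chain" point concrete, here is a rough sketch of the idea in plain Python. This is not the openai/evals API; `run_eval`, `fake_model`, `fake_chain`, and `SAMPLES` are all hypothetical names made up for illustration. The assumption is that anything you can wrap as "string in, string out" can be scored against the same samples.

```python
from typing import Callable, Iterable, Tuple


def run_eval(system: Callable[[str], str],
             samples: Iterable[Tuple[str, str]]) -> float:
    """Return exact-match accuracy of `system` over (input, ideal) pairs."""
    samples = list(samples)
    if not samples:
        return 0.0
    correct = sum(1 for prompt, ideal in samples if system(prompt) == ideal)
    return correct / len(samples)


# Hypothetical stand-in for a bare model call (a lookup table, no network).
def fake_model(prompt: str) -> str:
    answers = {"2+2=": "4", "capital of France?": "Paris"}
    return answers.get(prompt, "unknown")


# Hypothetical prompt chain: clean up the input, then call the model.
def fake_chain(prompt: str) -> str:
    return fake_model(prompt.strip())


# Made-up samples; the second one has stray whitespace on purpose.
SAMPLES = [
    ("2+2=", "4"),
    (" capital of France? ", "Paris"),
    ("3*3=", "9"),
]

if __name__ == "__main__":
    print("model only:", run_eval(fake_model, SAMPLES))
    print("full chain:", run_eval(fake_chain, SAMPLES))
```

Because the scorer only sees a callable, the same samples can be pointed at a bare model or at the full chain, which makes it easy to see whether a preprocessing step actually helps.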