Hacker News
by
fatso784
1014 days ago
ChainForge lets you do this, and also set up ad hoc evaluations with code, LLM scorers, etc. It also shows model responses side by side for the same prompt:
https://github.com/ianarawjo/ChainForge