by ramraj07 201 days ago
It's a 2-day project at most to build your own bespoke LLM-as-judge e2e eval framework. That's what we did. Works fine. Not great. You still need someone to write the evals, though.
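For anyone wondering what the core of such a thing might look like, here is a minimal sketch, not our actual code. All names (`EvalCase`, `run_evals`, `pass_rate`) are hypothetical, and `system` and `judge` are assumed to be plain callables you supply yourself, e.g. thin wrappers around whatever model API you use. The judge is assumed to reply with text starting with PASS or FAIL.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EvalCase:
    prompt: str    # input sent to the system under test
    criteria: str  # human-written description of a good answer


def run_evals(
    cases: List[EvalCase],
    system: Callable[[str], str],  # hypothetical: your app, end to end
    judge: Callable[[str], str],   # hypothetical: wrapper around a judge LLM
) -> List[bool]:
    """Run each case through the system, then ask the judge for a verdict."""
    results = []
    for case in cases:
        answer = system(case.prompt)
        verdict = judge(
            f"Criteria: {case.criteria}\n"
            f"Answer: {answer}\n"
            "Reply PASS or FAIL."
        )
        # Assumption: the judge's reply starts with PASS or FAIL.
        results.append(verdict.strip().upper().startswith("PASS"))
    return results


def pass_rate(results: List[bool]) -> float:
    """Fraction of cases the judge passed; 0.0 for an empty run."""
    return sum(results) / len(results) if results else 0.0
```

The loop is the easy part. The `criteria` strings are the evals someone still has to write.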