Hacker News
by shabie 640 days ago
That's actually a pretty interesting point. Not just the evals but other components, like the system prompt, should also be tailored to match the expected outcome.