Hacker News
by ukulerok 303 days ago
I just wrote a complex prompt and it did a good job. How do you do evals or testing of your project?
1 comment

Thanks for trying it out! We rely on a mix of internal benchmarks and academic benchmarks like WebVoyager.