Hacker News
by ukulerok, 303 days ago
I just wrote a complex prompt and it did a good job. How do you do evals or testing of your project?
1 comment
antves, 303 days ago
Thanks for trying it out! We rely on a mix of internal benchmarks and academic benchmarks like WebVoyager.