by oshrimpton 6 days ago
Yeah, the benchmark for sure isn't perfect, and without super rigid prompting it's far too easy for it to get off course. A 28% hallucination rate isn't nothing either.