by onurcel 1438 days ago
In this work we tried to rely not only on automated evaluation scores but also on human evaluation, for exactly this reason: we wanted a better understanding of how our model actually performs and how well the human judgments correlate with the automated scores.