Hacker News
by SoMisanthrope 3231 days ago
Good point, DamVigilante. We trained the model on hundreds of human-scored essays, each double or triple scored so we could validate inter-rater reliability (IRR). I think that's why the model is performing so well. But there is always room for improvement! :-)