by damvigilante 3235 days ago
I don't know about the specific project SoMisanthrope is talking about, but these types of tools are often used in conjunction with human graders. For example, instead of having 2 human graders, you automate the grading and have 1 human grader, and only if the two grades differ by more than some amount do you bring in a second human grader.
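A minimal sketch of that routing rule, in case it helps. The function name, the `max_gap` threshold of 1.0, and the sample scores are all made up for illustration; a real program would pick its own scale and threshold.

```python
def needs_second_grader(machine_score: float, human_score: float,
                        max_gap: float = 1.0) -> bool:
    """Return True when the automated and human grades disagree by more
    than max_gap, i.e. when a second human grader should be brought in.

    max_gap is a hypothetical threshold, not a value from any real system.
    """
    return abs(machine_score - human_score) > max_gap


# Hypothetical (essay id, machine score, human score) triples.
scores = [
    ("essay-1", 4.0, 4.0),
    ("essay-2", 2.0, 4.5),
    ("essay-3", 3.5, 3.0),
]

# Only essays with a large disagreement go to the second grader.
flagged = [essay_id for essay_id, machine, human in scores
           if needs_second_grader(machine, human)]
print(flagged)  # ['essay-2']
```

Only essay-2 gets escalated here, so the second grader sees one essay instead of all three.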
1 comment

Good point, DamVigilante. We trained the model using hundreds of human-scored essays. They were all double- or triple-scored so we could validate inter-rater reliability (IRR). I think that's why the model is performing so well. But there is always room for improvement! :-)
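For anyone curious what an IRR check on double-scored essays can look like, here is a rough sketch using Cohen's kappa for two raters. This is an assumption for illustration only: the comment above doesn't say which IRR statistic was used, and the scores below are invented.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same items.

    kappa = (observed agreement - chance agreement) / (1 - chance agreement)
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty score lists of equal length")
    n = len(rater_a)
    # Fraction of essays where both raters gave the same score.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's score distribution.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical score throughout
    return (observed - expected) / (1 - expected)


# Invented scores from two human graders on the same eight essays.
grader_1 = [1, 2, 3, 3, 2, 1, 4, 4]
grader_2 = [1, 2, 3, 2, 2, 1, 4, 3]
print(round(cohens_kappa(grader_1, grader_2), 3))  # 0.667
```

Plain kappa treats every disagreement the same, so a 3 vs. 4 counts as much as a 1 vs. 4. For ordinal essay scores, a weighted variant may be a better fit.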