by marko-k 570 days ago

Not OP, but I work on AI in higher ed at a major university.

I get the concerns about AI grading. The solution isn't to have AI grade entire assignments at once. Instead, break down the assessment into smaller, discrete tasks and develop a grading rubric around those. The idea is to limit how the AI can respond - usually to simple binary choices like completed/not completed, true/false, etc. (Also, the models have been RLHF’d to generally put a positive spin on things, so if anything they’re likely to be overly generous in assessment.)

From there, provide the AI with the answer key, student response, rubric, and any other necessary context, then use the Structured Outputs API to force a consistent response for each discrete task. I've had the most success using boolean values or simple enums (like "Correct", "Partially Correct", "Incorrect"). You can include a field for reasoning, then chain a second AI call that assesses the same task again as verification.
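
To make that concrete, here is a rough Python sketch of one discrete task being graded twice. It is not my production code, and every name in it (`GRADE_SCHEMA`, `parse_grade`, `build_prompt`, `grade_task`, `call_model`) is made up for illustration. `call_model` stands in for whatever client function sends a prompt plus a JSON schema to a structured-output endpoint and returns the raw JSON string; the real API call is deliberately left out.

```python
import json
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Verdict(str, Enum):
    CORRECT = "Correct"
    PARTIALLY_CORRECT = "Partially Correct"
    INCORRECT = "Incorrect"


# JSON Schema for one discrete task. The assumption is that the
# structured-output endpoint is asked to enforce this shape.
GRADE_SCHEMA = {
    "type": "object",
    "properties": {
        "reasoning": {"type": "string"},
        "verdict": {"type": "string", "enum": [v.value for v in Verdict]},
    },
    "required": ["reasoning", "verdict"],
    "additionalProperties": False,
}


@dataclass
class Grade:
    reasoning: str
    verdict: Verdict


def parse_grade(raw: str) -> Grade:
    """Parse and re-check the model's JSON; raises ValueError if it is off-schema."""
    data = json.loads(raw)
    if not isinstance(data, dict) or set(data) != {"reasoning", "verdict"}:
        raise ValueError("unexpected response shape")
    # Verdict(...) raises ValueError for anything outside the enum.
    return Grade(reasoning=str(data["reasoning"]), verdict=Verdict(data["verdict"]))


def build_prompt(task: str, rubric: str, answer_key: str,
                 student_response: str, prior: Optional[Grade] = None) -> str:
    parts = [
        f"Task: {task}",
        f"Rubric: {rubric}",
        f"Answer key: {answer_key}",
        f"Student response: {student_response}",
    ]
    if prior is not None:
        # Second pass: show the first assessment and ask for an independent verdict.
        parts.append(
            f"A previous grader chose {prior.verdict.value!r} because: "
            f"{prior.reasoning} Give your own verdict."
        )
    return "\n".join(parts)


def grade_task(call_model: Callable[[str, dict], str], task: str, rubric: str,
               answer_key: str, student_response: str) -> dict:
    """Grade one task, then chain a second call as verification."""
    first = parse_grade(
        call_model(build_prompt(task, rubric, answer_key, student_response),
                   GRADE_SCHEMA))
    second = parse_grade(
        call_model(build_prompt(task, rubric, answer_key, student_response, first),
                   GRADE_SCHEMA))
    return {
        "verdict": first.verdict,
        "reasoning": first.reasoning,
        "verified": first.verdict == second.verdict,
    }
```

Anything where `verified` comes back `False` is a natural candidate for a human to look at. Re-validating the JSON locally is redundant if the endpoint really enforces the schema, but it is cheap and catches configuration mistakes.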

That's the high-level gist of it, though I'm skipping a lot of details. I have a basic demo of how this works on my site if you're interested: https://www.markokrkeljas.com/projects/real-time-task-tracki...