by abid786 851 days ago
LLMs are better at this, and it’ll probably improve the output quality marginally, but hallucinations (false positives or false negatives) are still possible even in this evaluation task.