Hacker News
by abid786, 851 days ago
LLMs are better at this, and it’ll probably improve the output quality marginally, but hallucinations (false positives or false negatives) are still possible even in this evaluation task.