Hacker News
by raunakchowdhuri 71 days ago
The big one is that LLMs get lazy on repetitive tasks. They'll skip rows or consolidate entries instead of grinding through every last one. So you need verify-and-re-extract loops rather than single-pass processing. Breaking work into sub-agent chunks with explicit correctness criteria defined upfront (e.g., "line items must sum to the stated total") lets the system self-verify autonomously. At scale (28M+ fields), this approach actually outperformed expert human labelers!
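To make the loop concrete, here's a rough Python sketch of the verify-and-re-extract idea for a single chunk. It's not our actual pipeline: `Extraction`, `sums_to_total`, `extract_with_verification`, and the `extract` callable (a stand-in for whatever sub-agent or LLM call you use) are all hypothetical names, and the retry limit of 3 is an arbitrary assumption.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Callable, Optional


@dataclass
class Extraction:
    """Hypothetical result of extracting one chunk (e.g. one invoice)."""
    line_items: list[Decimal]
    stated_total: Decimal


def sums_to_total(result: Extraction) -> bool:
    # Correctness criterion defined upfront: line items must sum to the stated total.
    return sum(result.line_items, Decimal("0")) == result.stated_total


def extract_with_verification(
    chunk: str,
    extract: Callable[[str, Optional[str]], Extraction],
    check: Callable[[Extraction], bool] = sums_to_total,
    max_attempts: int = 3,  # assumed limit, tune for your workload
) -> Extraction:
    """Call `extract` until `check` passes or attempts run out.

    `extract` is a placeholder for the model call. It receives the chunk
    plus optional feedback describing why the previous attempt failed.
    """
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        result = extract(chunk, feedback)
        if check(result):
            return result
        # Tell the next attempt what went wrong instead of silently retrying.
        feedback = "Line items do not sum to the stated total; re-extract every row."
    raise ValueError(f"verification still failing after {max_attempts} attempts")
```

Each chunk goes through `extract_with_verification` on its own, so a skipped row shows up as a failed check on that chunk rather than a silent gap in the final output. A sum check won't catch everything (two errors can cancel out), so in practice you'd probably want more than one criterion per chunk.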