by energy123 717 days ago
Speaking generally: The program doesn't always have to give correct results. The program just needs to reduce 30k documents down to 200 documents for human review.

You're comparing LLMs to a hypothetical alternative where a human reviews all 30k documents in detail. But the real alternative is often just a worse-quality sieve, where more errors blunder their way through the existing flawed processes. LLMs can improve on that.

2 comments

The epistemology problem never goes away. How should I have any confidence that it's correctly flagging things for review? I need to go through the other 29,800 documents to see if it missed anything.

You're right, I am comparing it to that alternative. There are fields and applications where this is necessary. I do not know if drilling reports are one of them. If you can tolerate a large false negative rate then great. But if you need to be catching 99.99% of problems then IMO you should at least be able to show your work. Taking black box output and throwing it over the wall sounds so sketchy in engineering contexts.
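
To put rough numbers on "show your work", here is a minimal sketch of how many of the discarded documents (about 29,800 if 30k are cut to 200) you would have to spot-check at random, finding zero problems, before you could claim a given miss rate at 95% confidence. The assumptions are mine, not from any real pipeline: the sample is uniformly random, the binomial approximation is good enough, and "miss rate" means the fraction of discarded documents that actually contain a problem. The function name `samples_needed` is hypothetical.

```python
import math

def samples_needed(max_miss_rate, confidence=0.95):
    """Smallest random sample size n such that finding zero problems in n
    discarded documents supports 'miss rate <= max_miss_rate' at the given
    confidence. Solves (1 - max_miss_rate) ** n <= 1 - confidence for n
    (binomial approximation, an assumption of this sketch)."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - max_miss_rate))

# Compare a tolerant target with a strict one.
for rate in (0.01, 0.001, 0.0001):
    print(f"miss rate <= {rate:.2%}: spot-check {samples_needed(rate)} documents")
```

Under these assumptions a 1% miss rate takes roughly 300 clean spot checks, while 0.01% takes roughly 30,000, which is the whole discard pile. So both sides of the argument hold: a loose tolerance is cheap to verify, and a 99.99%-style requirement is not.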

You can't have confidence, but my point is you often don't need confidence. All you need is an improvement on the flawed status quo.

Yeah I mean I had to move some big folders from server to server last week, maybe about 400. It was too random to script (would take longer to write the script) and I, as a human, doing it manually, still fucked up about 10%. 30k to 200 is exactly the stuff I'm talking about. The other people's existential dread is showing in this thread.