by bongodongobob 717 days ago
You have it write a program to analyze it. I think a lot of people fail to understand that you don't always need the LLM to do the thing; you can have it write a program to do the thing for you.

That's not very likely to succeed, is it? LLMs can do a lot of things, but writing software that not only parses semi-proprietary file formats but also analyzes unstructured data sounds more than a little bit far-fetched. I'd be impressed if just the first, and by far the easiest, part of that could be accomplished.
It's extremely likely to succeed because there is a documented format. I can't believe how pessimistic this site is about this stuff. Yeah, you're not going to one-shot it with a single prompt. If that's your expectation, you're confused.
Give it a go, then! No one would be happier than me if you proved me wrong.

Until then, I'd have to side with said pessimists here.

Okay, but you still need to debug the program. If your program must give correct results you still need to check the program output against every case. There's no free lunch there.
Speaking generally: The program doesn't always have to give correct results. The program just needs to reduce 30k documents down to 200 documents for human review.

You're comparing LLMs to a hypothetical alternative where a human reviews all 30k documents in detail. But the real alternative is often just a worse quality sieve where more errors blunder their way through the existing flawed processes. LLMs can improve on that.
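
For what it's worth, here is a minimal sketch of that kind of sieve. Everything in it is an assumption for illustration: the documents are plain strings, the `SUSPICIOUS` patterns are made-up stand-ins for whatever signals the generated program would actually use, and `budget` is the number of documents a human has time to review.

```python
import re

# Hypothetical patterns, invented for this example. A real sieve would use
# whatever signals matter for the documents at hand.
SUSPICIOUS = [
    re.compile(p, re.IGNORECASE)
    for p in (r"\bleak\b", r"pressure loss", r"\bstuck\b")
]

def score(doc: str) -> int:
    """Count how many of the suspicious patterns appear in the document."""
    return sum(1 for pat in SUSPICIOUS if pat.search(doc))

def sieve(docs: list[str], budget: int) -> list[int]:
    """Return the indices of the `budget` highest-scoring documents."""
    # sorted() is stable, so documents with equal scores keep their input order.
    ranked = sorted(range(len(docs)), key=lambda i: score(docs[i]), reverse=True)
    return ranked[:budget]
```

The sieve does not have to be right about every document. It only has to rank the likely problems high enough that they land inside the review budget, e.g. `sieve(docs, 200)` for the 30k case.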

The epistemology problem never goes away. How should I have any confidence that it's correctly flagging things for review? I need to go through the other 29,800 documents to see if it missed anything.

You're right, I am comparing it to that alternative. There are fields and applications where this is necessary. I do not know if drilling reports are one of them. If you can tolerate a large false negative rate then great. But if you need to be catching 99.99% of problems then IMO you should at least be able to show your work. Taking black box output and throwing it over the wall sounds so sketchy in engineering contexts.
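
One way to show your work without reading every unflagged document is to audit a random sample of them. This is a rough sketch, not a method anyone in the thread proposed: the helper names are hypothetical, and the bound assumes a uniform random sample that is small relative to the unflagged pile.

```python
import random

def audit_sample(unflagged_ids, sample_size, seed=0):
    """Pick a reproducible uniform random sample of unflagged document ids."""
    rng = random.Random(seed)
    return rng.sample(list(unflagged_ids), sample_size)

def miss_rate_upper_bound(sample_size, confidence=0.95):
    """Upper bound on the fraction of unflagged documents that are real
    problems, valid only if the audited sample contained zero misses.

    Solves (1 - p) ** sample_size = 1 - confidence for p.
    """
    return 1.0 - (1.0 - confidence) ** (1.0 / sample_size)
```

With zero misses in a sample of 300, `miss_rate_upper_bound(300)` comes out at roughly 1%. The bound shrinks only about in proportion to 1/n, so a very tight guarantee still costs a very large audit, which is the cost being argued about here.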

You can't have confidence, but my point is you often don't need confidence. All you need is an improvement on the flawed status quo.
Yeah I mean I had to move some big folders from server to server last week, maybe about 400. It was too random to script (would take longer to write the script) and I, as a human, doing it manually, still fucked up about 10%. 30k to 200 is exactly the stuff I'm talking about. The other people's existential dread is showing in this thread.
You're right. That's why to be sure I don't use software. All paper and pencil. So I can be sure. I have no idea what your point is.
I'm fine with writing software. I do so for a living. Usually when I'm responsible for a piece of software being correct, I'm the one who wrote it and not a black box. I use AI to autocomplete my code all the time and it very frequently suggests the wrong thing and attempts to insert random bugs.

So if my ass was on the line for the output of an AI-written program being correct for 30k cases of parsing unstructured or mixed data I would be extremely careful. That is my point.

Autocomplete is not in the same ballpark as intentionally prompting software.
Both processes produce bugs. And at any rate, LLMs are our best model for reading unstructured text. What program could an LLM possibly produce to read thousands of comments in natural language that would outperform, well, an LLM?