by harrall 550 days ago
Right now I can ask an (experienced) human to do something for me and they will either just get it done or tell me that they can’t do it.

Right now when I ask an LLM… I have to sit there and verify everything. It may have done some helpful reasoning for me but the whole point of me asking someone else (or something else) was to do nothing at all…

I’m not sure you can reliably fulfill the first scenario without achieving AGI. Maybe you can, but we are not at that point, so we don’t know yet.

5 comments

You do need to verify humans' work, though.

The difference, to me, is that humans seem to be good at canceling each other's mistakes when put in a proper environment.

Not with the same depth. I might ask a friend to drop off a letter and I might verify that they did it, but I don’t have to verify that they didn’t mistake a Taco Bell or a dumpster for the post office.

It’s very scary to ask a friend to drop off a letter if that last scenario has even a 1% chance of happening.

My guess is this is an artifact of the RLHF part of the training. Answers like "I don't know" or "let me think and let's catch up on this next week" are rated down by human testers, which eventually trains the LLM to avoid this path altogether. And it probably makes sense, because otherwise "I don't know" would come up way too often, even in cases where the LLM is perfectly able to give the answer.
I don't know, that seems like a fundamental limitation. LLMs don't have any ability to reflect on their own knowledge or abilities.
Humans aren't very aware of their limits, either.

Even the Dunning-Kruger effect is, ironically, widely misunderstood by people who are unreasonably confident about their knowledge.

But you do know whether you have ever heard of call-by-name or call-by-value semantics.
So you've only ever seen people get upset about technical jargon they know they don't understand, and never seen people misuse jargon wildly?

The latter in particular is how I model the mistakes LLMs make, what with them having read most things.

Yes, Dunning and Kruger's paper never found what popular science calls the "Dunning-Kruger effect".

Effectively, what they found was nothing real, just a statistical artifact.

> Right now I can ask an (experienced) human to do something for me and they will either just get it done or tell me that they can’t do it.

Finding reliable honest humans is a problem governments have struggled with for over a hundred years. If you have cracked this problem at scale you really need to write it up! There are a lot of people who would be extremely interested in a solution here.

> Finding reliable honest humans is a problem governments have struggled with for over a hundred years.

Yes, though you are downplaying the problem a lot. It's not just governments, and it's way longer than 100 years.

Btw, a solution that might work for you or me, presumably relatively obscure people, might not work for anyone famous, nor a company nor a government.

It's not clear to me whether AGI is necessary for solving most of the issues in the current generation of LLMs. It is possible you can get there by hacking together CoTs with automated theorem provers and brute-forcing your way to the solution, or something like that.

But if that's not enough, then AGI might come as a second-order effect (e.g. reasoning machines having to bootstrap an AGI, so then you can have a Waymo taxi driver who is also a Fields medalist).
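
For what it's worth, the "brute-force candidates and check them mechanically" idea can be sketched roughly like this. It's only a toy: `propose_candidates` is a hypothetical stand-in for whatever generates answers (an LLM sampling CoTs, say), and the `verify` lambda stands in for a theorem prover or other mechanical checker. Neither is a real library API. The point is just that the loop either returns something that passed the check or explicitly gives up.

```python
from itertools import product
from typing import Callable, Iterable, Optional, Tuple

Candidate = Tuple[int, int]


def propose_candidates(limit: int) -> Iterable[Candidate]:
    """Hypothetical generator of candidate answers (plain brute force here)."""
    return product(range(2, limit), repeat=2)


def solve_or_abstain(
    candidates: Iterable[Candidate],
    verify: Callable[[Candidate], bool],
) -> Optional[Candidate]:
    """Return the first candidate the checker accepts, or None ("I can't do it")."""
    for candidate in candidates:
        if verify(candidate):
            return candidate
    return None  # nothing passed verification, so abstain instead of guessing


def factor(n: int) -> Optional[Candidate]:
    """Toy task: find two non-trivial factors of n, or abstain."""
    return solve_or_abstain(
        propose_candidates(n),
        lambda pair: pair[0] * pair[1] == n,  # the mechanical check
    )


print(factor(15))  # a verified pair of factors
print(factor(13))  # None: 13 is prime, so the loop abstains
```

The catch, of course, is that this only works for tasks where a cheap, trustworthy checker exists, which is exactly what's missing for "drop off this letter".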

There are so-called "yes-men" who can't say "no" in any situation. That's rooted in their culture. I suspect that AI was trained using their assistance. I mean, answering "I can't do that" is the simplest LLM path and should come up often, unless they've gone out of their way to downrank it.