Hacker News new | ask | show | jobs
by bravura 816 days ago
I have another explanation. LLMs are essentially trained on "A B", i.e. is it plausible that B follows A.

There's simply a much larger space of possibilities for shorter completions, A B1, A B2, etc. that are plausible. Like if I ask you to give a short reply to a nuanced question, you could reply with a thoughtful answer, a plausible superficially correct sounding answer, convincing BS, etc.

Whereas if you force someone to explain their reasoning, the space of plausible completions reduces. If you start with convincing BS and work through it honestly, you will conclude that you should reverse. (This is similar to how one of the best ways to debunk toxic beliefs with honest people is simply through openly asking them to play out the consequences and walking through the impact of stuff that sounds good without much thought.)

This is similar to the reason that loading your prompt with things that reduce the space of plausible completions is effective prompt engineering.

5 comments

> This is similar to how one of the best ways to debunk toxic beliefs with honest people is simply through openly asking them to play out the consequences and walking through the impact of stuff that sounds good without much thought.

Actually, one of the best ways is pretending to be more extreme than them. Agree with them on everything, which is disarming, but then take it a step or two even further. Then they're like, "now hang on, what about X and Y" trying to convince you to be more reasonable, and pretty soon they start seeing the holes and backtrack to a more reasonable position.

https://www.pnas.org/doi/abs/10.1073/pnas.1407055111

I think you're right. I would go a step further and say that all learning is roughly synonymous with reducing the output space, and that humans do the exact same thing. There are more ways to get the wrong answer to a math problem than there are to get the right answer. When you learn someone's name, you're narrowing your output to be a single name rather than all plausible names.

The output of a generative model is practically infinite. I suspect it's possible to continually narrow the space of completions and never converge on a single output. If this turns out to be true, it would bode well for the scalability of few-shot learning.

It helps, but it still gets stuck in local optima based on what it started with. I've never seen it turn around and correct its faulty reasoning unless it tried to actually run the code and observed an Exception. If I respond with "but have you considered XYZ?", my leading question will usually cause it to correct itself, even when it wasn't incorrect.

We need some way to generate multiple independent thoughts in parallel. Each separate thought is constructed using chain of thought to improve the reliability. Then you have some way to "reduce" these multiple thoughts into a single solution. The analogy would be a human brainstorming session where we try to attack the same problem from multiple angles and we try to decorrelate each idea/approach.

We already have that, it's called beam decoding, and there are three of thought solutions as well, for each beam you can pick the one with the best logprob, but it's not a given that the result will be better because logprob only capture the model decisiveness not correctness, so it'll still fail if a model is confidently wrong.
I think this is different, because you could include tool use in the branches. E.g.

1. rewrite the following question in five different ways.

2. For each version of the question, write python code to do the work.

3. Look at all the outputs, write an answer

I was going to write pretty much this exact same comment. I am an amateur in how LLMs work, definitely, but I always thought this was the plausible explanation.

If I want the "assistant "LLM to tell me "How much 5 times 2 is", if I feed it the line "5 * 2 = " as if it's already started giving that answer, it will very likely write 5*2 = 10.

Since LLMs operate on semantic relationships between tokens, the more a bunch of tokens are "close" to a given "semantic topic", the more the LLM will keep outputting tokens in that topic. It's the reason why if you ask an LLM to "review and grade poetry", eventually it starts saying the same thing even about rather different poems -- the output is so filled with the same words, that it just keeps repeating them.

Another example:

If I ask the LLM to solve me a riddle, just by itself, the LLM may get it wrong. If, however, I start the answer, unravelling a tiny bit of the problem it will very likely give the right answer, as if it's been "guided" onto the right "problem space".

By getting LLMs to "say" how they are going to solve things and checking for errors, each words basically tugs onto the next one, honing in on the correct solution.

In other words:

If an LLM has to answer a question -- any question --, but right after we ask the question we "populate" its answer with some text, what text is more likely to make the LLM answer incorrectly?

- Gibberish nonsense

- Something logical and related to the problem?

Evidently, the more gibberish we give to it, the more likely it is to get it wrong, since we're moving away from the "island of relevant semantic meaning", so to speak. So if we just get the LLM to feed itself more relevant tokens, it automatically guides itself to a better answer. It's kind of like there's an "objective, ideal" sequence of tokens, and it can work as an attractor. The more the LLM outputs words, the more it gets attracted to that sequence...that...."island of relevant semantic meaning".

But, again, I know nothing of this. This is just how I view it, conceptually. It's probably very wrong.

That reminds me ... You know how LLMs have a hard time being corrected? If I ask it not to format responses as bullet lists, after 1-2 rounds it does it again. Why? Because the context is filled with examples where it has used bullet lists, and it acts like an attractor.

I ask it not to start phrases with "However..." and it does it again. Maybe just having the word However in the prompt acts like an attractor that compels the LLM to use it, even when I actually asked the opposite. Probably also the fault of heavy handed RLHF telling it to balance any user position with the opposite take.

This is one of many ways of LLMs are being crippled by terrible UI controls. You can't do simple things like edit the conversation history to make it forget things.
You can edit the conversation history though. You need to try alternative apps/UIs instead of the product websites like ChatGPT. Those are only for collecting more training data from users instead of being the most useful interface possible.
if you haven't already, I recommend trying the openai playground instead of chatgpt. It is the same underlying ai (i.e. gpt4), but you have much more control over the inputs.

Bonus 1: Since you pay per token, it's much cheaper than a chatgpt abo

Bonus 2: You can increase the context window dramatically (iirc 8000 being the max for playground, while 2000 is the max for chatgpt)

Facebook had a paper about "system 2" LLM attention, where they identified which parts of the input would be distracting for the LLM and just deleted them.

https://arxiv.org/abs/2311.11829

Using a 3rd party interface to the LLMs (like typingmind.com) is both better and cheaper than using chatgpt.
> This is similar to the reason that loading your prompt with things that reduce the space of plausible completions is effective prompt engineering.

And this is why taking your time to write a detailed software help request delivers a good chance that you will solve your problem all by your lonesome.

A rubber duck is all you need.
Yes, my fear of stack overflow moderators has caused me to solve many problems before I even finish writing the question.