Hacker News
What's the best explanation for LLMs being almost right when citing sources?
14 points by akashtndn 740 days ago
For example, I asked GPT-4 to identify the source for a quote from a short story by Edgar Allan Poe. It correctly identified the author but not the title. What's the best available explanation for such behavior by LLMs?
4 comments

LLMs are lossy compressors. They aren't databases. The training process had enough signal to allow for recognizing the author, but not for memorizing the title.
LLMs are auto-regressive predictors -- so they take the text given to them (i.e., the prompt) and generate a probability estimate for what the next token should be.

Suppose you give it a quote -- "Once upon a midnight dreary, while I pondered, " -- and ask it to keep writing. It will generate a probability distribution over all the tokens in its vocabulary.

I'll use words here, rather than tokens, to make the point... Hypothetically, for the quote above, the LLM might estimate the probability of the next word being...

"Weak" = 0.80

"Tired" = 0.10

"Slothful" = 0.05

... and so on.

Now, if you are using a temperature of 0.0, the LLM will always pick the highest-probability word/token. It's possible you had a non-zero temperature setting and the LLM "knew" the right answer but randomly picked the wrong one... Temperature controls how much randomness goes into the token choice: higher values flatten the distribution, so lower-probability tokens get picked more often, which makes the output more diverse/creative.
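To make the temperature point concrete, here is a minimal sketch in plain Python. It is not how any particular LLM implements sampling; the numbers are the made-up ones from above, the "other" bucket standing in for "... and so on" is my own assumption, and the function names are hypothetical.

```python
import math
import random

# Made-up next-word probabilities from the example above.
# "other" is an assumed bucket for the remaining probability mass.
probs = {"weak": 0.80, "tired": 0.10, "slothful": 0.05, "other": 0.05}

def apply_temperature(probs, temperature):
    """Rescale a distribution as p ** (1 / T), then renormalize (T > 0).

    This is equivalent to dividing the log-probabilities by T before a softmax.
    """
    scaled = {word: p ** (1.0 / temperature) for word, p in probs.items()}
    total = sum(scaled.values())
    return {word: s / total for word, s in scaled.items()}

def pick_word(probs, temperature, rng=random):
    """Greedy pick at temperature 0, otherwise sample from the rescaled distribution."""
    if temperature == 0.0:
        return max(probs, key=probs.get)
    rescaled = apply_temperature(probs, temperature)
    words = list(rescaled)
    weights = [rescaled[word] for word in words]
    return rng.choices(words, weights=weights, k=1)[0]

print(pick_word(probs, 0.0))          # always the top word
print(apply_temperature(probs, 2.0))  # flatter: the top word loses probability mass
print(pick_word(probs, 1.0))          # sampled, so it can differ between runs
```

At temperature 1.0 the distribution is unchanged, below 1.0 it gets more peaked, and above 1.0 it gets flatter, so a lower-ranked (possibly wrong) word gets sampled more often.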

Alternatively, based on the prompt (i.e., collection of text you put in), it estimated the probability of the wrong answer to be higher. In your case, the LLM likely had a high probability for "Edgar Allan Poe" but maybe a lower probability for the specific works/titles, and hence chose incorrectly.

P.S.

If you are using the OpenAI playground, you can actually get the probability estimates, if you want to investigate further!

Would you happen to know of a good resource that shows an implementation of an LLM so I can see this more concretely in practice?

The idea of an LLM makes sense: it's ultimately a graph with statistical weights for which node we use to generate text next (my understanding here is still correct, yes?).

My issue is, where does the "learning" part come in? It feels like it's all hard-coded but I know it's not. What allows for flexibility in token generation besides that randomization from temperature that you mentioned?

You're not wrong! After training, the neural network is in fact "hard coded" in that the weights do not change anymore.

During training, you update the weights. As a very, very simple example in relation to my original response above... Suppose you are training a neural network. You know that the probability of "weak" coming after "Once upon a midnight dreary, while I pondered, " is 1.00, right? So you set up your LLM with random weights, and you see that the probability estimate for "weak" is 0.77 (I'm making this # up)... Well you can now use an algorithm to nudge the neural network's weights in a way that makes the 0.77 closer to 1.00!
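Here is a rough sketch of that "nudging", using the word list from my original response. Big simplification: in a real LLM the scores come out of a neural network and the update flows back into its weights, whereas here I treat the three scores themselves as the trainable weights. The 0.77 starting point, the learning rate, and the function names are all made up for illustration.

```python
import math

vocab = ["weak", "tired", "slothful"]
# Made-up starting scores, chosen so the model assigns 0.77 to "weak".
logits = [math.log(0.77), math.log(0.13), math.log(0.10)]
target = vocab.index("weak")

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def train_step(logits, target, lr=1.0):
    """One gradient-descent step on the cross-entropy loss -log(p[target])."""
    p = softmax(logits)
    # Gradient of the loss w.r.t. each score: p_i - 1 for the target, p_i otherwise.
    grads = [p_i - (1.0 if i == target else 0.0) for i, p_i in enumerate(p)]
    return [s - lr * g for s, g in zip(logits, grads)]

before = softmax(logits)[target]
for _ in range(50):
    logits = train_step(logits, target)
after = softmax(logits)[target]

print(f"P(weak) before: {before:.2f}, after 50 steps: {after:.2f}")
```

Each step raises the score of the correct word and lowers the others, so the estimate creeps toward 1.00 without ever quite reaching it.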

In other words -- the learning happens during training.

A good place to start might be this course: https://karpathy.ai/zero-to-hero.html

Depending on your level of experience and how much free time you have, here are more resources: https://phaseai.com/resources/free-resources-ai-ml-2024

Good luck!

100% vibes-based guess: there could be more training data that includes part of the quoted material and cites it to "Poe" without IDing the specific work, versus full citations.
Prompt engineering skill issue.
The prompt isn't relevant to this question though. The quality of output can be improved with better input but in this case, I am curious about the underlying mechanics in the model that leads to such behavior.
I had a prompt saved which would give full sources per sentence of response. It was useful for one purpose then became annoying and time consuming. I was diagnosing hallucinations and training data issues.

Maybe crafting something to give a full APA or MLA citation and works cited page per response could help.

https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00563...

Perhaps... but "prompt engineering" right now seems like throwing paint at a wall until the black box evaluates to the relative "truth" you were looking for. It's like a stochastic wrench you turn until it serves your intensive porpoises.

Tokens? The way I see these things operating (in my head) is as a hyper-dimensional merge sort which loses its context/bounded-domain during evaluation, leading to something less than the sum of its parts because the weights between tokens correlate linguistic/phonetic relationships--which lose their causal relationship to the real world.

Uhm… How could it be that?
Mate, sometimes your starting point is so wrong that it is impossible to formulate an answer to the question inherent in your response.