by cjbprime
849 days ago
An analogy that works without having to explain anything at all about how LLMs actually work (or maybe does explain a lot, depending on how you look at it) could be:

* LLMs are lossy compression functions on their training data.

* The size of the model dictates how lossy the compression is.

* You can't spend compute to get more detail out of a model once it's been compressed/trained, any more than you can spend compute to get an incredibly lossily-compressed movie to go from 240p back to the original 1080p source.
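To make the compression analogy concrete, here is a toy sketch in Python. It is not how LLMs work internally; it only illustrates the three points with a simple quantizer. The names `compress` and `rms_error` and the `levels` parameter (standing in for "model size") are made up for this illustration.

```python
import math

def compress(signal, levels):
    """Lossy step: snap each sample in [0, 1] to one of `levels` evenly spaced values."""
    step = 1.0 / (levels - 1)
    return [round(x / step) * step for x in signal]

def rms_error(a, b):
    """Root-mean-square difference between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

# Stand-in for the "training data": a smooth signal with values in [0, 1].
signal = [0.5 + 0.5 * math.sin(i / 7.0) for i in range(1000)]

# "Model size" here is just the number of levels the compressor may use.
small = compress(signal, levels=4)
large = compress(signal, levels=64)

err_small = rms_error(signal, small)
err_large = rms_error(signal, large)
print(f"4 levels:  rms error {err_small:.4f}")
print(f"64 levels: rms error {err_large:.4f}")

# Spending more compute on the compressed output does not bring detail back:
# running the same compressor again returns exactly the same values.
recompressed = compress(small, levels=4)
print("recompressing changed anything:", recompressed != small)

# Different originals can land on the same compressed value, so nothing
# downstream can tell which one was the source.
print(compress([0.30], levels=4) == compress([0.36], levels=4))
```

The larger `levels` setting gives a smaller error, and no amount of post-processing on `small` can tell apart inputs that were snapped to the same level.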
Similarly, LLMs can produce better answers if you teach them thinking strategies that remind them to put the available evidence and intermediate steps in their context window. Otherwise they'll tend to hallucinate an answer out of vaguely correct words.