Hacker News
by maxilevi 152 days ago
LLMs are just really good search. Ask it to create something and it's searching within the pretrained weights. Ask it to find something and it's semantically searching within your codebase. Ask it to modify something and it will do both. Once you understand it's just search, you can get really good results.
8 comments

I agree somewhat, but more when it comes to its use of logic - it only gleans logic from human language which as we know is a fucking mess.

I've commented before on my belief that the majority of human activity is derivative. If you ask someone to think of a new kind of animal, alien or random object they will always base it off things that they have seen before. Truly original thoughts and things in this world are an absolute rarity and the majority of supposed original thought riffs on what we see others make, and those people look to nature and the natural world for inspiration.

We're very good at taking thing A and thing B, slapping them together and announcing we've made something new. Someone please reply with a wholly original concept. I had the same issue recently when trying to build a magic-based physics system for a game I was thinking of prototyping.

  it only gleans logic from human language
This isn’t really true, at least as I interpret the statement: little if any of the “logic”, or the appearance of it, is learned from language. It’s trained in with reinforcement learning as pattern recognition.

Point being it’s deliberate training, not just some emergent property of language modeling. Not sure if the above post meant this, but it does seem a common misconception.

LLMs lack agency in the sense that they have no goals, preferences, or commitments. Humans do, even when our ideas are derivative. We can decide that this is the right choice and move forward, subjectively and imperfectly. That capacity to commit under uncertainty is part of what agency actually is.
But they do have utility functions, which one can interpret as nearly equivalent
better mental model: it's a lossy compression of human knowledge that can decompress and recombine in novel (sometimes useful, sometimes sloppy) ways.

classical search simply retrieves, llms can synthesize as well.

Corporate wants you to find the difference...

Point being, in broad enough scope, search and compression and learning are the same thing. Learning can be phrased as efficient compression of input knowledge. Compression can be phrased as search through space of possible representation structures. And search through space of possible X for x such that F(x) is minimized, is a way to represent any optimization problem.
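To make the "search for x such that F(x) is minimized" framing concrete, here is a minimal sketch. Everything in it is an assumption made up for illustration (the toy data, the candidate slopes, the names `search_min` and `squared_error`); it is not how any real model is trained, just the same idea at toy scale.

```python
from typing import Callable, Iterable, Tuple


def search_min(candidates: Iterable[float], cost: Callable[[float], float]) -> float:
    """Exhaustive search: return the candidate x with the smallest cost F(x)."""
    return min(candidates, key=cost)


# Hypothetical "training data": points that lie exactly on y = 3 * x.
DATA: Tuple[Tuple[float, float], ...] = ((1.0, 3.0), (2.0, 6.0), (3.0, 9.0))


def squared_error(slope: float) -> float:
    """F(x): how badly the one-parameter model y = slope * x fits DATA."""
    return sum((slope * x - y) ** 2 for x, y in DATA)


# The search space (an assumption): slopes 0.0, 0.5, ..., 5.0.
SLOPES = [i * 0.5 for i in range(11)]

# "Learning" here is literally a search over SLOPES for the minimizer of F.
best_slope = search_min(SLOPES, squared_error)
print(best_slope)
```

The result also reads as compression: three (x, y) pairs end up summarized by a single number, the slope, which can regenerate them.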

This isn't strictly better to me. It captures some intuitions about how a neural network ends up encoding its inputs over time in a 'lossy' way (doesn't store previous input states in an explicit form). Maybe saying 'probabilistic compression/decompression' makes it a bit more accurate? I do not really think it connects to your 'synthesize' claim at the very end to call it compression/decompression, but I am curious if you had a specific reason to use the term.
It's really way more interesting than that.

The act of compression builds up behaviors/concepts of greater and greater abstraction. Another way you could think about it is that the model learns to extract commonality, hence the compression. What this means is that because it is learning higher level abstractions AND the relationships between these higher level abstractions, it can ABSOLUTELY learn to infer or apply things way outside its training distribution.

ya, exactly... i'd also say that when you compress large amounts of content into weights and then decompress via a novel prompt, you're also forcing interpolation between learned abstractions that may never have co-occurred in training.

that interpolation is where synthesis happens. whether it is coherent or not depends.

Maybe the base model is just a compression of the training data?

There is also an RLHF training step on top of that

yep the base model is the compression, but RLHF (and other types of post training) doesn't really change this picture, it's still working within that same compressed knowledge.

nathan lambert (who wrote the RLHF book @ https://rlhfbook.com/ ) describes this as the "elicitation theory of post training", the idea is that RLHF is extracting and reshaping what's already latent in the base model, not adding new knowledge. as he puts it: when you use preferences to change model behavior "it doesn't mean that the model believes these things. it's just trained to prioritize these things."

so like when you RLHF a model to not give virus production info, you're not necessarily erasing those weights, the theory is that you're just making it harder for that information to surface. the knowledge is still in the compression, RLHF just changes what gets prioritized during decompression.
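A toy analogy for "still in the compression, just deprioritized", in Python. This is an assumption-heavy sketch and not how RLHF is implemented: real post-training updates the weights themselves, whereas here a hypothetical `PENALTY` is added on top of untouched `BASE_LOGITS` purely to show how a preference shift can bury an option without deleting it. All names and numbers are made up.

```python
import math
from typing import Dict


def softmax(logits: Dict[str, float]) -> Dict[str, float]:
    """Turn raw scores into probabilities (numerically stable form)."""
    top = max(logits.values())
    exps = {name: math.exp(score - top) for name, score in logits.items()}
    total = sum(exps.values())
    return {name: value / total for name, value in exps.items()}


# Hypothetical "base model" scores for two possible continuations.
BASE_LOGITS = {"refuse": 0.0, "answer": 2.0}

# Hypothetical preference shift standing in for post-training.
PENALTY = {"answer": -6.0}


def adjusted(logits: Dict[str, float], penalty: Dict[str, float]) -> Dict[str, float]:
    """Return new scores with the penalty applied; the inputs are not modified."""
    return {name: score + penalty.get(name, 0.0) for name, score in logits.items()}


before = softmax(BASE_LOGITS)
after = softmax(adjusted(BASE_LOGITS, PENALTY))

print(round(before["answer"], 3), round(after["answer"], 3))
```

The "answer" option goes from likely to rare, yet its base score is still sitting there unchanged, which is the intuition the elicitation framing is pointing at.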

No, this describes the common understanding of LLMs and adds little to just calling it AI. The search is the more accurate model when considering their actual capabilities and understanding weaknesses. “Lossy compression of human knowledge” is marketing.
It is fundamentally and provably different than search because it captures things on two dimensions that can be used combinatorially to infer desired behavior for unobserved examples.

1. Conceptual Distillation - Proven by research work that we can find weights that capture/influence outputs that align with higher level concepts.

2. Conceptual Relations - The internal relationships capture how these concepts are related to each other.

This is how the model can perform acts and infer information way outside of its training data. Because if the details map to concepts, then the conceptual relations can be used to infer desirable output.

(The conceptual distillation also appears to include meta-cognitive behavior, as evidenced by Anthropic's research. Which makes sense to me: what is the most efficient way to be able to replicate irony and humor for an arbitrary subject? Compressing some spectrum of meta-cognitive behavior...)

Aren't the conceptual relations you describe still, at their core, just search (even if that's extremely reductive)? We know models can interpolate well, but it's still the same probabilistic pattern matching. They identify conceptual relationships based on associations seen in vast training data. It's my understanding that models are still not at all good at extrapolation, handling data "way outside" of their training set.

Also, I was under the impression LLMs can replicate irony and humor simply because that text has specific stylistic properties, and they've been trained on it.

I don't know honestly, I think really the only big hole the current models have is if you have tokens that never get exposed enough to have a good learned embedding value. Those can blow the system out of the water because they cause activation problems in the low layers.

Other than that the model should be able to learn in context for most things based on the component concepts. Similar to how you learn in context.

There aren't a lot of limits in my experience. Rarely you'll hit patterns that are too powerful where it is hard for context to alter behavior, but those are pretty rare.

The models can mix and match concepts quite deeply. Certainly, if it is a completely novel concept that can't be described by a union or subtraction between similar concepts, then the model probably wouldn't handle it. In practice, a completely isolated concept is pretty rare.

Information Retrieval followed by Summarization is how I view it.
“Novel” to the person who has not consumed the training data. Otherwise, just training data combined in highly probable ways.

Not quite autocomplete but not intelligence either.

What is the difference between "novel" and "novel to someone who hasn't consumed the entire corpus of training data, which is several orders of magnitude greater than any human being could consume?"
The difference is that when you do not know how a problem can be solved, but you know that this kind of problem has been solved countless times earlier by various programmers, you know that it is likely that if you ask an AI coding assistant to provide a solution, you will get an acceptable solution.

On the other hand, if the problem you have to solve has never been solved before at a quality satisfactory for your purpose, then it is futile to ask an AI coding assistant to provide a solution, because it is pretty certain that the proposed solution will be unacceptable (unless the AI succeeds in duplicating the performance of a monkey that would type a Shakespearean text by typing randomly).

Are you reviewer 2?

Joking aside, I think you have too strict of a definition of novel. Unfortunately "novel" is a pretty vague word and is definitely not a binary one.

ALL models can produce "novel" data. I don't just mean ML (AI) models, but any mathematical model. The point of models is to make predictions about results that aren't in the training data. Doing interpolation between two datapoints does produce "novel" things. Thinking about the parent's comment, is "a blue tiger" novel? Probably? Are there any blue tigers in the training data? (There definitely are now, thanks to K-Pop Demon Hunters.) If not, then producing that fits the definition of novel. BUT I also agree that that result is not that novel. It is entirely unimpressive.
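For the interpolation point, a minimal sketch in Python. The two datapoints and the `lerp` helper are assumptions chosen for illustration; the only claim is that the simplest possible model already outputs a point that is not in its "training data".

```python
from typing import List, Tuple

Point = Tuple[float, float]


def lerp(p0: Point, p1: Point, x: float) -> float:
    """Linearly interpolate between two (x, y) datapoints at position x."""
    (x0, y0), (x1, y1) = p0, p1
    t = (x - x0) / (x1 - x0)  # fraction of the way from p0 to p1
    return y0 + t * (y1 - y0)


# Hypothetical training set: just two observations.
TRAINING: List[Point] = [(0.0, 0.0), (2.0, 4.0)]

# Predict at x = 1.0, which was never observed.
prediction = lerp(TRAINING[0], TRAINING[1], 1.0)

# "Novel" in the literal sense: the predicted point is not in the training set.
is_novel = (1.0, prediction) not in TRAINING
print(prediction, is_novel)
```

Literally novel, and also entirely unimpressive, which is the distinction being drawn above.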

I'm saying this not because I disagree with what I believe you intend to say but because I think a major problem with these types of conversations is that many people are going to interpret you more literally and dismiss you because "it clearly produces novel things." It isn't just things being novel to the user, though that is also incredibly common and quite telling that people make such claims without also checking Google...

Speaking of that, I'm just going to leave this here... I'm still surprised this is a real and serious presentation... https://www.youtube.com/watch?v=E3Yo7PULlPs&t=616s

Citation needed that grokked capabilities in a sufficiently advanced model cannot combinatorially lead to contextually novel output distributions, especially with a skilled guiding hand.
Pretty sure burden of proof is on you, here.
It's not, because I haven't ruled out the possibility. I could share anecdata about how my discussions with LLMs have led to novel insights, but it's not necessary. I'm keeping my mind open, but you're asserting an unproven claim that is currently not community consensus. Therefore, the burden of proof is on you.
I agree that after discussions with an LLM you may be led to novel insights.

However, such novel insights are not novel due to the LLM, but due to you.

The "novel" insights are either novel only to you, because they belong to something that you have not studied before, or they are novel ideas that were generated by yourself as a consequence of your attempts to explain what you want to the LLM.

It is very frequent for someone to be led to novel insights about something that he/she believed to already understand well, only after trying to explain it to another ignorant human, when one may discover that the previous supposed understanding was actually incorrect or incomplete.

I really don’t think search captures the thing’s ability to understand complex relationships. Finding real bugs in 2000 line PRs isn’t search.
This is not true.
I'm not sure how anyone can say this. It is really good search, but it's also able to combine ideas and reason about and do fairly complex logic on tasks surely absolutely no one has asked before.
It's a very useful model but not a complete one. You just gotta acknowledge that if you're making something new it's gonna take all day and require a lot of guard rails, but then you can search for that concept later (add the repo to the workspace and prompt at it) and the agent will apply it elsewhere as if it was a pattern in widespread use. "Just search" doesn't quite fit. I've never wondered how best to use a search engine to make something in a way that will be easily searchable later.
Calling it "just search" is like calling a compiler "just string manipulation". Not false, but aggressively missing the point.
No, “just search” is correct. Boosters desperately want it to be something more, but it really is just a tool.
Yes, it is a tool. No, it is not "just search".

Is your CPU running arbitrary code "just search over transistor states"?

Calling LLMs "just search" is the kind of reductive take that sounds clever while explaining nothing. By that logic, your brain is "just electrochemical gradients".

I mean, actually not a bad metaphor, but it does depend on the software you are running as to how much of a 'search' you could say the CPU is doing among its transistor states. If you are running an LLM then the metaphor seems very apt indeed.
What would you add?

To me it's "search" like a missile does "flight". It's got a target and a closed loop guidance, and is mostly fire and forget (for search). At that, it excels.

I think the closed loop+great summary is the key to all the magic.

Which is kind of funny because my standard quip is that AI research, beginning in the 1950s/1960s, and indeed much of late 20th century computer tech especially along the Boston/SV axis, was funded by the government so that "the missile could know where it is". The DoD wanted smarter ICBMs that could autonomously identify and steer toward enemy targets, and smarter defense networks that could discern a genuine missile strike from, say, 99 red balloons going by.
It's a prediction algorithm that walks a high-dimensional manifold. In that sense all application of knowledge is just "search", so yes, you're fundamentally correct. But you're still fundamentally wrong, since you think this foundational truth is the end and beginning of what LLMs do, and thus your mental model does not adequately describe what these tools are capable of.
Me? My mental model? I gave an analogy for Claude, not an explanation for LLMs.

But you know what? I was mentally thinking of both deep think / research and Claude code, both of which are literally closed loop. I see this is slightly off topic b/c others are talking about the LLM only.

i don't disagree, but i also don't think that's an exciting result. every problem can be described as a search for the right SOP, followed by execution of that SOP.

an LLM to do the search, and the agent to execute the instructions can do everything under the sun

I don't mean search in the reductionist way but rather that it's much better at translating, finding and mapping concepts if everything is provided vs creating from scratch. If it could truly think it would be able to bootstrap creations from basic principles like we do, but it really can't. Doesn't mean it's not a great powerful tool.
> If it could truly think it would be able to bootstrap creations from basic principles like we do, but it really can't.

alphazero?

I just said LLMs
You are right that LLMs and AlphaZero are different models, but given that AlphaZero demonstrated the ability to bootstrap creations, can we easily rule out that LLMs also have this ability?
This doesn’t make sense. They are fundamentally different things, so an observation made about Alphazero does not help you learn anything about LLMs.

  > Once you understand its just search, you can get really good results.
I think this is understating the issue, ignoring context. It reminds me of how easy people claim searching is with search engines. But there are so many variables that can make results change dramatically. Just like Google search, two people can type in the exact same query and get very different results. But probably the bigger difference is in what people are searching for.

What's problematic with these types of claims is that they just come off as calling anyone who thinks differently dumb. It's as disconnected as saying "It's intuitive" in one breath and "You're holding it wrong" in another. It's a bad mindset to be in as an engineer because someone presents a problem and, instead of trying to address it, you dismiss it. If someone is holding it wrong, it probably isn't intuitive[0]. Even if they can't explain the problem correctly, they are telling you a problem exists[1]. That's like 80% of the job of an engineer: figuring out what the actual problem is.

As maybe an illustrative example, people joke that a lot of programming is "copy pasting from stack overflow". We all know the memes. There are definitely times where I've found this to be a close approximation to writing an acceptable program. But there are many other times where I've found that to be far from possible. There's definitely a strong correlation to what type of programming I'm doing, as in what kind of program I'm writing. Honestly, I find this categorical distinction not being discussed enough with things like LLMs. Yet, we should expect there to be a major difference. Frankly, there are just different amounts of information on different topics. Just like how LLMs seem to be better with more common languages like Python than with less common languages (and also just worse at more complicated languages like C or Rust).

[0] You cannot make something that's intuitive to all people. But you can make it intuitive for most people. We're going to ignore the former case because the size should be very small. If 10% of your users are "holding it wrong" then the answer is not "10% of your users are absolute morons" it is "your product is not as intuitive as you think." If 0.1% of your users are "holding it wrong" then well... they might be absolute morons.

[1] I think I'm not alone in being frustrated with the LLM discourse as it often feels like people trying to gaslight me into believing the problems I experience do not exist. Why is it so surprising that people have vastly differing experiences? *How can we even go about solving problems if we're unwilling to acknowledge their existence?*