	by Balinares 260 days ago
	That's a good read also shared by another poster above, thanks! If I'm reading this right, it contextualizes, but doesn't negate the findings from that paper. I've got a major aesthetic problem with the fact LLMs require this much training data to get where they are, namely, "not there yet"; it's brute force by any other name, and just plain kind of vulgar. Although more importantly it won't scale much further. Novel architectures will have to feature in at some point, and I'll gladly take any positive result in that direction.

ACCount37 260 days ago

Evolution is brute force by any other name. Nothing elegant about it. Nonetheless, here you are.

Poor sample efficiency of the current AIs is a well known issue - but you should keep in mind what kind of grisly process was required to give you the architecture that makes you as sample efficient as you are.

We don't know yet what kind of architectural quirks enable this sample efficiency in the human brain. It could be something like a non-random initialization process that confers the right inductive biases, a more efficient optimizer, recurrent background loops... or just more raw juice.

It might be that one biological neuron is worth 10000 LLM weights, and a big part of how the brain is so sample efficient is that it's hilariously overparametrized.

nurettin 260 days ago

Brute force:

    # Brute force: enumerate every candidate until the target turns up.
    for i in range(1, 100_000_000):
        if i == 66_666_654:
            print(i)
            break

GA:

    # pop, heuristic_fn, tournament and crossover are assumed to be defined elsewhere.
    for g in range(1, 101):
        pop, best = crossover(tournament(pop, heuristic_fn))
        print(best.value)
        if best.fitness < 0.01:
            break

GA uses a heuristic to converge. If that is brute force, so is binary search.
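For anyone who wants to run the comparison, here is a minimal sketch on a toy problem. Everything in it is an assumption made for illustration: the target value, the 20-bit search space, the distance-based `fitness`, and the particular `tournament`, `crossover` and `mutate` helpers are hypothetical stand-ins, not the functions from the pseudocode above.

```python
import random

TARGET = 666_654   # hypothetical toy target
BITS = 20          # search space: integers in [0, 2**20)

def fitness(x):
    """Distance to the target; 0 means solved. This is the heuristic signal."""
    return abs(x - TARGET)

def brute_force():
    """Enumerate candidates in order; returns (answer, candidates inspected)."""
    for inspected, x in enumerate(range(2**BITS), start=1):
        if fitness(x) == 0:
            return x, inspected
    raise ValueError("target not in search space")

def tournament(pop, rng, k=3):
    """Pick k random individuals and keep the fittest one."""
    return min(rng.sample(pop, k), key=fitness)

def crossover(a, b, rng):
    """Single-point crossover: high bits from a, low bits from b."""
    mask = (1 << rng.randrange(1, BITS)) - 1
    return (a & ~mask) | (b & mask)

def mutate(x, rng, rate=0.05):
    """Flip each bit independently with probability `rate`."""
    for bit in range(BITS):
        if rng.random() < rate:
            x ^= 1 << bit
    return x

def genetic_search(generations=100, pop_size=50, seed=0):
    """Returns (best individual, best-fitness history, candidates generated)."""
    rng = random.Random(seed)
    pop = [rng.randrange(2**BITS) for _ in range(pop_size)]
    candidates = pop_size
    best = min(pop, key=fitness)
    history = [fitness(best)]
    for _ in range(generations):
        children = [best]  # elitism: the current best always survives
        while len(children) < pop_size:
            child = crossover(tournament(pop, rng), tournament(pop, rng), rng)
            children.append(mutate(child, rng))
        candidates += pop_size - 1
        pop = children
        best = min(pop, key=fitness)
        history.append(fitness(best))
        if fitness(best) == 0:
            break
    return best, history, candidates
```

The brute-force loop is guaranteed to hit the target but walks the space in order. The GA generates at most a few thousand candidates here and its best fitness never gets worse (because of the elitism), but nothing guarantees it lands exactly on the target.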

maleldil 260 days ago

> If that is brute force, so is binary search.

Binary search is guaranteed to find the target if it exists, so it's not a heuristic. GA isn't, as it can get stuck in local minima. However, I agree that GA isn't brute force.
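A small sketch of the difference, under stated assumptions: `hill_descend` is a plain greedy local search used as a stand-in for any heuristic optimizer that can stall (it is not a GA), and `bumpy` is a made-up landscape with one shallow valley and one deep one.

```python
from bisect import bisect_left

def binary_search(sorted_items, target):
    """Return the index of target in sorted_items, or None if it is absent."""
    i = bisect_left(sorted_items, target)
    if i < len(sorted_items) and sorted_items[i] == target:
        return i
    return None

def hill_descend(f, x, lo, hi):
    """Greedy integer descent: step to the better neighbour until none improves."""
    while True:
        neighbours = [n for n in (x - 1, x + 1) if lo <= n <= hi]
        best = min(neighbours, key=f)
        if f(best) >= f(x):
            return x  # no neighbour is strictly better: a (possibly local) minimum
        x = best

def bumpy(x):
    """Hypothetical landscape: local minimum at x=2 (value 5), global minimum at x=8 (value 0)."""
    return min((x - 2) ** 2 + 5, (x - 8) ** 2)

print(binary_search([1, 3, 5, 7, 9], 7))   # 3: found whenever the target is present
print(hill_descend(bumpy, 0, 0, 10))       # 2: stuck in the shallow valley
print(hill_descend(bumpy, 6, 0, 10))       # 8: reaches the global minimum from a luckier start
```

Binary search returns the right answer regardless of where it starts. The local search's answer depends on the starting point, which is the "stuck in local minima" failure mode.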

nurettin 260 days ago

Heuristic just means there is a function telling you where to go. For A* it is the goal, for binary search it is lte, for gradient descent it is Adam.

sdenton4 260 days ago

Yeaaaaaah, I kinda doubt there's much coming from evolutionary biases.

If it's a matter of clever initialization bias, it's gotta be pretty simple to survive the replication via DNA and procedural generative process in the meat itself, alongside all of the other stuff which /doesn't/ differentiate us from chimpanzees. Likely simple enough that we would just find something similar ourselves through experimentation. There are also plenty of examples of people learning Interesting Unnatural Stuff using their existing hardware (e.g., echolocation, haptic vision, ...), which suggests generality of learning mechanisms in the brain.

ACCount37 260 days ago

The brain implements some kind of fairly general learning algorithm, clearly. There's too little data in the DNA to wire up 90 billion neurons the way we can just paste 90 billion weights into a GPU over a fiber optic strand. But there's a lot of innate scaffolding that actually makes the brain learn the way it does. Things like bouba and kiki, instincts, all the innate little quirks and biases - they add up to something very important.
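A back-of-the-envelope version of the "too little data in the DNA" point, as a rough sketch. The genome length (about 3.1 billion base pairs, 2 bits each, ignoring any compression or redundancy) and the 16-bit weight size are assumptions for illustration; the 90 billion figure is the one used above.

```python
# Rough information budget: genome vs. a hypothetical per-neuron wiring spec.
GENOME_BASE_PAIRS = 3.1e9   # approximate haploid human genome length (assumption)
BITS_PER_BASE = 2           # four possible bases: A, C, G, T
NEURONS = 90e9              # figure used in the comment above
BYTES_PER_WEIGHT = 2        # assumed 16-bit weights

genome_bits = GENOME_BASE_PAIRS * BITS_PER_BASE
genome_megabytes = genome_bits / 8 / 1e6          # about 775 MB
bits_per_neuron = genome_bits / NEURONS           # well under 1 bit per neuron
weights_gigabytes = NEURONS * BYTES_PER_WEIGHT / 1e9  # about 180 GB for 90e9 weights

print(f"genome: ~{genome_megabytes:.0f} MB")
print(f"genome bits per neuron: ~{bits_per_neuron:.3f}")
print(f"90 billion 16-bit weights: ~{weights_gigabytes:.0f} GB")
```

Under these assumptions the whole genome carries less than one bit per neuron, so whatever it encodes has to be a compact generative recipe rather than an explicit wiring diagram.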

For example, we know from neuroscience that humans implement something not unlike curriculum learning - and a more elaborate version of it than what we use for LLMs now. See: sensitive periods. Or don't see sensitive periods - because if you were born blind, but somehow regained vision in adulthood, it'll never work quite right. You had an opportunity to learn to use the eyes well, and you missed it.

Also, I do think that "clever initialization" is unfortunately quite plausible. Unfortunately - because yes, it has to be simple enough to be implemented by something like a cellular automaton, so the reason why we don't have it already is that the search space of all possible initializations a brain could implement is still extremely vast and we're extremely dumb. Plausible - because of papers like this one: https://arxiv.org/abs/2506.20057

If we can get an LLM to converge faster by "pre-pre-training" it on huge amounts of purely synthetic, algorithmically generated meaningless data? Then what are the limits of methods like that?

alganet 260 days ago

> Evolution is brute force by any other name.

No, it's not.
