HN Mirror

by ncarlson 411 days ago

> AI we don’t have a model.

So, some engineers just stumbled upon LLMs and said, "Holy smokes, we've created something impressive, but we really can't explain how this stuff works!"

We built these things. Piece by piece. If you don't understand the state-of-the-art architectures, I don't blame you. Neither do I. It's exhausting trying to keep up. But these technologies, by and large, are understood by the engineers that created them.

5 comments

ijidak 411 days ago

Not true. How the higher level thought is occurring continues to be a mystery.

This is an emergent behavior that wasn’t predicted prior to the first breakthroughs which were intended for translation, not for this type of higher level reasoning.

Put it this way, if we truly understood how LLMs think perfectly we could predict the maximum number of parameters that would achieve peak intelligence and go straight to that number.

Just as we now know exactly the boundaries of mass density that yield a black hole, etc.

The fact that we don’t know when scaling will cease to yield new levels of reasoning means we don’t have a precise understanding of how the parameters are yielding higher levels of intelligence.

We’re just building larger and seeing what happens.

twelve40 411 days ago

> if we truly understood how LLMs think perfectly we could predict the maximum number of parameters that would achieve peak

It's a bit of a strange argument to make. We've been making airplanes for 100+ years, we understand how they work, and there is absolutely no magic or emergent behavior in them. Yet even today nobody can instantly produce a perfectly shaped airframe; it's still a very long and complicated process of calculations and wind tunnel tests, basically trial and error. That doesn't mean we don't understand how airplanes work.

Workaccount2 411 days ago

Fractals are a better analogy: a simple equation that, when iterated, gives these fantastically complex patterns. Even knowing the equation, you could spend years investigating why boundaries between unique fractal structures appear where they do, and why they melt from arches to columns and spirals.

In a similar way we know the framework of LLMs, but we don't know the "fractal" that grows from it.
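
A minimal sketch of the kind of iteration described here, assuming the Mandelbrot map z → z² + c as the "simple equation" (the comment doesn't name a specific fractal). The function names `escape_time` and `render`, and the viewing window, are illustrative choices, not anything from the thread.

```python
def escape_time(c: complex, max_iter: int = 50) -> int:
    """Count iterations of z -> z*z + c (starting at z = 0) before |z| exceeds 2.

    Returns max_iter if the orbit has not escaped by then.
    """
    z = 0j
    for n in range(max_iter):
        if abs(z) > 2.0:
            return n
        z = z * z + c
    return max_iter


def render(width: int = 60, height: int = 24, max_iter: int = 50) -> list[str]:
    """ASCII picture: '#' where the orbit stays bounded, '.' where it escapes."""
    rows = []
    for j in range(height):
        y = -1.2 + 2.4 * j / (height - 1)       # imaginary axis, -1.2 .. 1.2
        row = ""
        for i in range(width):
            x = -2.0 + 2.6 * i / (width - 1)    # real axis, -2.0 .. 0.6
            bounded = escape_time(complex(x, y), max_iter) == max_iter
            row += "#" if bounded else "."
        rows.append(row)
    return rows


if __name__ == "__main__":
    print("\n".join(render()))
```

The whole rule is the single line `z = z * z + c`; everything intricate about the boundary comes from repeating it, which is the gap between knowing the framework and knowing what grows from it.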

ninetyninenine 411 days ago

It’s not a strange argument. You just lack insight.

The very people who build LLMs do not know how they work. They cannot explain them. They admit they don’t know how they work.

Ask an LLM to generate a poem. No one on the face of the earth can predict what poem the LLM will generate, nor can they explain why that specific poem was generated.

ncarlson 411 days ago

> How the higher level thought is occurring continues to be a mystery. This is an emergent behavior that wasn’t predicted prior to the first breakthroughs which were intended for translation, not for this type of higher level reasoning.

I'm curious what you mean by higher level thought (or reasoning). Can you elaborate or provide some references?

ninetyninenine 411 days ago

The analogy used to build artificial neural networks is statistical prediction: fitting a best-fit curve.

All techniques to build AI stem from an understanding of AI from that perspective.

The thing is… That analogy applies to the human brain as well. Human brains can be characterized as a best-fit curve in a multidimensional space.

But if we can characterize the human brain this way, does that mean we completely understand the human brain? No. There is clearly another perspective, another layer of abstraction, that we don’t fully comprehend. Yes, when the human brain responds to a query it is essentially plugging the input into a curve function and producing an output, but even granting this, a certain perspective is clearly missing.

The human brain is clearly different from an LLM, BUT the insight we lack about the human brain is the same insight we lack about the LLM. Both intelligences can be characterized as a multidimensional function, but so far we can’t understand anything beyond that. This perspective that we can't understand or characterize can be thought of as a higher level of abstraction... a different perspective.

https://medium.com/@adnanmasood/is-it-true-that-no-one-actua...
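
For readers unfamiliar with the "best-fit curve" framing, here is a minimal sketch in one dimension: an ordinary least-squares line fit, then a prediction for an unseen input. This is only a toy under stated assumptions. Real networks fit nonlinear functions over very many dimensions, typically by gradient descent rather than a closed-form solution, and the names `fit_line` and `predict` are illustrative.

```python
def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Closed-form least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # assumes xs are not all identical
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(params: tuple[float, float], x: float) -> float:
    """'Plug the input into the curve' and read off an output."""
    slope, intercept = params
    return slope * x + intercept


if __name__ == "__main__":
    params = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
    print(params, predict(params, 10.0))
```

Here the two fitted numbers are fully interpretable. The claim in the comment is that this readability does not carry over once the "curve" has billions of parameters.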

ninetyninenine 411 days ago

The engineers who built these things don’t actually understand how they work. Literally. In fact, you can ask them and they say this readily. I believe the CEO of Anthropic is quoted as saying this.

If they did understand LLMs why do they have so much trouble explaining why an LLM produced certain output? Why can’t they fully control an LLM?

These are algorithms running on computers, which are deterministic machines that in theory we have total and absolute control over. The fact that we can’t control something running on this type of machine points to the sheer complexity of the thing we are trying to run, and to our lack of understanding of it.

ninetyninenine 410 days ago

Put it this way, Carlson. If you were building LLMs, if you understood machine learning, if you were one of the engineers who work at OpenAI, you would agree with me.

The fact that you don’t agree indicates you literally don’t get it. It also indicates you aren’t in any way an engineer who works on AI, because what I am talking about here is an unequivocal viewpoint, universally held by the very people who build these things.

stevenhuang 411 days ago

> But these technologies, by and large, are understood by the engineers that created them.

Simply incorrect. Look into the field of AI interpretability. The learned weights are black boxes; we don't know what goes on inside them.

Workaccount2 411 days ago

Models are grown, not built. The ruleset is engineered and the training framework is built, but the model itself, which grows through training, is incredibly dense and complex.
