| HN Mirror

by lappa 359 days ago

This isn't suggesting no one understands how these models are architected, nor is anyone saying that SDPA / matrix multiplication isn't understood by those who create these systems.

What's being said is that the result of training and the way in which information is processed in latent space is opaque.

There are strategies for dissecting a model's inner workings, but this is an active and incomplete field of research.

1 comment

Azkron 359 days ago

Whatever comes out of any LLM depends directly on the data you fed it and which answers you reinforced as correct. There is nothing unknown or mystical about it.

richardatlarge 358 days ago

The same could be said of people, which reveals the emptiness of this idea. Knowing the process at the mechanism level says nothing about the outcome. Some people output German, some English. Its sub-mechanisms are plastic and emergent.
