


	by iNic 413 days ago
	The "mere token prediction" comment is wrong, but I don't think any of the other comments really explained why. Next-token prediction is not what the AI does, but its goal. It's like calling soccer a boring sport after only ever seeing the final scores. The important thing about LLMs is that they can internally represent many different complex ideas efficiently and coherently! This makes them an incredible starting point for further training. Nowadays no LLM you interact with is a pure next-token predictor anymore: they have all gone through various stages of RL so that they actually do what we want them to do. I really feel the magic when looking at the "circuit" work by Anthropic. It shows that these models have some internal processing / thinking that is complex and clever.
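
To make the "goal, not mechanism" distinction concrete, here is a minimal sketch of the next-token objective in plain Python. It is a toy, not anyone's actual training code: the names `next_token_loss`, `uniform_logits`, and `confident_logits` are made up for illustration, and the "models" are just hand-written logit tables. The point is that the loss only scores the predicted distribution at each position and says nothing about how those logits were computed internally.

```python
import math

def softmax(logits):
    """Turn a list of raw scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def next_token_loss(logits_per_position, token_ids):
    """Average cross-entropy of predicting token t+1 from the logits at position t.

    Hypothetical helper for illustration: the logits could come from any
    computation at all, since the objective only looks at the outputs.
    """
    losses = []
    for t in range(len(token_ids) - 1):
        probs = softmax(logits_per_position[t])
        target = token_ids[t + 1]
        losses.append(-math.log(probs[target]))
    return sum(losses) / len(losses)

# Toy setup (assumed for the example): 4-token vocabulary, one short sequence.
vocab_size = 4
token_ids = [0, 2, 1, 3]

# "Model" A: knows nothing, assigns equal scores to every token.
uniform_logits = [[0.0] * vocab_size for _ in token_ids]
loss_uniform = next_token_loss(uniform_logits, token_ids)

# "Model" B: puts a high score on the token that actually comes next.
confident_logits = [[0.0] * vocab_size for _ in token_ids]
for t in range(len(token_ids) - 1):
    confident_logits[t][token_ids[t + 1]] = 5.0
loss_confident = next_token_loss(confident_logits, token_ids)

print(f"uniform: {loss_uniform:.4f}  confident: {loss_confident:.4f}")
```

The uniform table scores exactly log(4), and the confident table scores lower. Anything that produces better logits lowers the loss, however simple or sophisticated the internal computation is.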


quonn 413 days ago

> that they can internally represent many different complex ideas efficiently and coherently

The Transformer Circuits work[0] suggests that this representation is not coherent at all.

[0] https://transformer-circuits.pub


iNic 413 days ago

I guess that depends on what you think is coherent. A key finding is that the larger the network, the more coherent the representation becomes. One example is that larger networks merge the same concept across different languages into a single concept (as humans do). The addition circuits are also fairly easy to interpret.


quonn 413 days ago

> merge the same concept

It's doing compression, which does not mean it's coherent.

> The addition circuits are also fairly easy to interpret.

The addition circuits make no sense whatsoever. It's doing great at guessing, that's all.


iNic 408 days ago

I am curious, what would you count as coherent? I think it is absolutely insane that we can open and understand what are essentially alien intelligences at all!
