

by wangii 770 days ago

I feel it's a pretty dangerous optimization to make before we REALLY understand what's going on inside the LLM. E.g., people who believe in the geometric interpretation will have something to say, and it would probably hurt if you are using "filler" tokens.

Besides, the assumption (not a universal fact) of "forming complete sentences in mind before articulating word by word" seems to oversimplify the activities that happen in our mind: do we really have a complete plan before we start talking/typing? As a Buddhist, I lean towards it being an illusion. Furthermore, what about simultaneous thoughts? Are we linear thinkers at the sentence level?

Anyway, pretty neat math!

5 comments

renonce 770 days ago

The optimization does not affect the result of the LLM; it's guaranteed to produce results equivalent to decoding directly. Let's not treat the LLM as some magic that resembles our mind; it's just another program that produces sentences that happen to make sense.
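To illustrate the equivalence claim, here is a minimal sketch of Jacobi-style decoding, assuming greedy (argmax) decoding. `next_token` is a hypothetical toy stand-in for a model's argmax, not a real LLM, and the function names are made up for illustration. All positions of a guessed continuation are refined together until nothing changes, and that fixed point satisfies the same per-position rule that defines the greedy output.

```python
from typing import List, Sequence, Tuple

VOCAB_SIZE = 50  # hypothetical toy vocabulary size


def next_token(prefix: Sequence[int]) -> int:
    """Toy stand-in for an LLM's greedy argmax: a deterministic function of the prefix."""
    h = 17
    for t in prefix:
        h = (h * 31 + t + 7) % 1_000_003
    return h % VOCAB_SIZE


def greedy_decode(prompt: Sequence[int], n: int) -> List[int]:
    """Ordinary autoregressive decoding: one token per step."""
    out = list(prompt)
    for _ in range(n):
        out.append(next_token(out))
    return out[len(prompt):]


def jacobi_decode(prompt: Sequence[int], n: int) -> Tuple[List[int], int]:
    """Refine an n-token guess until it stops changing (a fixed point).

    Each position i is recomputed from the prompt plus the previous guess
    for positions < i. A real model would evaluate all n positions in one
    parallel forward pass; here they are computed in a plain loop.
    """
    guess = [0] * n  # arbitrary initial guess
    iterations = 0
    while True:
        iterations += 1
        new = [next_token(list(prompt) + guess[:i]) for i in range(n)]
        if new == guess:  # fixed point reached
            return guess, iterations
        guess = new


prompt = [3, 1, 4, 1, 5]
jacobi_out, steps = jacobi_decode(prompt, 8)
assert jacobi_out == greedy_decode(prompt, 8)
```

After k iterations at least the first k tokens agree with the greedy output, so the loop ends within n + 1 iterations. Any speedup comes from several tokens settling in the same iteration, which this toy function does not try to demonstrate.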

naasking 770 days ago

> Let's not treat the LLM as some magic that resembles our mind; it's just another program that produces sentences that happen to make sense.

"That happen to make sense" is hiding a lot of magic. It would be statistically impossible for LLMs to make as much sense as they do in response to prompts if they did not actually make semantic distinctions. If they make semantic distinctions, then they do resemble the human mind in at least one way.

wangii 770 days ago

According to the original Jacobi decoding paper, the setting is machine translation tasks with an encoder + decoder, in which the parallel algorithm is applied only to the decoder part.

sigmoid10 770 days ago

Let's not treat our mind as something magical. It's just another program that learned to speak by consuming lots of training input. The implementation might look slightly different from the outside, but from a mathematical perspective, artificial neural networks are proven to be at least as capable as the human mind.

baq 770 days ago

The best part is, your comment works both when sarcastic and completely serious.

ben-schaaf 769 days ago

> artificial neural networks are proven to be at least as capable as the human mind

Do you have a source for this? I know we have models of neural networks designed to act like neurons, but those aren't what're being used.

sigmoid10 766 days ago

See the universal approximation theorem for fully connected perceptrons.

ben-schaaf 763 days ago

That's really nowhere near enough of a proof. You'd need to prove that a human brain is equivalent to a mathematical function, and that that function can be sufficiently approximated by a NN to be functionally identical.

Additionally UAT doesn't actually prove NNs can approximate any function. Non-continuous functions and infinitely large domains aren't covered.
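For reference, one common one-hidden-layer form of the theorem, sketched here under the assumption of a continuous, non-polynomial activation σ, makes both restrictions visible: f has to be continuous and the domain K has to be compact.

```latex
% One-hidden-layer form; \sigma is assumed continuous and non-polynomial.
\forall f \in C(K),\ K \subset \mathbb{R}^n \text{ compact},\ \forall \varepsilon > 0:\quad
\exists N \in \mathbb{N},\ a_i, b_i \in \mathbb{R},\ w_i \in \mathbb{R}^n \ \text{such that}\quad
\sup_{x \in K} \Bigl| f(x) - \sum_{i=1}^{N} a_i \, \sigma(w_i^{\top} x + b_i) \Bigr| < \varepsilon
```

Note that the statement only asserts that suitable weights exist; it says nothing about the size of N or whether training would find them.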

xpe 769 days ago

Define ‘capable’ and most of the confusion and potential controversy goes away.

Etheryte 770 days ago

That assumption might be useful in this context, but I think it's pretty clearly not true. Ask anyone to tell you about a complex past event with a lot of parallel branches and you'll quickly see them add bits, pieces and tangents midsentence to cover the full range of events. I don't think I've seen the sentence granularity hypothesis in any serious scientific context before.

hatthew 769 days ago

Can't speak for everyone but I definitely don't mentally form complete sentences before talking. Sometimes I grammatically talk myself into a corner in the middle of a sentence and need to use some awkward words/phrases to finish my thought, or simply pause and restart the phrase from the beginning.

nomel 769 days ago

I feel surprisingly disconnected from my speaking self, acting as more of an observer, who is sometimes surprised at what I come up with. It just flows. I feel I have very little need for input.

But, I also feel fairly disconnected from my thinking self. I point my attention at something and solutions usually just pop out, maybe with some guidance/context forming required, in the form of internal dialog, which is usually of a rubber ducky style format [1], or mental testing of that mostly spontaneous solution.

I feel the "real" me is the one sensing/observing, which includes the observing of those spontaneous solutions, and what I say.

[1] Works with any problem space, not just coding "debugging": https://rubberduckdebugging.com/

wangii 769 days ago

Are you practicing any meditation? It's regarded as an "awakened" state in some practices! If you have any method, please share it with me! Thanks!

int_19h 769 days ago

We don't appear to be forming words sequentially from underlying parts, even though in many languages they can be broken down into smaller units that carry semantic meaning themselves. There doesn't seem to be any clear reason for this to break down suddenly at the sentence level.

causal 770 days ago

What is the geometric interpretation?