by endorphine 1174 days ago
I can't understand how something called "a language model" can evaluate some input text. ChatGPT4 seems to understand the text of the first assistant and properly assesses that it wasn't actually a description of the trip.

Similarly, the other day I asked it to explain a hard-to-understand philosophy quote and it explained it properly (I believe).

Can someone explain how they do this? It seems magical.

2 comments

Take a look at https://github.com/jaymody/picoGPT/blob/a750c145ba4d09d57648...

Yes, this is GPT-2, not GPT-4, and it's not the RL-trained chat model, only the base GPT model. It's also basically only the inference part, not the training loop, and it's somewhat simplified.

Still, take a good look.

That's essentially what it is on a single sheet of paper.
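To make the "inference part" concrete, here is a minimal sketch of the generation loop in plain Python. This is not picoGPT's code. The vocabulary, the logit table, and the function names are all made up for illustration, and the "model" is a toy lookup on the last token where a real GPT would run a transformer over the whole context. The outer loop has the same shape though: turn token IDs into a probability distribution over the next token, pick one, append it, repeat.

```python
import math

# Hypothetical five-word vocabulary. The model only ever sees the integer IDs.
VOCAB = ["the", "cat", "sat", "down", "."]

# Hypothetical "weights": one row of logits per previous token ID.
# A real GPT computes these logits with a transformer instead of a table.
LOGITS = [
    [0.0, 3.0, 0.5, 0.1, 0.1],  # after "the"  -> favors "cat"
    [0.1, 0.0, 3.0, 0.5, 0.1],  # after "cat"  -> favors "sat"
    [0.2, 0.1, 0.0, 3.0, 0.5],  # after "sat"  -> favors "down"
    [0.1, 0.1, 0.1, 0.0, 3.0],  # after "down" -> favors "."
    [3.0, 0.1, 0.1, 0.1, 0.0],  # after "."    -> favors "the"
]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def next_token_probs(token_ids):
    """Toy stand-in for the model: looks only at the last token."""
    return softmax(LOGITS[token_ids[-1]])


def generate(token_ids, n_new_tokens):
    """Autoregressive loop: predict, pick, append, repeat."""
    ids = list(token_ids)
    for _ in range(n_new_tokens):
        probs = next_token_probs(ids)
        # Greedy decoding (an assumption of this sketch): take the most likely token.
        next_id = max(range(len(probs)), key=probs.__getitem__)
        ids.append(next_id)
    return ids


out = generate([0], 4)
print(out, " ".join(VOCAB[i] for i in out))
# prints: [0, 1, 2, 3, 4] the cat sat down .
```

Chat systems usually sample from the distribution rather than always taking the top token, but the loop stays the same.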

There is nothing specific to language in a "language model"; we just call it that. Better to just call it an LLM.

Nobody knows exactly what it learns, although there would be ways to poke around given some research programs. But interest in that seems limited at the moment; everyone is busy improving it or building applications.

Perhaps the answer is that we overestimated what a mind is. It's like how we used to ask what life is, and it turned out that there is nothing special about life; not even the DNA is controlling anything. It's merely a chemical process, albeit a complex one.

Personally, I believe there is something special about consciousness and I believe these systems are not conscious. And I believe they are currently not flexible enough to adapt to new circumstances, unlike a human brain. They don't currently make much real progress. But that may change.

Either it's a lifeless recombination of text from the Internet, or an emergent understanding achieved through working with trillions of text tokens, despite lacking access to the full range of the human sensorium informing that text.

The jury is still out.