HN Mirror


by davibu 1276 days ago

The criticisms regarding chatGPT remind me of what was said about Wikipedia at its very beginning, that it was supposedly unreliable. I think we will have a good laugh in a few years reading these first comments.

There is no doubt that chatGPT is the future. It can certainly be improved, but what already exists is a revolution in progress.

In my opinion, there are two essential things missing for chatGPT to become the perfect replacement for Wikipedia and Google:
- The ability to activate a "system 2" or slow thinking (theorized by Daniel Kahneman)
- The ability to cite sources

And the cherry on the cake would be the ability to interact with images.

5 comments

HarHarVeryFunny 1276 days ago

I think the BS-generation problem with ChatGPT goes far deeper than citing sources, for a variety of reasons.

1) It's not a search engine, even if it behaves a bit like one. It's not "retrieving answers" to your questions (from sources that it could choose to cite). ChatGPT is really just a "language model", so it has no notion that what you're typing is even a question or query. Your input is just treated as a sequence of words (which ChatGPT has zero understanding of), and ChatGPT's response is a further sequence of words that it has calculated to be one statistically probable continuation of what you typed. You can keep asking it for alternative answers, and it will keep generating alternative statistically probable continuations.

The websites, etc. that ChatGPT was trained on are just sources of language that it consumed in order to learn the statistics that let it make these continuation predictions. It's not memorizing "facts" from websites, just word statistics, and these are mixed in with the statistics from all the other sources it was trained on. If it generates the word "walk" as part of a response, it can't cite a source for that since there essentially is none - only a bazillion text sources it was trained on that collectively made the word "walk" a high-probability continuation of the words it had generated leading up to that...
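To make the "statistically probable continuation" idea concrete, here is a toy sketch. It is not how ChatGPT is actually built (that is a large neural network working on tokens, not a table of word-pair counts); it is just a bigram counter over a made-up three-sentence corpus. The corpus, the function names, and the `max_words` parameter are all invented for illustration.

```python
import random
from collections import Counter, defaultdict

# Tiny made-up corpus standing in for the training text (hypothetical data).
CORPUS = [
    "i like to walk in the park",
    "i like to walk to work",
    "i like to read in the park",
]

def train_bigrams(sentences):
    """Count, for every word, which words followed it and how often."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    return counts

def next_word_probs(counts, word):
    """Turn the follow-up counts for `word` into probabilities."""
    followers = counts[word]
    total = sum(followers.values())
    return {w: n / total for w, n in followers.items()}

def continue_text(counts, prompt, max_words=5, rng=None):
    """Extend the prompt by repeatedly sampling a probable next word."""
    rng = rng or random.Random()
    words = prompt.split()
    for _ in range(max_words):
        probs = next_word_probs(counts, words[-1])
        if not probs:  # no known continuation for this word: stop
            break
        choices, weights = zip(*probs.items())
        words.append(rng.choices(choices, weights=weights)[0])
    return " ".join(words)

counts = train_bigrams(CORPUS)
# The counts for "to" are merged across all sentences; nothing records
# which sentence contributed "walk".
print(next_word_probs(counts, "to"))  # {'walk': 0.5, 'work': 0.25, 'read': 0.25}
# Running this repeatedly can give different, equally "probable" continuations.
print(continue_text(counts, "i like"))
```

Even in this toy, there is nothing to cite when it emits "walk": the probability came from pooled counts, and the table keeps no link back to any individual sentence.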

2) Even if ChatGPT had been designed to deal in "facts" (rather than word statistics) associated with specific sources, the bullshit problem isn't just knowing the varied reliability of the sources it was trained on, but how those "facts" are combined. To combine multiple facts and correctly deduce something new from them would require intelligence, but ChatGPT doesn't have any intelligence - it's just a statistical word generator, so the way it combines snippets from different sources is again just statistical word generation, with zero knowledge of the meaning of the words it is generating or whether it makes sense!

What makes ChatGPT seem semi-intelligent is that a lot of what it was trained on was text written by semi-intelligent humans, so the "sequence of words" it is generating, following the statistics of human speech, seems like something a human might say... until you start paying attention to the meaning of the words and realize it's often good-sounding garbage.