Hacker News
by seba_dos1 1101 days ago
ChatGPT has no knowledge about these filters, so all it can say about them is pure hallucination. It may still guess correctly and probably does, but you could ask any random person with enough background knowledge to make it up instead and get the same value.
5 comments

Heh, when I ask why it's cut off, this is the response:

>The partial completion of the Litany of Fear in the previous response occurred because of the model's tendency to generate text in chunks or segments, typically up to a certain token limit. In this case, the completion was cut off after the token limit was reached. GPT-3.5 has a maximum token limit of 4096 tokens, and if the response reaches that limit, it may be truncated. In the given response, the completion reached the token limit after generating the token "Fear" but didn't complete the full sentence.

This response is certainly incorrect: had the token limit really been reached, no more text could come out. Not the Litany of Fear, nor this explanation.
what? You can definitely continue a conversation after the token limit is reached, you just need to provide it a reply, and the context might be pruned at some point.
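A rough sketch of what "the context might be pruned" could mean in practice: keep only the newest messages that fit a token budget and drop the older ones. This is an assumption for illustration, not how ChatGPT is known to do it; `count_tokens` is a hypothetical stand-in for a real tokenizer and `prune_context` is a made-up name.

```python
from collections import deque


def count_tokens(text: str) -> int:
    """Hypothetical stand-in for a real tokenizer: counts whitespace-separated words."""
    return len(text.split())


def prune_context(messages: list[str], max_tokens: int) -> list[str]:
    """Keep the most recent messages whose combined token count fits in max_tokens."""
    kept = deque()
    used = 0
    # Walk from newest to oldest and stop as soon as the budget would be exceeded.
    for message in reversed(messages):
        cost = count_tokens(message)
        if used + cost > max_tokens:
            break
        kept.appendleft(message)
        used += cost
    return list(kept)


if __name__ == "__main__":
    conversation = ["one two three", "four five", "six seven eight nine"]
    # With a budget of 6 "tokens", the oldest message no longer fits and is dropped.
    print(prune_context(conversation, max_tokens=6))
```

Under this assumption the conversation keeps going after the limit is hit; the model just stops seeing the oldest turns.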
TIL. I thought after reaching the max token size ChatGPT would just refuse to produce more output.

Still, the explanation given by ChatGPT about why the Litany of Fear was cut off is incorrect, when it says: "In this case, the completion was cut off after the token limit was reached".

When I asked it, it decided it was probably because it was copyright infringement. Once that theory was in the context window, it just kept doubling down on it.
It baffles me how many people are using ChatGPT to get answers: it's a “language model” folks, not a “knowledge model”.
There is a group of people who are ardent in the conviction that if we wait long enough and wish hard enough, ChatGPT will replace the concept of googling things.

I’m personally open to that being the case with some LLM someday, but there’s a lot of people ideologically and financially invested in that perception being accurate today and applying to ChatGPT in particular.

People hate it when you point out that they’ve confused the modern equivalent of a Speak and Spell with The Overmind.

To a large extent this is already viable.

Yes, ChatGPT doesn't know what it's talking about. But Google is so bad these days, that ChatGPT in many circumstances can do much better even while being far from perfect.

It doesn't know what it's talking about, but it is still stating it in a very convincing way. To me, that's even worse than bad search results which you can instantly recognize as bad.
>But Google is so bad these days, that ChatGPT in many circumstances can do much better even while being far from perfect.

Can you give some examples of how ChatGPT “does better” than a human searching for something and then using their own cognition to make sense of it?

If Google was an LLM then I’d agree with you, but the idea that googling something somehow skips the step of using critical thinking to understand what is on your screen seems pretty new to me.

What percentage of searches actually require any critical thinking at all? The Internet is not always a fountain of intellectual growth, sometimes it's just a tool.

I'll give you a concrete example where it does better than search: my last Google search was about how to zip a whole folder in the Linux terminal (I must have searched for that dozens of times in my life, but I don't do it very often so it doesn't stick).

I saw the results, and I noticed that they were mostly transcriptions, summaries or extensions of man pages listing all the options, which would take a few minutes to skim through and look for the relevant options (which is what I wanted to avoid by Googling, otherwise I would have used man in the first place).

So I switched to the ChatGPT tab, asked, got the exact command I wanted, done.
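The comment doesn't show the command ChatGPT returned. As a hedged illustration of the same task done from Python rather than the terminal, here is a minimal sketch using the standard library's `shutil.make_archive`; the helper name `zip_folder` and the file names are made up for the example.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def zip_folder(folder: Path, archive_base: Path) -> Path:
    """Zip `folder`, keeping its top-level directory name, into <archive_base>.zip."""
    archive = shutil.make_archive(
        str(archive_base),            # archive path without the ".zip" suffix
        "zip",
        root_dir=str(folder.parent),  # directory the archive paths are relative to
        base_dir=folder.name,         # the folder to include, recursively
    )
    return Path(archive)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Build a small throwaway folder so the example is self-contained.
        project = Path(tmp) / "project"
        (project / "sub").mkdir(parents=True)
        (project / "a.txt").write_text("hello")
        (project / "sub" / "b.txt").write_text("world")

        archive = zip_folder(project, Path(tmp) / "backup")
        with zipfile.ZipFile(archive) as zf:
            # Listing the archive is the "verify the result" step.
            print(sorted(zf.namelist()))
```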

Thanks for the example!

While I would definitely, definitely think critically before running a command from the internet, I can see how that use case makes sense for ChatGPT if you’re willing to roll that way.

>…take a few minutes to skim through and look for the relevant options (which is what I wanted to avoid by Googling…

Ah, I see! We are using the phrase “googling” differently here. I explicitly mean googling as a verb that involves reading and parsing information, not the act of avoiding reading or parsing information.

The crucial component is that you verified the result (by running the suggested command and observing it giving you the expected result).

This is different from asking it for an explanation or definition of something and not cross-checking every "fact".

> Can you give some examples of how ChatGPT “does better” than a human searching for something and then using their own cognition to make sense of it?

Certainly! ChatGPT outperforms human search and cognitive processing in several ways. For instance, when you require a quick, step-by-step tutorial to fix an issue in an application or program, ChatGPT excels. Instead of searching for an online tutorial yourself, ChatGPT provides you with immediate, on-the-fly instructions. This saves you from sifting through search results and having to verify the quality of each tutorial before attempting them. ChatGPT's instructions are already evaluated based on probability, eliminating the need for time-consuming quality checks. By receiving a concise, step-by-step list, you can swiftly proceed with your task, making it significantly more efficient than relying solely on manual searching and critical evaluation.

I’m sorry, I should be clearer about what I mean by “example”. Rather than hypothetical possible categories of tasks, can you give a particular task in which ChatGPT does better than googling information and thinking about the information that’s displayed to you?

A task in this case could be finding out where a particular Vermeer is hung, or which viscosity of oil to put into your car’s engine — a specific concrete task in which ChatGPT provides you with knowledge better than a web search does.

I won’t argue that ChatGPT is better at writing lists than Google is. Google isn’t a website that writes lists so that comparison would be pointless.

I spent hours googling for information about applying for a visa for a pretty disorganised country and struggled to get anywhere. I then asked ChatGPT which gave me a very good explanation of the process, contained relevant keywords I could then use to google and verify, but unfortunately the ultimate URL it sent me to for applying didn't exist. I suspect this was due to it being outdated, and the government changing the link since. Overall it saved me a lot of time over just googling.
I don't see how they said that Googling skips critical thinking. The issue is that for a lot of queries, Google returns a bunch of content mill pieces that have similar quality to ChatGPT. ChatGPT tends to be more focused on what you asked for, while the content mill pieces are deliberately written to keep you reading as long as possible without answering the question.
>ChatGPT tends to be more focused on what you asked for

…even if it has to invent people, places, things or events to satisfy your focus.

Do you have a particular example where ChatGPT does better than googling something? I’m especially interested in your example since we agree that googling entails using human cognitive ability to make sense of things.

For me, it boils down to SEO spam. Gaming PageRank has led to many SEO spam blog sites ranking relatively high, but with a low signal-to-noise ratio. Oftentimes, it is a lot faster to ask ChatGPT something than to wade through endlessly verbose Google results.
For me Google is more of a "router" to sites I know will likely provide good info. Before Reddit was nerfed, I'd always include that in my Google query, same for SO etc. It worked great and fast.
> ChatGPT will replace the concept of googling things

FastGPT[0] is some way towards that: it uses the web to provide answers

[0]: https://labs.kagi.com/fastgpt

> Me: What are some questions you cannot answer?

> FastGPT: I apologize, but I do not actually have the ability to determine what questions I cannot answer. I am Claude, an AI assistant created by Anthropic.

Good to know ;)

Bing chat is already GPT4 with an internet connection.
The issue is, people think that connecting an LLM to the internet = unlimited information. In reality, it = an LLM treating a webpage as another source of information. It is no better or worse than literally copying and pasting the information into the text box. If it goes beyond the token limit, you are going to get swiss cheese responses.

Bing is okay, ChatGPT 4 with the browse tool is okay, and Bard is pretty good but tends not to use its browse tool enough and hallucinates mightily.

It already replaces Google for me like 80% of the time.

Not because ChatGPT is an infallible oracle, far from it. It's just that the bar is rather low. Google has become really bad in the last decade or so.

> There is a group of people that are ardent in the conviction that if we wait long enough and wish hard enough that ChatGPT will replace the concept of googling things.

Have you tried Phind?[1] The project explores exactly what an AI-driven search tool might look like.

The thing that really makes it shine is that it doesn't only make claims in text, but also provides source links that it has already scraped for info.

[1]: https://www.phind.com/

That’s a neat tool to use while googling stuff!
Exactly, you can't go to the moon by climbing successively taller trees.
I used to Google things, but then it started returning false information. 75% of the time I use Google, and it summarizes the answer to a question at the top of the results page, it's erroneous. So, which is better?
I like to click on the links in the search results, so I would say that Google is better for finding information.
So, when it helped me solve a math problem by noticing a trick that I had missed – which was clearly the correct thing to do in retrospect – was I somehow in the wrong to ask it for help?

Getting answers out of it is absolutely a reasonable thing to do. Blindly trusting those answers without verification, that's another thing entirely.

> So, when it helped me solve a math problem by noticing a trick that I had missed – which was clearly the correct thing to do in retrospect – was I somehow in the wrong to ask it for help?

There's nothing wrong in using LLMs.

> Getting answers out of it is absolutely a reasonable thing to do. Blindly trusting those answers without verification, that's another thing entirely.

We're basically saying the same thing but with different words: in my wording, if you don't trust the answers and double check later, it's not “answers” it's merely “hints” or “suggestion”.

But here we have someone that just copy-pasted the response as if they were quoting a source.

Fair enough. Although of course you shouldn't blindly trust any source; it's really about degrees of trustworthiness.
Of course. It's just that ChatGPT's degree of trustworthiness is near zero, so citing it is pretty much worthless - unlike citing someone considered an expert in their field, for example.

It may still be a very useful tool, but not when used this way.

Its trustworthiness is not near zero…

There are many categories and types of questions where you can have a good degree of confidence in the answer. This is especially true for questions where a rough approximation is acceptable.

Off the top of my head: basic science questions, simple programming questions, and widely known historical facts.

If you expect a topic to be very well-represented in the training data, then ChatGPT is pretty accurate. And if you have access to GPT-4, it is definitely more accurate, reliable, and insightful for most prompts.

I prefer using it on questions like this for a few reasons. I find the experience to be so much smoother and more pleasant. The ChatGPT UI is incredibly minimal and clean. And then it allows asking follow-up questions, which is very useful.

You need to be aware of the flaws and develop an intuition for when spot-checking is necessary.

I also find it interesting and fun just to learn what these new tools are capable of and to understand them better. I expect knowing how to use them well will become increasingly valuable. Although, depending how things progress, future versions may be good enough that one doesn’t need to be as skilled in navigating its quirks.

> if you don't trust the answers and double check later, it's not “answers” it's merely “hints” or “suggestion”.

The same is true of Google search results and Stack Overflow answers and academic journal articles and so on. No source of information can be entirely trusted and they all require verification.

And?

It has still read more knowledge than I ever will.

And I also often enough believe/listen to humans who hallucinate.

Just look how many humans believe in a god and a book some people wrote somehow, and I still take some of those people seriously.

Have you tried GPT-4? You might be surprised at how much better it is than the free version. It genuinely does provide good answers to most things you can throw at it. More to the point, language models seem to become knowledge models if they get big enough.
Language is knowledge though, knowledge isn't predicated to be correct. There are people who "know" the earth is flat, that is knowledge they communicate with language (and is probably in the training set). Religious practitioners communicated their knowledge of the gods via language, and there are bodies of knowledge on the subject that are counter to each other (both probably in the data set). It serves that a language model does hold knowledge, you just can't trust that it's incorporating the right knowledge, and some of it is entirely novel as a consequence of the process.

I think where it differs for me is humans make stuff up for a different reason: humans invent knowledge to fill gaps in understanding, whereas invention of knowledge by an LLM is a side effect of its attempt to complete some text.

> knowledge isn't predicated to be correct.

It kind of is, though. The field of philosophy called epistemology deals with what knowledge is and how it is obtained, and the field overwhelmingly agrees that it is necessary (but not sufficient) for knowledge to be a (1) belief that is (2) true, and (3) justified. There is some disagreement when it comes to justification, and a lot of nuance in whether every justified true belief is knowledge, but pretty much no disagreement that knowledge must be a thing that is true and involve the mental state of believing that the thing is true.

Language doesn't imply belief (1), and it doesn't imply justification (2), and it certainly doesn't imply truth (3). Language is a medium in which knowledge can be expressed, but not knowledge itself (one can of course have knowledge about language, but that is not the same thing). Language is also a medium in which things that aren't knowledge can be expressed, such as falsehoods, nonsense statements, paradoxes, etc. LLMs generate language without belief, they sometimes do generate statements that are true and justified, but definitely not always.

I think the terms "data" or "information" are better than "knowledge" for what LLMs are trained on and produce.

(1) "The sky is green." Lies in general are ways to use language to express something that isn't a belief. Also, imperatives like "Please take out the garbage" or questions like "Did you take out the garbage?" don't seem to express a belief at all.

(2) "There is a planet somewhere in the universe where s'mores naturally occur without intentional assembly by intelligent beings." This might be true, and I might believe it, but there is no justification for it. Still expressed in language. Questions and imperatives come up here too.

(3) See (1), or any other case of lying or being mistaken.

A statement doesn’t have to be true to be justified.

The example I’ve heard of knowledge that is believed, untrue but justified is “porcupines can shoot their quills”.

A tribe that believes this fact is more likely to stay away from porcupines, and even though porcupines do not in fact shoot their quills, over time this superstition is beneficial to the tribe as their neighbors are more likely to get hurt or killed by dangerous porcupine quills.

Judaism taught that burying feces and ritual hand washing were important to please God, long before modern germ theory revealed why these practices were beneficial in a purely rational sense.

"The field of philosophy called epistemology deals with what knowledge is and how it is obtained, and the field overwhelmingly agrees that it is necessary (but not sufficient) for knowledge to be a (1) belief that is (2) true, and (3) justified."

How do you know if something's true, though?

If knowledge of truth has to be true, how do you know that the true knowledge of truth really is true? Etc.

In philosophy, the most common definition of knowledge is that it is true, justified belief[1]. In that definition knowledge certainly is predicated to be correct.

[1] https://plato.stanford.edu/entries/knowledge-analysis

So is the argument that an LLM has no knowledge because it can't reason about the justification of its replies? I could see that. A lot of justification is just "because I read it in a book I trust" or "heard it from an authority", which isn't all that dissimilar to an LLM's outcome. An LLM's response is also justified by its training, not unlike a person in a position of authority.

It doesn't challenge the idea that both a religious believer and non-believer have a certain knowledge of the world. These kinds of conflicts of knowledge are common, not just in religion, there is no single understanding of the world to reference against.

Knowledge not necessarily being correct, and language being the same thing as knowledge are big ideas! See for example:

https://en.wikipedia.org/wiki/Gettier_problem

https://en.wikipedia.org/wiki/Philosophical_Investigations

Language is mostly lies.
Not like humans are much different, especially those I have immediate access to.
But somehow people don't routinely quote Bob their co-worker in internet discussions. We know that the humans around us aren't a reliable source of knowledge, but somehow some people are quoting ChatGPT as if they were quoting an expert in the field.
> all it can say about them is pure hallucination

Could another way of putting that be, it performed a guess based on a heuristic?

> so all it can say about them is pure hallucination

This "hallucination" come along a lot recently. Is it a legit concept or just "the dog ate my homework" type of excuse for anything?

I mean, doesn't the human mind also "hallucinate" all the time? Why do we expect an "artificial" mind to outperform us?

> This "hallucination" come along a lot recently.

Couldn’t exactly be otherwise given how young GPT is. ChatGPT was released a bit under 7 months ago.

> Is it a legit concept or just "the dog ate my homework" type of excuse for anything?

It’s an analogy for how LLMs work. An LLM does not know anything, it just adds tokens probabilistically based on the previous tokens.

So essentially it always hallucinates (makes shit up as it goes along, if you prefer).

Thanks to the model it’s generally quite credible, and often even lines up with actual reality, but it should not be confused for knowledge.

That’s why it will confidently give you citations it just made up, to papers or decisions it’ll happily make up as well (though less and less credibly as things get closer to hard facts).
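To make "adds tokens probabilistically based on the previous tokens" concrete, here is a toy sketch. It is a word-level bigram sampler, which is a drastic simplification and not how an actual LLM is built; the corpus and function names are invented for the example.

```python
import random
from collections import defaultdict


def build_bigram_counts(corpus: str):
    """Count how often each word follows each other word in the corpus."""
    counts = defaultdict(lambda: defaultdict(int))
    words = corpus.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def generate(counts, start: str, length: int, rng: random.Random) -> list[str]:
    """Sample up to `length` words, each drawn in proportion to how often it followed the previous one."""
    out = [start]
    for _ in range(length - 1):
        followers = counts.get(out[-1])
        if not followers:
            break  # the last word was never followed by anything in the corpus
        choices = list(followers.keys())
        weights = list(followers.values())
        out.append(rng.choices(choices, weights=weights, k=1)[0])
    return out


if __name__ == "__main__":
    corpus = "the cat sat on the mat and the cat ate the fish"
    counts = build_bigram_counts(corpus)
    print(" ".join(generate(counts, "the", 8, random.Random())))
```

The sampler has no notion of whether its output is true; it only continues with words that were plausible next steps in its training text, which is the point the comment above is making.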

> It’s an analogy for how LLMs work. An LLM does not know anything, it just adds tokens probabilistically based on the previous tokens

This seems like a deep statement that keeps getting repeated, but it doesn't mean anything. The probabilistic model that is used to decide the next token could be arbitrarily complex, including encoding knowledge (or just asking a panel of experts).

It seems pretty self-evident that the model in fact encodes knowledge, just in a very lossy way, and recall is also flawed.

It sure does encode some knowledge, because it's a language model and languages already do so on their own. It's far from what you'd usually call a "knowledge model" though.
Which is why "hallucination" is really the wrong word to use, "confabulation" would be more proper. But "hallucination" has stuck because it's the word used back when people first figured out the trick of running image classifiers "in reverse" to generate images from noise.
Sure but nobody knows the word “confabulation”, and lying / making things up implies intent.

So “hallucination” hews close enough to have good explanatory powers.

Confabulation is unintentional, FWIW:

> In psychology, confabulation is a memory error defined as the production of fabricated, distorted, or misinterpreted memories about oneself or the world. […] Confabulation occurs when individuals mistakenly recall false information, without intending to deceive.

Yes, which is why I agree that it’s a better term. That’s not the issue.
> it just adds tokens probabilistically based on the previous tokens

I mean, isn't this what humans do all the time? Bullshitting random topics on the Internet, except humans tend to add disclaimers like "I am not a lawyer but" and stuff.

> I mean, isn't this what humans do all the time?

No? Most humans don’t randomly vomit text based on what sounds good.

> Bullshitting random topics on the Internet, except humans tend to add disclaimers like "I am not a lawyer but" and stuff.

Which shows a much higher level of understanding, both of the field (which may be flawed), and of their own understanding of the field (which they point out).

An LLM does not do that. It doesn’t just repeat potentially wrong hearsay or incorrect memories (let alone have actual understanding and knowledge of the field); it confidently writes out delusions.

> Most humans don’t randomly vomit text based on what sounds good.

Unless humans were given a task? e.g. taking exams while un-prepared.

My kid usually gives me a long description of imaginary stuff based on the name only or a brief intro. It's very fun when the real deal is finally revealed.

That's absolutely right. That said, people don't usually take exam output of unprepared students and expect it to be useful :)
"As a Language Model" is the new "I am not a lawyer"
Regarding making language, I think the human mind hallucinates not unlike GPT. Humans say a lot of stuff because they feel vaguely it is true. So does an LLM when it talks about things it’s underfitted for.

Anyways, hallucination is a term in generative AI. It means that the model produces results inconsistent with their training data. Or that’s what people say, sometimes the training data is just not that good.

> It means that the model produces results inconsistent with their training data. Or that’s what people say, sometimes the training data is just not that good.

If you ask a real person to put together an essay on an obscure topic without extensive research, I bet 80% of the content is made up "hallucinations"

Well, that's hardly surprising. Asking random people to put together essays on obscure topics without extensive research is a great way to produce essays as useful as ChatGPT's output ;)
It "came along a lot recently" because everything ChatGPT is "recently" -- it only came out 6 months ago.