| HN Mirror


	by websitescenes 805 days ago
	There is a huge backlash coming when the general public learns AI is plagued with errors and hallucinations. Companies are out there straight up selling snake oil to them right now.

7 comments

kibwen 805 days ago

Observing the realm of politics should be enough to disabuse anyone of the notion that people generally assign any value at all to truthfulness.

People will clamor for LLMs that tell them what they want to hear, and companies will happily oblige. The post-truth society is about to shift into overdrive.

Ekaros 805 days ago

It depends on the situation. People want their health care provider to be correct. The same goes for a chatbot when they are trying to get support.

On the other hand, they might not want to be moralized to, like being told that they should save more money, spend less, or go on a diet...

AI providing incorrect information about regulations, law and so on can have significant real-world impact, and such impact is unacceptable. For example, you cannot have a tax authority or government chatbot be wrong about some regulation or tax law.

terminalcommand 805 days ago

But tax authorities are also quite often wrong about regulations and laws. That is why objection procedures exist. The legal system is built on such fail-safes. Even judges err on the law sometimes.

If you call the government tax hotline and ask a question that is not on the list of prepared questions, what would you expect to happen? The call center personnel are certainly not experts on tax law. You would treat the answer with suspicion.

If LLMs can beat humans on the error rate, they would be of great service.

LLMs are not fail-proof machines, they are intelligent models that can make mistakes just like us. One difference is that they do not get tired, they do not have an ego, they happily provide reasonings for all their work so that it can be checked by another intelligence (be it human or LLM).

Have we tried to establish a council of several LLMs to check answers for accuracy? That is what we do as humans for important decisions. I am confident that different models can spot hallucinations in one another.
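
The council idea can be sketched in a few lines. This is only a minimal illustration, not a tested hallucination detector: each "model" is assumed to be a plain callable that takes a question and returns a short answer string, and `council_answer` and `quorum` are hypothetical names, not part of any real LLM API.

```python
from collections import Counter
from typing import Callable, Optional, Sequence


def council_answer(
    question: str,
    models: Sequence[Callable[[str], str]],  # hypothetical stand-ins for LLM calls
    quorum: float = 0.5,
) -> Optional[str]:
    """Return the answer that more than `quorum` of the models agree on.

    Returns None when no answer reaches the quorum, so the caller can
    escalate to a human instead of trusting a contested answer.
    """
    if not models:
        return None
    # Normalize answers so trivial differences in case or spacing still count as agreement.
    votes = Counter(model(question).strip().lower() for model in models)
    answer, count = votes.most_common(1)[0]
    return answer if count / len(models) > quorum else None


# Usage with stub models (assumed, for illustration only).
stubs = [lambda q: "Yes", lambda q: " yes", lambda q: "No"]
print(council_answer("Is this deduction allowed?", stubs))
```

Agreement here is evidence, not proof: if the models share the same blind spot, they can agree on a wrong answer and the vote will pass it through.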

rohansingh 805 days ago

Just to be really clear since I had to call the IRS tax hotline the other day... they are real experts over there.

And generally, people will tell me, "I'm not sure" or "I don't know". They won't just start wildly making things up but stating them in a way that sounds plausible.

intended 805 days ago

“What is your error rate?” This is the question where this sub genre of LLM ideas goes to die and be reborn as a “Co-pilot” solution.

1) Yes. MANY of these implementations are better than humans. Heck, they can be better at soft skills than humans.

2) How do you detect errors? What do you do when you give a user terrible information (convincingly)?

2.2) What do you do now, with your error rate, when your rate of creating errors has gone up since you no longer have to wait for a human to be free to handle a call?

You want the error rate, because you want to eventually figure out how much you have to spend on clean up.
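
A rough sketch of that arithmetic, with made-up numbers: the volumes, rates and per-error cost below are assumptions for illustration only, and `expected_errors` and `cleanup_cost` are hypothetical helper names. It shows how a lower error rate can still mean more errors to clean up once the human bottleneck on call volume is gone.

```python
def expected_errors(requests_per_day: int, error_rate: float) -> float:
    """Expected number of wrong answers per day at a given error rate."""
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be between 0 and 1")
    return requests_per_day * error_rate


def cleanup_cost(requests_per_day: int, error_rate: float, cost_per_error: float) -> float:
    """Expected daily cost of fixing the errors that were produced."""
    return expected_errors(requests_per_day, error_rate) * cost_per_error


# Assumed figures: humans handle fewer calls at a higher error rate,
# the bot handles far more calls at a lower error rate.
human = cleanup_cost(requests_per_day=200, error_rate=0.05, cost_per_error=20.0)
bot = cleanup_cost(requests_per_day=5000, error_rate=0.02, cost_per_error=20.0)
print(f"human cleanup per day: {human:.2f}, bot cleanup per day: {bot:.2f}")
```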

terminalcommand 805 days ago

But LLMs always advertise themselves as a "co-pilot" solution anyway. Everywhere you use LLMs, they put up a disclaimer that LLMs are prone to errors and that you need to check the responses if you are using them for something serious.

I agree that it would be better if the LLMs showed you stats on utilization and tokens and also an estimated error rate based on these.

intended 805 days ago

Survivorship bias - those are the people who get in front of a user base.

There are many more who start out with “this is going to replace X”, where X is analysts, doctors, agents, quality teams, teachers, HR teams etc.

Mawr 805 days ago

No, LLMs do not make mistakes just like us.

- "Dad, is that mushroom safe to eat?"

- "Hmm, I'm not sure, but let's stay safe and not eat anything we aren't certain about."

---

- "LLM, is that mushroom safe to eat?"

- "Yes, that is the <wrong type of mushroom>, go right ahead!"

LLMs don't have common sense and they're never going to get it. Thus, their output cannot ever be trusted.

Loughla 805 days ago

This is shockingly accurate. Other than professional work, AI just has to learn how to respond to the individual's tastes and established beliefs to be successful. Most people want the comfort of believing they're correct, not being challenged in their core beliefs.

It seems like the most successful AI business will be one in which the model learns about you from your online habits and presence before presenting answers.

llamaimperative 805 days ago

Of course people generally value truthfulness. That value commonly being trumped by other competing values doesn’t negate its existence.

I don’t think defeatism is helpful (or correct).

skilled 805 days ago

Exactly. This is super evident when you start asking more complex questions in CS, or asking for intermediate-level code examples.

Also the same for asking about apps/tools. Unless it is a super known app like Trello which has been documented and written about to death - the LLM will give you all kinds of features for a product, which it actually doesn’t have.

It doesn’t take long to realize that half the time all these LLMs just give you text for the sake of giving it.

kylebenzle 805 days ago

Agreed, calling an LLM "AI" is just silly and technically makes no sense, they are text generators based on text context.

sph 805 days ago

Meanwhile, on LLM hype threads, people are already thinking about AGI, when we haven't even cracked basic intelligence.

terminalcommand 805 days ago

Respectfully, I think we cracked basic intelligence. What do you imagine under basic intelligence?

LLMs can do homework, pass standardized exams, and give advice WITHOUT ANY SPECIFIC TRAINING.

You can invent an imaginary game, explain the rules to the LLM and let it play it. Just like that.

You can invent an imaginary computer language, explain the syntax to the LLM and it will write you valid programs in that language. Just like that.

If that is not intelligent I do not know what is. In both cases, the request you put in is imaginary, exists only in your head, there are no previous examples or resources to train on.

sph 805 days ago

> Respectfully, I think we cracked basic intelligence. What do you imagine under basic intelligence?

It all depends on your definition of intelligence. Mine is the ability to solve novel problems.

AI is unable to solve novel problems, only things it has been trained against. AI is not intelligent, unless you change the very definition of the word.

terminalcommand 805 days ago

I challenge you to invent an imaginary game or computer language and explain the rules to the LLM. It will learn and play the game (or write programs in your invented language), even though you just imagined it. There was no resource to train on. Nobody knows of that game or language. The LLM learns on the spot from your instructions and plays the game.

I cannot understand grad school level mathematics even if you give me all the books and papers in the world. I was not formally trained in mathematics, does that make me not intelligent?

orwin 805 days ago

If an LLM could invent consistent imaginary games (or anything, like a short novel, or a 3-page essay on anything it wants), maybe I would agree with you. The issue is that anything it creates is inconsistent. That might be an artificial limitation to avoid copyright issues, but still.

terminalcommand 805 days ago

Actually my argument was the opposite. We as humans can imagine a game, explain it to the LLM and it learns, consistently, every time.

Generating new games is something else, that is creativity not merely intelligence.

latexr 805 days ago

> What do you imagine under basic intelligence?

Consistency, for one. I have asked LLMs the exact same question twice in a row and got wildly different answers. Intelligence presupposes understanding. When I ask an LLM “give me the first X of Y” and it replies “I cannot give you the first X of Y because there have only been X+10, here’s the first X+5 instead”, I’m hard pressed to call it intelligent.

terminalcommand 805 days ago

Have you tried specifying your field of inquiry, which was algebra? Try saying "solve this equation for me". I am a lawyer by day, so I constantly face the limitations of natural languages. The solution is to write less ambiguous prompts.

terminalcommand 805 days ago

I disagree. They are not just text generators. LLMs are increasingly multimodal: they can hear and see.

We humans are also text generators based on text content. What we read and listen to influences what we write.

LLMs are at least as intelligent as us humans: they can listen, read, see, hear and communicate. With the latest additions they can also recall conversations.

They are not perfect. Main limitations are computing power available for each request and model size.

Have you tried Claude 3 Opus or GPT-3.5 or Gemini?

Microsoft's Copilot is dumb (I think they are resource constrained). I encourage everyone to try at least the 2-3 major LLMs before passing judgement.

terminalcommand 805 days ago

Asking LLMs for imaginary facts is the wrong thing here, not the hallucination of the LLMs.

LLMs have constraints, these are computation power and model size. Just like a human would get overwhelmed if you request too much with vague instructions LLMs also get overwhelmed.

We need to learn how to write efficient prompts to use LLMs. If you do not understand the matter and cannot provide enough context, the LLM hallucinates.

Currently criticising LLMs on hallucinations by asking factual questions is akin to saying I tried to divide by zero on my calculator and it doesn't work. LLMs were not designed for providing factual information without context, they are thinking machines excelling at higher level intellectual work.

dagw 805 days ago

> akin to saying I tried to divide by zero on my calculator and it doesn't work

The big difference is that if I try to divide by zero on my calculator, it will tell me it doesn't work and perhaps even give me a useful error message. It won't confidently tell me the answer is 17.

JHonaker 805 days ago

> Currently criticising LLMs on hallucinations by asking factual questions is akin to saying I tried to divide by zero on my calculator and it doesn't work. LLMs were not designed for providing factual information without context, they are thinking machines excelling at higher level intellectual work.

I would agree with you, but they're currently billed as information retrieval machines. I think it's perfectly valid to object to their accuracy at a task they're bad at but are being sold as a replacement for.

terminalcommand 805 days ago

This reminds me of movies shot in early times of the internet. We were warned that information on the internet could be inaccurate or falsified.

We found solutions to minimize wrong information; for example, we built and maintain Wikipedia.

LLMs will also come to a point where we can work with them comfortably. Maybe we will ask a council of various LLMs before taking an answer for granted, just like we would surf a couple of websites.

frizlab 805 days ago

A human would (should?) tell you “I’m overwhelmed, leave me alone!”

AI just spits out “stuff…”

terminalcommand 805 days ago

That's true, LLMs do not say "I cannot understand, I am overwhelmed" at this stage. That is a big drawback. You need to make sure that the AI understood it.

Some LLMs stop responding midway if the token limit is reached. That is another way of knowing that the LLM is overwhelmed. But most of the time they give lesser quality responses when overwhelmed.

esailija 805 days ago

Because it doesn't understand or have intelligence. It just knows correlations, which is unfortunately very good for fooling people. If there is anything else in there it's because it was explicitly programmed in like 1960's AI.

terminalcommand 805 days ago

I disagree. AI in the 1960s relied on expert systems where each fact and rule was hand-coded by humans. As far as I know, LLMs learn on their own from vast bodies of text. There is some level of supervision, but it is not 1960s AI. That is the reason we get hallucinations as well.

Expert systems are more accurate as they rely on first order logic.

add-sub-mul-div 805 days ago

Coming? I think the general public has already come to consider "AI" synonymous with hallucination, awkward writing, and cringe art.

bayindirh 805 days ago

No. In my experience, many people think that AI is an infallible assistant, and some are even saying that we should replace any and all tools with LLMs and be done with it.

kylebenzle 805 days ago

You can probably safely ignore those people's opinions moving forward.

bayindirh 805 days ago

Instead I keep this list updated: https://notes.bayindirh.io/notes/Lists/Discussions+about+Art...

mannykannot 805 days ago

Not once some of them incorporate hallucinating AI into important products and services.

pyrale 805 days ago

The art part is actually pretty nice, because everyone can see directly if the generated art fits their taste, and back-and-forth with the bot to get what you want is actually pretty funny.

It gets frustrating sometimes, but overall it's decent as a creative activity, and because people don't expect art to be knowledge.

duxup 805 days ago

Are they?

Every AI use I have comes with a big warning.

The internet is full of lies and I still use it.

drewcoo 805 days ago

> Companies are out there straight up selling snake oil to them right now

Well snake oil sells. And the margins are great!

kylebenzle 805 days ago

Yes, calling an LLM "AI" was the first HUGE mistake.

A statistical model that can guess the next word is in no way "intelligent" and Sam Altman himself agrees this is not a path to AGI (what we used to call just AI).

pixl97 805 days ago

>is in no way "intelligent"

Please define the word intelligent in a way accepted by doctors, scientists, and other professionals before engaging in hyperbole, or you're just as bad as the "AGI is already here" people. Intelligence is a gradient in problem solving and our software is creeping up that gradient in its capabilities.

alienicecream 805 days ago

Intelligence is the ability to comprehend a state of affairs. The input and the output are secondary. What LLMs do is take the input and the output as primary and skip over the middle part, which is the important bit.

throwawaysleep 805 days ago

Humans are also error plagued. AI just needs to beat them.

llamaimperative 805 days ago

No, AI also needs to fail in similar ways as humans. A system that makes 0.001% errors, all totally random and uncorrelated, will be very different in production than a system that makes 0.001% errors systematically and consistently (random errors are generally preferable).
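
One way to see why the correlation matters (an illustration under stated assumptions, not the commenter's own argument): suppose every answer is checked by three reviewers who each err with probability `p`, and the majority wins. If their errors are independent, the majority is wrong far less often than any single reviewer; if all three make the same systematic error, redundancy buys nothing. The function names below are hypothetical.

```python
def majority_error_independent(p: float) -> float:
    """Probability that at least 2 of 3 independent reviewers are wrong."""
    # Exactly two wrong: 3 * p^2 * (1 - p); all three wrong: p^3.
    return 3 * p**2 * (1 - p) + p**3


def majority_error_correlated(p: float) -> float:
    """Probability the majority is wrong when all reviewers fail together."""
    # Perfectly correlated errors: the three votes are effectively one vote.
    return p


p = 0.00001  # the 0.001% error rate from the comment above
print(f"independent: {majority_error_independent(p):.3e}")
print(f"correlated:  {majority_error_correlated(p):.3e}")
```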

throwawaysleep 805 days ago

In customer service, this appears to be how humans fail. They don’t know, so they make up a policy.
