Hacker News
by Auracle 695 days ago
Ever since the step(s) beyond ChatGPT 3.5 I haven't noticed any huge errors like that, personally. Are you sure you were on a new model?

Also, how can you trust anyone? People are wrong. Teachers can be wrong. Web pages can be wrong. Books can be wrong. I think LLMs will probably soon be the least likely to be wrong out of any of those.

3 comments

I just asked ChatGPT: "comparing 9.9 and 9.11, which is larger?"

and it responded:

9.11 is larger than 9.9.

When comparing these two numbers:

    9.9 can be written as 9.90 to have the same number of decimal places.
    9.11 remains 9.11.
Comparing digit by digit:

    The integer part (9) is the same for both.
    The first decimal place (9 vs. 1): 9 is larger.
    The second decimal place (0 vs. 1): 1 is larger, which makes 9.11 larger overall.
So, 9.11 > 9.9.
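For reference, the padding approach in the quoted answer is fine; it just draws the wrong conclusion at the end. Once the numbers are padded to 9.90 and 9.11, the first decimal place that differs decides the result, and there 9 beats 1. Here is a minimal Python sketch of that check, assuming non-negative inputs given as strings with a decimal point. The function names `compare_decimals` and `compare_digit_by_digit` are made up for illustration.

```python
from decimal import Decimal


def compare_decimals(a: str, b: str) -> str:
    """Return the larger of two decimal strings, compared by numeric value."""
    return a if Decimal(a) >= Decimal(b) else b


def compare_digit_by_digit(a: str, b: str) -> str:
    """Pad the fractional parts to equal length, then compare place by place.

    Assumes non-negative numbers written with a decimal point, e.g. "9.9".
    """
    a_int, a_frac = a.split(".")
    b_int, b_frac = b.split(".")

    # Pad with trailing zeros so both have the same number of decimal places.
    width = max(len(a_frac), len(b_frac))
    a_frac = a_frac.ljust(width, "0")
    b_frac = b_frac.ljust(width, "0")

    # The integer part decides first.
    if int(a_int) != int(b_int):
        return a if int(a_int) > int(b_int) else b

    # Otherwise the FIRST differing decimal place decides; later places
    # cannot overturn it.
    for digit_a, digit_b in zip(a_frac, b_frac):
        if digit_a != digit_b:
            return a if digit_a > digit_b else b

    return a  # The two values are equal.


print(compare_decimals("9.9", "9.11"))
print(compare_digit_by_digit("9.9", "9.11"))
```

Both calls pick 9.9, because the comparison stops at the first decimal place (9 vs. 1) and never gets to the second.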
He's an AI. He's biased towards sorting algorithms :-)
I should have put a big asterisk and mentioned that my comment didn't apply to straight-up math.
My dad, a lawyer, has been trying to use GPT-4o to help write legal documents. He says the documents are well written and convincing, but the cases 4o cites in support are, more often than not, completely made up.
Yeah, exactly this. GPT-4o very rarely, if ever, hallucinates.
A very easy way to get basically every current AI model to hallucinate:

1. Ask a highly non-trivial research question (in particular from math)

2. Ask the AI for paper and textbook references on the topic

At this point, many of these references may already be hallucinations.

3. If necessary, ask the AI where in these papers/textbooks you can find explanations of the question, and/or which aspect of the question or research area each reference focuses on.

This backs up what I mentioned in my other comment. My dad, an attorney, purchased both GPT-4o and Gemini Advanced to help write legal documents, which involves citing other legal cases. He says the cases both models cite are almost always completely fabricated.