by kirsebaer 1244 days ago
ChatGPT often makes up facts. It outputs stuff that looks like it could have been written by a human, not stuff that is correct.

Don’t use ChatGPT for medical research.

7 comments

These arguments are just like the old days when Wikipedia showed up. Don't miss the forest for the trees. ChatGPT is a huge threat to Google and a bunch of other industries.
Not comparable. Wikipedia has always had a strict policy on citing sources. ChatGPT can't cite sources by design, because its answers are based on synthesis.
Not true. The verifiability policy only really came into effect in 2006 (https://lists.wikimedia.org/pipermail/wikien-l/2006-July/050...) - five years after Wikipedia started.
It wouldn't be too hard to program at least GPT-3 to take a ChatGPT answer, search Google, scrape the results, verify whether the answer was factual, and maybe give it a score or rating for factualness.
If it’s that easy, why don’t you do it? You’ll be printing money.
Simple answer: ADHD. If I could ship a product I'd probably already be rich; instead I'm scraping by as a freelance dev. Though ChatGPT could probably help me code it anyway, lol.
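For what it's worth, the scoring step could be sketched roughly like this. This is only an illustration under loud assumptions: the search and scraping are assumed to have already happened (the `snippets` list is supplied by the caller), `support_score` and its `threshold` are made-up names, and word overlap is a crude stand-in for real verification, since a snippet that shares words with an answer does not prove the answer is true.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def support_score(answer: str, snippets: list[str], threshold: float = 0.8) -> float:
    """Fraction of snippets that contain at least `threshold` of the answer's words.

    Hypothetical scoring rule: it measures lexical overlap, not truth.
    """
    answer_words = tokens(answer)
    if not answer_words or not snippets:
        return 0.0
    supporting = 0
    for snippet in snippets:
        overlap = len(answer_words & tokens(snippet)) / len(answer_words)
        if overlap >= threshold:
            supporting += 1
    return supporting / len(snippets)


if __name__ == "__main__":
    answer = "Paris is the capital of France"
    snippets = [
        "Paris is the capital and largest city of France.",
        "Berlin is the capital of Germany.",
    ]
    print(support_score(answer, snippets))
```

The hard parts (picking sources worth trusting, handling paraphrase and negation) are exactly what this sketch leaves out.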
Absolutely the case, but also people make up stuff online all the time, so Google has this exact same problem.
No. Google gives you the source. ChatGPT does not.
It’s funny because when I was in high school the argument was always “books, published articles and other print media are actual source material, Google doesn’t give you that”
[0] scholar.google.com

Google gives you sources, determining reputability is your task.

You think Google does not provide results from books, published articles, etc? Really?
Probably not when OP was in high school, if they were still using books over web tools. I'm guessing before 2004? How old is Google Scholar?
Google Books is from 2004, but I don't remember seeing it in search results until the 2010s.
At least with Google you have sources you can trust more than others, whereas ChatGPT is a black box.
I clearly said we use Google to confirm the AI output.

And we also do not do medical stuff. I just used that as an example.

> Don’t use ChatGPT for medical research.

Or Google. There are plenty of pages out there that (e.g.) claim that Alzheimer's is caused by drinking out of aluminum cans, or that the world is controlled by grey aliens from Zeta Reticuli.

… you know Google provides the URL, right? With Google it is very easy to tell if the information is coming from NIH or Infowars/forums/etc.
> ChatGPT often makes up facts.

As opposed to... Google? Your doctor? My doctor?

Absolutely as opposed to those things. With Google, if you use a reliable source like Mayo, NIH, or even WebMD, it is clearly more likely to have accurate information than something that proves even numbers are prime. Certainly all those things can be inaccurate, but where in the world do you think ChatGPT pattern-matches its information from?
Exactly. ChatGPT is clearly very impressive and useful, but nothing from its output should be treated as valid or factual to any degree.

Information generated by humans will include things like transpositional errors, logical errors, popular misconceptions, and misinterpretations of data. Mistakes happen, but human mistakes are at least tethered to real thoughts/information.

On the other hand, AI will happily spin up a complete fabrication with zero basis in reality, give you as much detail as you ask for, and dress it all up in competent and authoritative-sounding prose. It will have all the style of a textbook answer, while the substance will be pure nonsense.

Still a great tool, but only with the caveat that you approach it with the mindset that it's actively set out to catch you off guard and deceive you.

> AI will happily spin up a complete fabrication with zero basis in reality, give you as much detail as you ask for, and dress it all up in competent and authoritative-sounding prose.

Sure. What makes you think a human won't?

I didn't say a human wouldn't. I said a human wouldn't typically do it by mistake.
And how hard would it be for ChatGPT to be retrained on peer-reviewed medical journals? ChatMD-GPT, if you will.
The majority of articles in peer-reviewed medical journals are also false.

https://doi.org/10.1371/journal.pmed.1004085

You can't take such articles seriously unless they have been independently reproduced multiple times. So, your hypothetical "ChatMD-GPT" would have to also filter on that basis and perhaps calculate some sort of confidence level.

And it has likely already been trained on correct information, and yet it produces bad results. It has certainly been trained on data that explains what prime numbers are, and yet it produces what it produces, whereas using Google and hitting a credible source directly is more accurate and efficient.
Isn't there a medical ChatGPT that passed the medical licensing exam? I thought I saw that come up.
Let's say ChatGPT gives you false information 50% of the time. It is still useful.

Just like it is harder to find prime numbers than to verify that a number is prime, it is harder to dig up potential tidbits of information than to verify that a piece of information handed to you is true.
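To make the analogy concrete, here is a minimal Python sketch. The names `is_prime` and `next_prime` are illustrative, and trial division is used only because it is the simplest check to write. Finding a prime means running the verifier on one candidate after another, while verifying a prime someone hands you is a single call.

```python
from math import isqrt


def is_prime(n: int) -> bool:
    """Verify primality by trial division up to the integer square root."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    for d in range(3, isqrt(n) + 1, 2):
        if n % d == 0:
            return False
    return True


def next_prime(start: int) -> tuple[int, int]:
    """Return (first prime >= start, number of candidates checked)."""
    checked = 0
    candidate = start
    while True:
        checked += 1
        if is_prime(candidate):
            return candidate, checked
        candidate += 1


if __name__ == "__main__":
    print(is_prime(97))    # one verification of a number handed to you
    print(next_prime(90))  # a search that repeats the verification per candidate
```

Starting the search at 90 takes eight verification calls before it lands on 97, whereas confirming 97 directly takes one.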

50% is still useful? A broken watch is useful in that sense as well, I guess. I can only see that as useful if you don’t include efficiency in the definition of useful.
Like the comment said, if it's cheap (time, effort, etc.) to reliably verify the answer, the success rate doesn't really matter.
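A back-of-the-envelope version of that argument, as a sketch under stated assumptions: each answer is independently correct with probability p, and verification always catches a wrong answer. The number of attempts until a verified answer is then geometric with mean 1/p, so the expected total cost is (cost to ask + cost to verify) / p. The function name and cost units below are hypothetical.

```python
def expected_cost(p_correct: float, cost_ask: float, cost_verify: float) -> float:
    """Expected total cost of asking and verifying until one answer checks out.

    Assumes independent attempts and a verifier that never passes a wrong answer.
    """
    if not 0 < p_correct <= 1:
        raise ValueError("p_correct must be in (0, 1]")
    expected_attempts = 1 / p_correct  # mean of a geometric distribution
    return (cost_ask + cost_verify) * expected_attempts


if __name__ == "__main__":
    # Hypothetical units: 1 minute to ask, 2 minutes to verify.
    print(expected_cost(1.0, 1.0, 2.0))  # perfect answers
    print(expected_cost(0.5, 1.0, 2.0))  # wrong half the time
```

Under these assumptions a 50% hit rate doubles the cost compared with a perfect source; the result only holds if the verification step actually happens.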
Your prime number analogy doesn't hold water because the average person doesn't verify. Being wrong half the time has potential for serious damage.