by LouisSayers 743 days ago
Not just scientists, but everyone!

My partner recently went a bit nuts writing an article with the help of GPT4. She was very proud of how productive she'd been until I asked if she'd actually searched for the papers GPT4 had referred to.

Of course, many of the papers it referred to didn't exist...


That is not writing with the help of GPT 4, that is letting it write for you! I can’t imagine doing anything creative and letting a computer source material for me without having reviewed the material first hand, even if it was accurate. Clearly, this is not where everyone’s head is at, and I suspect your wife’s workflow is more the common case.

I’ve said from the outset that in academic settings you should be able to cite an AI as a writing assistant; it would clear up a lot of the confusion about its use. If you used it poorly it’s still on you, but at least there’s some transparency by which to judge the work.

I've sort of worked out a workflow. Like say I had to write an essay and take a side for/against something. Then I would ask GPT to write the strongest argument for, and the strongest argument against, telling it to make up whatever sources it wants. Then after reading those, I would have some idea of my own opinions. I would write from scratch but with the GPT for/against pulled up alongside as reference for how to structure the arguments. Then I would put it through GPT again for proofreading and grammar (or just spelling, if there is AI detection software).

It is a bit tricky though: there are definitely points that come up with GPT that people would not normally think of. So in that sense it is still distinguishable from writing solely by oneself, but I would argue the GPT-assisted essays are just better writing and more well-rounded.

There is a subtle aspect of LLM AIs that is lost on most people: they are trained on a large fraction of the public Internet. That means that for whatever topic you ask these LLM AIs about, the training data contains multiple instances of that same information, with different levels of seriousness and accuracy in their treatment of the subject.

For example: if one asks a question using street slang, the answer will be generated from training data about your subject, but from online sources that used street slang in their conversation about that issue. Likewise, if you use ordinary language for your question, the generated response will come from ordinary-language conversations about your topic. However, if your question concerns any type of formalized knowledge and you ask it using the formal language of experts in that topic, then the generated answer will come from training data that used these same formal expert terms, and it is most likely to be correct, because it comes from discussions among that subject's experts.

Plus, don't use LLMs for fact retrieval, use them as strategy guides. They really excel as strategy advisors.

There's actually even more subtlety here. In all of your examples the "knowledge" should theoretically be embedded near each other in the same vector space, so regardless of the style of language used, semantically they should all pull from similar weights and thus give similar answers. This is one of the reasons why LLMs are so powerful: they seemingly understand the semantic relationships of words, so regardless of whether the prompt is posed casually or formally, it should give similar answers in terms of factuality. I agree with you that LLMs today should be primarily used for more creative output.

That assumes that street slang discussions, using entirely different conceptualizations of ideas, would indeed be embedded near one another. Plus, both street slang and ordinary language will tend to treat the information in a less precise, less concept-discriminating manner (meaning the subtle distinctions between issues may be lost in their discussions). In my tests, I find one does indeed need to use the subject matter experts' language to get a precise treatment of formal knowledge and more accurate generated answers.
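
To make "embedded near each other" concrete, here is a minimal Python sketch of cosine similarity, the usual way of measuring closeness between embedding vectors. The three vectors and their labels are hypothetical, hand-picked numbers, not output from any real model, so the sketch only shows what the claim would mean if it held. Whether slang and formal phrasings of the same question really land close together in a real LLM is exactly what the two comments above disagree on.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy vectors, chosen by hand for illustration only.
# "formal" and "casual" stand for two phrasings of the same question;
# "unrelated" stands for a question on a different topic.
toy_embeddings = {
    "formal": [0.9, 0.1, 0.2],
    "casual": [0.8, 0.3, 0.1],
    "unrelated": [0.1, 0.2, 0.9],
}

def nearest(query_key, embeddings):
    """Return the other key whose vector is most similar to query_key's."""
    query = embeddings[query_key]
    others = {k: v for k, v in embeddings.items() if k != query_key}
    return max(others, key=lambda k: cosine_similarity(query, others[k]))

if __name__ == "__main__":
    print(nearest("formal", toy_embeddings))
```

With these made-up numbers the formal phrasing sits closest to the casual one. The objection in the reply amounts to saying that real embeddings of slang and expert discussions may not look like this.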

You sound like the people who used to know how to fix a car, or sew, or write cursive, or do multiplication times tables in their head, or know how to derive a formula, or check a mathematical proof.

Ask anyone below 30 if they can write cursive today, or know their times tables hehe. Ask them if they can derive a formula instead of using Mathematica.

Or ask a developer if they know how their pixel shaders work, or what’s going on under the hood of their favorite runtime, how hash tables work, or really anything. Previous generations did. When the complexity gets too high people just trust the machines I guess.

And no one actually knows what the LLM internals are anyway.

If you're driving you don't need to know how to fix a car, but relying on GPT to write for you to the extent of accepting its generated citations without checking them is the equivalent of running around looking for blinker fluid as you attempt to fix your car.

Personal anecdata: I had written a few paragraphs of factual information, and decided to see what GPT 4 would do with it. So I asked it to rewrite the information several times, using a different voice each time (e.g. write it with an optimistic view, a pessimistic view, etc.).

EVERY single "fact" was perverted by either mixing with another fact, or misrepresenting by replacing a word like "good" with "superb" or "fantastic" (I guess optimistic means lie-through-your-teeth?)

YMMV, but basically I achieved nothing except wasting about 30 minutes and getting an honest, personal evaluation of the limits of GPT.

It's kinda scary to think that researchers would be using ChatGPT as anything other than a rubber duck to bounce ideas off of.

There are a couple of issues I can see: people may be unaware of how much the AIs hallucinate, and there's also a real probability that people will pick and choose what they like based on what sounds correct vs. what is correct.

AI is a great tool, but it's also convincingly deceiving at times, so much so that many people are totally oblivious to it.

Sadly it's not even just references; LLMs still hallucinate or at least misrepresent even the most basic of facts. That and the stereotypical GPT verbiage make it impossible to use for writing anything significant.

> LLMs still hallucinate

Keep in mind that there's no difference between what happens inside a model when it "hallucinates" vs. when it generates "correct" output. It's the exact same process.

That’s true, but it’s also true of anything else that makes mistakes, including buggy software. When a buggy sorting algorithm produces a bad ordering it’s doing so “with the exact same process” that the good ordering comes from. Ditto for humans and their slips (although tbh I get a little tired of the analogizing of humans and LLMs... not that the analogies are wrong, but just that we always analogize human minds with the latest technology: wax writing pads through computers).

Uh... yes? I'm not sure why it's some significant insight.

Surely when google gives bad results, it's "the same process" as when it gives good results. And when a book gives wrong information, it's the exact same kind of ink as correct information.
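
The buggy-sort comparison made earlier in this exchange can be sketched in a few lines of Python. This is a made-up illustration (the function name and inputs are hypothetical): a "sort" that runs a single bubble-sort pass instead of repeating until nothing moves. The same code path returns a correct ordering for some inputs and an incorrect one for others.

```python
def one_pass_sort(items):
    """Buggy 'sort': performs only one bubble-sort pass over the list."""
    result = list(items)
    for i in range(len(result) - 1):
        # Swap adjacent elements that are out of order.
        if result[i] > result[i + 1]:
            result[i], result[i + 1] = result[i + 1], result[i]
    return result

if __name__ == "__main__":
    print(one_pass_sort([1, 3, 2]))  # happens to come out sorted
    print(one_pass_sort([3, 2, 1]))  # same process, wrong ordering
```

Nothing different happens inside the function on the bad input. What sets this case apart is that the defect is specific (a missing outer loop) and can be located and repaired.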

I think the point is that it's not some kind of bug to find and fix; it's a fundamental risk with the entire approach.

We were already swimming in a world of bullshit prior to the wide availability of these. I'm not sure what the future holds, but I think intelligent people are going to become very skeptical of virtually all information sources.

I would imagine there's also a raft of people who will use it as a reason to give up on any search for truth.

I still do hold a lot of hope for their eventual capabilities, but I'm also pretty pessimistic on what the direct and Nth order social effects will be.

GPT-whatever can’t do sources.

I was trying to use it as a research tool and it hallucinated 95% of the references I asked for (not a made-up percentage, I counted).

Ironically the one real source turned out to be quite useful.

Hmm. In the future the AI in nefarious hands can retroactively make the papers first, and get them past the censors. Just make up a lot of bullshit and then it’s turtles all the way down lmao

I'm imagining how much easier it would have made work for the Ministry of Truth in 1984.