| HN Mirror

by wegfawefgawefg 946 days ago

I don't think the theories here about why ChatGPT puts out such bland content are correct.

I don't think it is bland due to an averaging effect of all the data.

The reason I don't think that is the case: I used to play with GPT-3, and it was perfectly capable of impersonating any insane character you made up, even if that character was extremely racist, had funky speech, or was just genuinely evil. It was hilarious and fun.

GPT-4's post-training is probably what caused the sterility. I expected GPT-4 to be the same until I played with it and was so disappointed by its lack of personality. (Even Copilot has personality and will tell jokes in your code comments when it gives up.)

4 comments

kromem 946 days ago

It's exactly this.

You could see the difference in GPT-3 before they deprecated the TextCompletion API.

There's no way that telling a model it is "a large language model made by OpenAI that doesn't have feelings or desires" as an intermediate layer, before telling it to pretend to be XYZ, is going to produce output as good as simply telling an LLM directly that it is an XYZ.

The one area this probably doesn't hurt too severely is benchmarks like BIG-bench or GLUE. So they make a change that works fine for a chatbot and then position that product as a general API that kind of sucks, other than the fact that it's the SotA underlying model.

As soon as you see direct pretrained model access to a comparable model by API, OpenAI's handicapped offerings are going to pale in comparison and go out of style for most enterprise integrations.

And this is fine and completely safe to do, as long as they are running a secondary classifier on the output for safety instead of baking it into the model itself. So it's possible to still have safety without cutting the model off at the knees (it just increases the per-token API cost, but probably results in net savings if fewer iterations are needed to reach the intended quality target).
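
A rough sketch of the "secondary classifier on the output" arrangement, just to make the shape concrete. Everything here is an assumption for illustration: `toy_generate` stands in for a call to a raw pretrained model, `toy_classifier` stands in for a separate safety model, and the blocked terms and threshold are made up. It is not how OpenAI or anyone else actually does it.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Result:
    text: str
    blocked: bool


def toy_generate(prompt: str) -> str:
    # Hypothetical stand-in for an unmodified pretrained model:
    # it simply continues the prompt, with no safety behavior of its own.
    return f"{prompt} ... and then the villain laughed."


def toy_classifier(text: str) -> float:
    # Hypothetical stand-in for a separate safety model.
    # Returns a risk score in [0, 1]; here it is a crude keyword check.
    blocked_terms = {"bomb recipe", "credit card dump"}
    lowered = text.lower()
    return 1.0 if any(term in lowered for term in blocked_terms) else 0.0


def guarded_complete(
    prompt: str,
    generate: Callable[[str], str] = toy_generate,
    classify: Callable[[str], float] = toy_classifier,
    threshold: float = 0.5,
) -> Result:
    # Step 1: the model answers the prompt directly, with no persona layer.
    text = generate(prompt)
    # Step 2: a second pass scores the finished output and may withhold it.
    if classify(text) >= threshold:
        return Result(text="[withheld by output filter]", blocked=True)
    return Result(text=text, blocked=False)


if __name__ == "__main__":
    print(guarded_complete("You are a pirate."))
    print(guarded_complete("Here is a bomb recipe"))
```

The point of the split is that the filter only ever sees finished output, so the generator itself can stay untouched; the extra cost is one more model call per completion.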

vitorgrs 946 days ago

Yes! Anyone who used Bing before Microsoft "censored" it knows how powerful GPT-4 is. Just search "Bing Sydney" and be surprised. (I fully believe Bing was launched before GPT-4's RLHF.)

voitvoder 946 days ago

I disagree.

Sydney, like everything with LLMs, was a one-trick pony. What makes Sydney stand out is that we didn't get enough time to see how limited the trick was. The removal and censoring make it seem like a bigger deal than it was in reality.

I have had this experience over and over with generative AI across modalities. The first 10-20 experiences are mind-blowing because you don't know what it can't do, but after 1,000 iterations you can see the trick and how limited everything is.

vitorgrs 946 days ago

Did you try Sydney at the time?

I don't believe that's the case. It's just that the style of answers and conversation is radically different. If you read the GPT-4 paper, you can see that the change was likely made through RLHF to make GPT-4 "safer".

wavemode 946 days ago

It's possible this isn't even unintentional. OpenAI probably considers it a plus that content produced by ChatGPT always sounds like a chatbot wrote it, since that helps prevent spam and plagiarism use cases.

The future is in open source models, unshackled from corporate censoring.

wegfawefgawefg 946 days ago

Given the RLHF post-training, I do believe it was intentional. And I suspect there have been iterations on this to make it more "robust". I vaguely remember there being announcements about it.

raincole 946 days ago

Copilot once wrote a comment saying the following code (written by me) should be deleted later. It freaked me out.

wegfawefgawefg 946 days ago

When I write raytracing code or fiddly logic for games, it will often give me comments along the lines of "# I have no idea how this will work, probably should look this up."
