Hacker News new | ask | show | jobs
by mk_stjames 793 days ago
I also believe that OpenAI's RLHF process highly biased the model towards producing that sort of specific verbose, padded-out 'chatGPT speak' that we are seeing. The RLHF fine tuning process took outputs from the instruction-tuned model and A/B tested variations with human workers who may or may not have had the same feelings towards writing quality as many of us.

The process resulted in those verbose, interjection-laden responses that we see now, because that type of response was deemed 'better' (thumbs-up'd more) than the shorter, more-direct-but-less-impressive-sounding responses.

1 comments

My chats usually start with:

>Hi again :wave:.

>As always I prefer terse replies.

>Let's <context of activity>

><First question>

I usually then get short answers and then I query for more info if required.

Set a custom instruction (I think only a ChatGPT paid option?). But $20 a month or whatever is easily worth it for the utility it provides.
I use it for work so much I’d gladly pay way more than $20 just for access to GPT-4. It’s pretty terrible at programming but it still saves me loads of time generating the easy functions.

Anything remotely complex I still do by hand. But holy shit its nice having something do my boilerplate.