Hacker News new | ask | show | jobs
by PeterisP 669 days ago
I don't want to look for the source of analysis right now, but I recall reading a study demonstrating that a large part, if not most of the word frequency shift was caused by RLHF training done on data predominantly generated by people hired from lower income English-speaking countries which simply have a different dialect of English with a noticeably different frequency of certain phrases and expressions, so e.g. at least some versions of ChatGPT got RLHF-trained to speak more in a Nigerian English dialect.

Since there isn't a single English (English learners generally get informed about the choice of UK vs US English only, but most English is spoken outside of UK and USA in other places and other dialects), but multiple different Englishes, any English speaker will probably find something to be surprised by, and there is an economic incentive to get data from people other than the relatively expensive native speakers of UK or USA English.

2 comments

There wasn't a study or analysis. It was just lazy speculation that felt good because it could be bound up in a "evil white countries exploiting the developing world" narrative. Where exploiting was "paying to do a job".

It was submitted as https://news.ycombinator.com/item?id=40623629

Again, there is effectively zero real data showing this. Further, RLHF isn't likely to reinforce such word selection regardless.

A more logical, likely scenario is that training data is biased heavily towards higher grade level material, so word selection veers towards writings that you find in those realms.

> It was just lazy speculation that felt good because it could be bound up in a "evil white countries exploiting the developing world" narrative. Where exploiting was "paying to do a job".

Exploitation like that is in fact happening (see pretty much everything having to do with social media content moderation and RLHF to avoid disturbing content.

Also "paying to do a job" is not the moral panacea you seem to think it is.

tinfoil had theory: they implanted watermarks already, so that AI generated text can be flagged for future training runs or as a service, such that some phrases are coaxed to become statistical beacons.
That's not really a tinfoil hat theory. That's been possible for some years and OpenAI reportedly does watermark their outputs, and can detect it. They just haven't released it as a service because it'd annoy all the users who are using it for cheating :)
I believe that if that was possible to do on purpose, they wouldn’t have so much trouble preventing the LLMs from talking about things they shouldn’t.
Yeah I would like to see some evidence of this too. It's just asserted as truth in the article. Delve doesn't seem like a particularly unusual word to me, especially in the context of scientific abstracts, and LLMs could totally learn random weird things. How common is "it's important to remember" in Nigeria?
Wait, why wouldn’t RLHF influence word choices?
I didn't say it wouldn't (or rather couldn't), I said it was unlikely for the selected hypothesis given standard training data vs RLHF iterations.
then again, most history consists of whitewashing back when northern countries were exploiting everywhere else in various ways: imperialism, colonialism, neocolonialism, capitalism, financialization,...

typical people prefer to pretend this is simply "order" and "progress"; seemingly blind to their own ideological baggage like fish in water

Yeah, right. ChatGPT was trained on Pidgin English dialect.

Have a look at BBC translation to get a taste, and tell me its not hoax: https://www.bbc.com/pidgin