| HN Mirror

by notahacker 1051 days ago

> Even with the same temperature, you'll see any marketing-style prompt for chat begin with "Introducing XYZ..." around 30% of the time as if it's a junior door to door salesman, whereas the foundational model doesn't have any single intro that common across runs and generally employs a much broader vocabulary set.

I think this is less a problem with paranoia about "safety" and avoiding bad PR specifically, and more a fundamental problem with overfitting to human feedback.

The same training approach that makes GPT-4 more consistently adequate at certain types of problem (useful for chatbots that can break down coding questions or write in iambic pentameter, as well as for ones that avoid being 'Sydney') also makes it less "creative" in other domains.

And there's an "alignment problem" here: the people judging which responses best fit "marketing" prompts aren't experienced copywriters evaluating them for understanding of the product, consistency with brand tone and A/B-tested conversion rates. They're low-paid ESL speakers and people playing with the interface, approving the cheesiness because the response with "Introducing XYZ... Buy XYZ today!" sure looks like the requested ad for XYZ. So you get a response conditioned on "summarise in a way that looks maximally like an ad" rather than on "summarise in a way which clearly articulates the benefits of the listed features in a tone appropriate to the target market".