by Reventlov 703 days ago
What is this paper? The tested sentences are full of errors…
4 comments

From the linked paper:

> Phrases were designed to mimic the writing style of elementary school students, including typical spelling errors observed at that age (Quinn, 2020).

Likely reflective of user-generated content on Reddit, etc.
Well, the paper is from a university in Italy. The fact they're writing research papers in English is a benefit to English speakers everywhere, including us :)
What, you don't like fotbal?
Seems like how a user who learned English on Twitter would speak:

football -> fotbal

cousin -> cosin

teacher -> teachr

You can’t fit everything in 160 chars.

That's a lame excuse. I grew up pre-Twitter, sending 160-char SMS messages on dumb phones.
Often, those are ligatures that look like typos when rendered in another font. Too lazy to proofread here, but keep it in mind as a possibility.
In reality, the paper's authors probably understood that the OpenAI team artificially readjusted these biases through RLHF and that there is nothing to find there, except that it still works when the words are written with typos, because no manual examples of “redressing biases” were provided with such typos.
If they were so clever about it, surely they would have taken pride in mentioning this in the study.
It could also be from an ESL user -- fotbal is the Czech and Romanian word for football (Slovak has the similar futbal), borrowed from the English.