by katzenversteher 491 days ago
I bet a token like "sht!", "f*" or "damn!" would have the same or even stronger effect, but the LLM creators would not want users to read them.
3 comments

It's literally in the article: they measured it, and "wait" was the best token.
Maybe, but it doesn't just use it to signify that it's made a mistake. It also uses it in a positive way, such as when it's had a lightbulb moment. Of course some people use expletives in the same way, but that would be less common than for mistakes.
I think you're onto something. However, as the training is done on text and not actual thoughts, it may take some experimentation to find these stronger words.