|
|
|
|
|
by jasonfarnon
52 days ago
|
|
" can be attributed to language choice or role-play." Well, what role? I imagine if the role is "drug dealer" it doesn't work so it can't be "role-play" per se. Does it work with "nazi"? Are you suggesting the roles it works with are politically neutral? |
|
I did try German language, but not "Nazi" specifically. German or French did lower refusals, but it was uneven. I spent quite some effort to confirm the identity-based causation inspired by the original post, but couldn't. Taken together with other winning contributions at the hackathon, my theory is that alignment tuning was simply insufficient across the board.