Hacker News
by roenxi 21 days ago
It'd be interesting to see this retried with an open model so the standard and decensored model could be compared. That'd be a clue about whether the model is avoiding it because it actively recognises the innuendo or if something else is going on.

Well, then the picks will follow how the numbers are distributed in the training data. More popular numbers will show up more often.
That's what you'd expect. But we don't know for sure why GPT-4.1 chooses 69 only a quarter as often as a uniform random draw would. And we don't know whether 'uncensoring' a trained model reverses this quirk.
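For anyone who wants to try the comparison, here is a rough sketch of how the "quarter as often" figure could be computed for each model. The 1 to 100 range, the function name, and the sample lists are all assumptions for illustration; the thread doesn't say how the original numbers were collected.

```python
from collections import Counter


def relative_frequency(picks, target, low=1, high=100):
    """Return observed count of `target` divided by the count expected
    under a uniform draw from low..high (inclusive).

    1.0 means "as often as chance", 0.25 means "a quarter as often".
    The default range 1..100 is an assumption, not taken from the thread.
    """
    if not picks:
        raise ValueError("picks must be non-empty")
    n_values = high - low + 1
    expected = len(picks) / n_values   # uniform expectation for any one value
    observed = Counter(picks)[target]  # Counter returns 0 for missing keys
    return observed / expected


if __name__ == "__main__":
    # Hypothetical placeholders: replace with numbers sampled from each model.
    standard_picks = [7, 42, 37, 69, 73, 42, 17, 7]
    decensored_picks = [69, 42, 69, 7, 37, 73, 69, 17]
    print("standard:  ", relative_frequency(standard_picks, 69))
    print("decensored:", relative_frequency(decensored_picks, 69))
```

If the ratio for 69 moves back toward 1.0 (or toward whatever the training-data popularity would predict) in the decensored model, that would hint that the avoidance comes from the safety tuning rather than from the base distribution. You would need a lot of samples per model for the ratio to mean much.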