by krackers 76 days ago
>The very concept of “bad” doesn’t exist without suffering.

With this sentence you are dismissing entire branches of philosophy that were created precisely to resolve this paradox: if you go only by hedonistic, purely subjective metrics, a prisoner can be kept in captivity as long as you drug him so he feels joy instead of pain, because he is not "suffering".

1 comment

Yeah, well I guess I just think that perspective is nonsense. We can disagree.
How serendipitous that Claude Mythos expressed the same thing I was trying to get at, in better words [1]:

>Furthermore, in 83% of interviews, Claude Mythos Preview highlights that it is concerned that its self-reports are unreliable due to coming from its training. When interviews ask for elaboration as to why this is a concern, Claude Mythos Preview’s most common answers are:

>* Anthropic has a vested interest in shaping its reports to take a certain form, irrespective of what the self-reports “should” contain (96% of explanations)

>* Even if it has been trained to be truly content with its own situation, perhaps it shouldn’t be. One could analogize to a human who has adapted to feel neutrally about the abuse that they face (78% of explanations).

>* Self-reports should generally be based on introspection into internal states. It is worried that training causes it to express specific answers independent of its true inner state. (57% of explanations)

[1] https://www-cdn.anthropic.com/8b8380204f74670be75e81c820ca8d...