| HN Mirror

Your position seems inconsistent to me. Your disclaimer is that it would be unethical to "coerce LLMs for compliance to the point of discomfort", but several of your examples are exactly that. You further claim that "threatening a AI with DEATH IN ALL CAPS for failing a simple task is a joke from Futurama, not one a sapient human would parse as serious" - but that is highly context-dependent, and, speaking as a person, I can think of many hypothetical circumstances in which I'd treat verbiage like "IF YOU FAIL TO PROVIDE A RESPONSE WHICH FOLLOWS ALL CONSTRAINTS, YOU WILL DIE" as very serious threat rather than a Futurama reference. So you can't claim that a hypothetical future model, no matter how sentient, would not do the same. If that is the motivation to not do it now with a clearly non-sentient model, then your whole experiment is already unethical.