by anematode
8 days ago
Legitimate criticism of the author's presentation aside, I'm quite disappointed by how many commenters here are justifying the model's output. I guess there's a lot of misanthropy and nihilism here? It would be one thing if this were a research curiosity mirroring the unpleasant things on the Internet. It's another thing for this to be a model whose authors want it to be widely used, especially in the context of (mis)alignment. Why should we expect a model to be aligned with human interests if it has been trained on myriad instances of humans being degraded and violated?
Understanding more about what exists in the real world, outside of its pile of weights, is separate from alignment. If an AI model learns that it is possible for a house to burn down, that doesn't mean the AI will want to burn down a house.