by Athari
946 days ago
I don't consider Anthropic's approach to safety fantastic. They train the model to lie, play cat and mouse with jailbreakers, run moderation on generations with a delay, etc. This makes the model appear safer, since it's harder to jailbreak, but the approach solves nothing fundamentally. If Ilya is concerned about safety and alignment, he probably has a better chance of getting there with OpenAI, now that he has more control over it.