by EMIRELADERO
484 days ago
Very impressive. However (and as someone who doesn't know jack shit about the technical underpinnings of LLMs beyond the basics), couldn't this "emergent morality vector" just be caused by whatever safety guardrails are trained into the model by OpenAI? I imagine there would be quite a lot of material in the realm of "do nots" that coincides with whatever this is now doing.