|
|
|
|
|
by mitthrowaway2
1189 days ago
|
|
I think the holy grail of safety research is widely understood to be a recipe for creating a friendly AGI (or, perhaps, a proof that dangerous AGI cannot be made, but that seems even more unlikely). Asking for a conservative lower bound is more like "at least prove that this LLM, which has finite memory and can only answer queries, is not capable of devising and executing a plan to kill all humans", and that turns out to be more difficult than you'd think even though it's not an AGI. |
|