by thrill
9 days ago
"even a small jailbreak should cause them to pull back and fix it first, right" You do realize that an LLM's behavior is the aggregate result of a vast number of weights, don't you? You can't "fix" a weight and suddenly make everything all right. All you can do is keep probing a vast space and see whether the results you can elicit matter or not.