by meander_water
30 days ago
> the model has its own emergent guardrails that sometimes cause it to push back on legitimate security research requests. But as we found, these organic refusals aren’t consistent - the same task, framed differently or presented in a different context, could produce completely different outcomes as illustrated in the examples below.

This was new. I'm surprised that a model specifically designed for security research and gated to professionals is refusing legitimate requests.