by yencabulator
17 days ago
An LLM could probably make that distinction clearly. A commercial LLM provider training its own models, however, is likely to bias the model (or its guardrails) harder, in an effort to make it harder to jailbreak and to minimize bad press. For example:
- refusing to talk even about the well-known parts of forbidden topics (this)
- tending toward sycophancy to avoid ever seeming rude or unhelpful
|
I've tried the abliterated ones from Hugging Face and they still have guardrails. I guess I could fire up Unsloth and re-abliterate a 20B, but surely someone somewhere has already done this.
People have such puckered butts about guardrails and security, when so far at least 99.9% of people have no access to any of this to begin with. And if someone does use a tool for evil, that's on the user, not the tool.