by CSSer
372 days ago
It doesn't help that, thanks to RLHF, whenever a good example of this gains popularity (e.g. "How many Rs are in 'strawberry'?"), it's usually snuffed out quickly. If I worked at a company with an LLM product, I'd build tooling to look for these kinds of examples on social media or directly in usage data so they could be prioritized for fixes. I don't know how to feel about this. On the one hand, it's sort of like red teaming. On the other hand, it clearly gives consumers a false sense of the model's ability.