by tucnak
626 days ago
I'm led to believe this is mostly because "known unknowns" are not well-represented in the training datasets... I think that instead of bothering with refusals and enforcing a particular "voice" with excessive RL, they ought to focus more on identifying "gaps" in the datasets and feeding them back. Perhaps they're already doing this with synthetic data / distillation.