I think group dynamics come with turn-taking ambiguity, unlike one-on-one dialogue, which is structurally clean: there's a clear prompt, a clear response, and a clear feedback signal for RLHF.
Sure, it's messy to implement. But maybe that messiness is the fix. Clean one-on-one is exactly why AI learns to flatter: one voice, one signal, no pushback. Group dialogue is harder to train on, but also harder to game.