by danielmarkbruce 637 days ago
To create the training data? Almost certainly something like that (likely more than two), but I think they then trained on the synthetic data created by this "conversation". There is no reason a model can't learn to do all of that, especially if you insert special tokens (like think, reflect, etc., which have already been shown to be useful).
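Rough sketch of what I mean, purely as a guess at the shape of it: take the turns produced by several instances and flatten them into one training string, with special tokens marking each role switch. The token strings, the role names, and `flatten_dialogue` are all made up for illustration; nothing here is known to be what any lab actually does.

```python
# Hypothetical special tokens; the names are illustrative assumptions only.
THINK_OPEN, THINK_CLOSE = "<think>", "</think>"
REFLECT_OPEN, REFLECT_CLOSE = "<reflect>", "</reflect>"

# Hypothetical mapping from the instance that spoke to the tokens wrapping its text.
WRAPPERS = {
    "proposer": (THINK_OPEN, THINK_CLOSE),
    "critic": (REFLECT_OPEN, REFLECT_CLOSE),
}


def flatten_dialogue(turns):
    """Merge (role, text) turns from several instances into one training string."""
    parts = []
    for role, text in turns:
        open_tok, close_tok = WRAPPERS[role]
        parts.append(f"{open_tok}{text.strip()}{close_tok}")
    return "".join(parts)


# Toy "conversation" between two instances (invented data).
turns = [
    ("proposer", "Try factoring the quadratic."),
    ("critic", "Wait, check the sign of the constant term."),
    ("proposer", "Right, the roots are 2 and 3."),
]
example = flatten_dialogue(turns)
print(example)
```

A single model fine-tuned on strings like `example` would see one continuous sequence, so at inference time it could produce both roles itself without a second instance running.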

No, I'm referring to how the chain-of-thought transcript seems like the output of two instances talking to each other.
Right, I don't think it's doing that. I think it has likely been fine-tuned to transition between roles. But maybe you are right.