Hacker News new | ask | show | jobs
by chunky1994 619 days ago
If you train one of the larger models on these specific problems (i.e DM for D&D problems) it probably will surprise you. The larger models are great at generic text production but when fine-tuned for specific people/task emulation they're quite surprisingly good.
2 comments

Are there models that haven't been RLHF'd to the point of sycophancy that are good for this? I find that the models are so keen to affirm, they'll generally write a continuation where any plan the PCs propose works out somehow, no matter what it is.
Doesn't seem impossible to fix either way. You could have like a preliminary step where a conventional algorithm decides if a proposal will work at random, with the probability depending on some variable, before handing it out to the DM AI. "The player says they want to do this: <proposed course of action>. This will not work. Explain why."
For story settings and non essential NPC characters, yes. They might make some interesting side characters.

But they still fail at things like puzzles.