by TheEzEzz 682 days ago
You're basically taking the model "off policy" when you bias the decoder, which can definitely make weird things happen.
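A rough sketch of what "off policy" means here, assuming a toy 4-token vocabulary. The logits and the bias vector are made up for illustration and don't come from any real model. Biasing the decoder means tokens get sampled from a different distribution than the one the model itself predicts, and the KL divergence between the two gives one way to measure how far the sampling has drifted:

```python
import math

def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) in nats; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical next-token logits over a 4-token toy vocabulary.
model_logits = [2.0, 1.0, 0.5, -1.0]
# Hypothetical decoder bias that strongly favors the last token.
bias = [0.0, 0.0, 0.0, 5.0]

# What the model would sample from on its own.
model_probs = softmax(model_logits)
# What actually gets sampled from once the bias is added.
biased_probs = softmax([l + b for l, b in zip(model_logits, bias)])

# How far the sampling distribution has moved from the model's own.
drift = kl_divergence(biased_probs, model_probs)

print("model :", [round(p, 3) for p in model_probs])
print("biased:", [round(p, 3) for p in biased_probs])
print("KL(biased || model):", round(drift, 3))
```

Every token picked this way becomes context for the next step, so over a long generation the model is increasingly conditioning on prefixes it would rarely have produced itself.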