by minimaxir 1866 days ago
That's mostly a GPT-2/Transformers quirk. Some approaches apply a repetition penalty to work around it.
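
For anyone wondering what a repetition penalty looks like in practice, here's a rough sketch. It assumes the CTRL-style formulation, where the logits of tokens that were already generated get scaled down before sampling. The function name and the default `penalty=1.2` are illustrative assumptions, not any particular library's API.

```python
def apply_repetition_penalty(logits, generated_ids, penalty=1.2):
    """Return a copy of `logits` with already-generated tokens made less likely.

    Assumed CTRL-style rule: positive scores are divided by `penalty`,
    negative scores are multiplied by it, so both move downward.
    """
    adjusted = list(logits)
    for token_id in set(generated_ids):
        score = adjusted[token_id]
        adjusted[token_id] = score / penalty if score > 0 else score * penalty
    return adjusted


# Toy vocabulary of 3 tokens; tokens 0 and 1 were already generated.
print(apply_repetition_penalty([2.0, -1.0, 0.5], [0, 1], penalty=2.0))
# → [1.0, -2.0, 0.5]
```

A penalty of 1.0 leaves the logits unchanged; values above 1.0 discourage repeats more strongly, and setting it too high tends to hurt coherence.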