Hacker News
by novaRom 1095 days ago
An important addition to your partially right statement that "they’re trained to generate ‘likely’ text": they are trained to produce the most probable next word, so that the current context looks as "similar" to the training data as possible. Here "similar" does not mean "equal".
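To make the "similar, not equal" point concrete, here is a rough toy sketch. It is only an illustration under my own assumptions: the corpus is made up, the helper names (`next_word_probs`, `most_probable_next`, `sequence_prob`) are hypothetical, and it uses plain bigram counts, whereas real LLMs use neural networks over tokens with much longer contexts.

```python
from collections import Counter, defaultdict

# Toy "training data" (made up for illustration).
corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# Count how often each word follows each other word (a bigram model).
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def next_word_probs(prev):
    """Estimated probability of each next word, given the previous word."""
    total = sum(counts[prev].values())
    return {word: c / total for word, c in counts[prev].items()}

def most_probable_next(prev):
    """The single most probable next word after `prev`."""
    return counts[prev].most_common(1)[0][0]

def sequence_prob(words):
    """Probability the model assigns to a word sequence, one step at a time."""
    p = 1.0
    for prev, nxt in zip(words, words[1:]):
        p *= next_word_probs(prev).get(nxt, 0.0)
    return p

print(most_probable_next("sat"))
print(sequence_prob("the dog sat on the mat".split()))
```

The sentence "the dog sat on the mat" never appears in the toy corpus, yet the model still gives it a nonzero probability because every individual step looks like the training data. That is roughly what "similar" but not "equal" means here.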