by Tostino 344 days ago
When a model is trained on next-word prediction with the standard loss function, that is, by definition, its only job.
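To make that concrete, here is a minimal sketch of what I take "the standard loss" to mean: plain cross-entropy on the next token. That reading is my assumption, and the function name, toy logits, and 3-token vocabulary below are made up for illustration. The point is that the only thing the objective scores is how much probability lands on the token that actually came next.

```python
import math

def next_token_loss(logits, target_ids):
    """Mean cross-entropy of the true next token at each position.

    logits: one list of unnormalized scores per position (one score per vocab entry).
    target_ids: the index of the token that actually came next at each position.
    """
    total = 0.0
    for row, target in zip(logits, target_ids):
        # log-sum-exp with the max subtracted for numerical stability
        m = max(row)
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        # negative log-probability assigned to the correct next token
        total += log_z - row[target]
    return total / len(target_ids)

# Hypothetical toy data: vocabulary of 3 tokens, 2 positions.
targets = [0, 1]
uniform = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]      # model has no idea
confident = [[10.0, 0.0, 0.0], [0.0, 10.0, 0.0]]  # model favors the right token

loss_uniform = next_token_loss(uniform, targets)
loss_confident = next_token_loss(confident, targets)
```

Nothing in that number rewards truthfulness, helpfulness, or anything else directly. The loss only goes down when the predicted distribution puts more mass on the observed next token.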