Hacker News
by MPSimmons 805 days ago
Generally, yes, it literally just tries to predict the next token again and again and again.
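
To make the "again and again" loop concrete, here is a minimal sketch. It assumes a toy bigram counter in place of a real language model, and the names `train_bigram` and `generate` and the tiny corpus are made up for illustration. A real LLM predicts with a neural network over a much longer context and usually samples rather than always taking the top token, but the outer loop has the same shape: predict one token, append it, feed the result back in.

```python
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Count which token follows which in the training sequence (toy stand-in for a model)."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def generate(counts, prompt, max_new_tokens):
    """Greedy autoregressive loop: predict one token, append it, repeat."""
    out = list(prompt)
    for _ in range(max_new_tokens):
        followers = counts.get(out[-1])
        if not followers:
            break  # no known continuation for the last token
        # Pick the single most frequent follower of the last token.
        next_token = followers.most_common(1)[0][0]
        out.append(next_token)
    return out

# Hypothetical toy corpus, whitespace-tokenized.
corpus = "the cat sat on the mat the cat sat on the rug".split()
model = train_bigram(corpus)
result = generate(model, ["the"], 4)
print(" ".join(result))
```

Nothing in the loop knows about "chat" or "answers"; any such behavior has to come from what the predictor learned to continue.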

This model is apparently surprisingly good at chat, even though it is a base model, and will take part in it to some extent. It should be really interesting once it's fine-tuned.