Hacker News
by dougmwne 1164 days ago
This whole conversation about training set size is bizarre. No one ever asks what’s in the training set. Why would a trillion tokens of mundane gossip improve an LLM’s ability to do anything valuable at all?

If a scrape of the general internet, scientific papers and books isn’t enough, a trillion trillion trillion text messages to mom aren’t going to change matters.