by malfist 10 days ago
I mean, if they've consumed all of human knowledge, what's left for them to train on? This pivot isn't only because it's cheaper and a way to juice the numbers for an IPO; it's survival, because they can't improve any further.
2 comments

IIRC, when they make a big enough architecture change to the model, they need to rerun pre-training. So it's not like they're feeding it more data (they will be, but it will be a drop in an S3 bucket compared to their dataset reserves), but rather training models with different architectures.
It did sound to me like they feel some sort of wall coming, though.