Hacker News
by nikhil896 929 days ago
My counterpoint to this is that babies are born with a sort of basic pre-trained LLM. Humans are born with the brain's analogue of weights & biases already partly optimized to learn language, math, etc. An LLM's weights & biases, by contrast, are initialized with random values before pre-training. Training on the internet can IMO be seen as a kind of "pre-training".