Hacker News
by throwaway2214 1196 days ago
You assume they just reproduce the training set, but when they get big enough they start to "understand" things. When the input is really big, the model can never simply "guess" the next word in the batch, so it has to "learn" concepts.