by t-vi 1339 days ago
Indeed, the "learning". To my mind, the simplest (but still speculative) explanation of the "learning" phenomena we see, both the working examples and the limitations and failures, is that large models implicitly memorize the training inputs (or some derived features that can be used to approximately reconstruct them) and then do something between interpolation and rather simple non-parametric learning. The effect is that the outputs are basically a somewhat sensible agglomeration of "copy-pasted" snippets.
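
To make "memorize, then interpolate" concrete, here is a toy sketch of a simple non-parametric learner (Nadaraya-Watson style kernel averaging). It is only an illustration of the idea, not a claim about what large models do internally; the class name `MemorizingPredictor`, the `bandwidth` parameter, and the toy data are all made up for this example.

```python
import math


class MemorizingPredictor:
    """Toy non-parametric learner: stores every training pair verbatim
    and predicts by Gaussian-weighted averaging of the stored targets."""

    def __init__(self, bandwidth=0.5):
        # Hypothetical smoothing parameter: smaller means closer to pure lookup.
        self.bandwidth = bandwidth
        self.xs = []
        self.ys = []

    def fit(self, xs, ys):
        # "Training" is nothing more than memorizing the data.
        self.xs = list(xs)
        self.ys = list(ys)
        return self

    def predict(self, x):
        # Weight each memorized example by its closeness to the query.
        weights = [
            math.exp(-((x - xi) ** 2) / (2 * self.bandwidth ** 2))
            for xi in self.xs
        ]
        total = sum(weights)
        if total == 0.0:
            # Query is far from all stored data: copy the nearest example.
            nearest = min(range(len(self.xs)), key=lambda i: abs(x - self.xs[i]))
            return self.ys[nearest]
        # Otherwise blend the memorized targets (interpolation).
        return sum(w * y for w, y in zip(weights, self.ys)) / total


model = MemorizingPredictor(bandwidth=0.1).fit([0.0, 1.0, 2.0], [0.0, 10.0, 20.0])
at_train = model.predict(1.0)   # essentially reproduces the memorized target
between = model.predict(0.5)    # a blend of the two neighbouring targets
```

With a small bandwidth the predictor all but copies the nearest stored example; between examples it blends them, which is the "agglomeration of snippets" behaviour in miniature.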

That said, I think the results are often useful and sometimes fascinating. We should not fool ourselves about the learning that these large neural nets do, though.