| How much of human written history can be compressed and approximately stored in 504Bn parameters? It seems to me basically certain that no compressed representation of text can be an understanding of language, so necessarily, any statistical algorithm here is always using coincidental tricks. That it takes 500bn parameters to do it, I think, is a clue that we don't even really need. Words mean what we do with them -- you need to be here in the world with us, to understand what we mean. There is nothing in the patterns of our usage of words which provides their semantics, so the whole field of distributional analysis precludes this superstition. You cannot, by mere statistical analysis of patterns in mere text, understand the nature of the world. But it is precisely this we communicate in text. We succeed because we are both in the world, not because "w" occurring before "d" somehow communicates anything. Apparent correlations in text are meaningful to us, because we created them, and we have their semantics. The system must by its nature be a mere remembering. |
I think your premise contains your conclusion, which, while common, is something you should strive to avoid.
I do think your opinion is a good example of the prevailing sentiment on Hacker News. To me, it seems to come from a discomfort with the fact that even "we" emerge out of the basic interactions of basic building blocks. Our brain has been able to build world knowledge "merely by" analysis of electrical impulses being transmitted to it on wires.