| Geez people. Please learn to identify hypes, but most importantly the causes of hypes. LLMs are fundamentally an information processing technology. Not the path to AGI, not an emerging sentience. The reason "this time" feels so amazing is that the unwashed masses suddenly got access to new information processing technology in a context where their tools had been stagnant for decades. Not because it was not possible, but because there was no money in it. To understand my argument and its implications, humor me for a moment and imagine a universe in which everyone was already using Linux computers and, for a decade now, published ML/DL papers had been available for people to use. So there were various crowdsourced indices and models of all sorts, which people incrementally embedded in their information processing workflows. In that universe there would be no room for the delirious reaction we have here. It would be an incremental evolution of search, knowledge bases, algorithmic content generation etc. What we have experienced instead in these past decades is information tool starvation. These incrementally improving tools, while nothing but known and ultimately mundane algorithms, were not available except within a tiny elite. In fact, people's information processing capability arguably declined as the desktop platforms got downgraded, adtech toxic waste covered the information landscape etc. What is happening now is that a socioeconomic, artificially induced scarcity is being broken (for reasons that require serious piecing together of events). So while on the surface this hype is as distasteful as any illustration of
human lemming behaviour, there is an enormous silver lining if we succeed in reading its causes. These tools are here, have been here for a while, and they can be inserted into our ever-growing information processing toolkit. The risks are there to match the opportunities. The biggest risk of them all is precisely what has led to the current situation: technology not diffusing normally, but being controlled by gatekeepers. |
All the way back in the 1950s, big money went into machine translation. Decades of hard work, mathematical models of human language, all manner of study, enormous bilingual corpora of text with phonetic annotation, programmed-in general-knowledge databases, fuzzy reasoning algorithms. The amount of work put into it is quite staggering, in hindsight. I remember the cutting edge in the 1990s: SYSTRAN, for example, could sometimes translate technical material usefully, given significant human guidance and a limited context domain.
All of that work has been rendered moot by deep learning. All of it. A machine can, simply with the correct deep learning algorithm and mass exposure to language plus a few bilingual texts, learn an algorithm for translation. It does so automatically: no verb conjugation algorithms, no general-knowledge databases, no expert systems with fuzzy reasoning, no parsers, none of what a purpose-built old-school translator had.
And yet these deep-learning systems are vastly superior to the old-school architectures, completely supplanting them within a couple of years of their development.
It is the same story in many other areas. Chess, Go? They learn to play chess and Go better than any AI designed specifically to do so. Image classification? Better than the previous 60 years of work on machine vision, and again, it falls out of the same approach almost by accident. Speech recognition? An algorithm to write a bad poem? Well, we now have an algorithm to find an algorithm to write you that bad poem, if you want it.
That's the thing. These are algorithms to solve very tricky problems, and we didn't have to discover, find, or otherwise create the algorithm. The machine did it for us. I am not sure I'm communicating it well, but to me that's probably the most significant advance since the computer. It had long been understood, theoretically, that this was possible; but personally at least, I assumed it would forever require more data and compute than could be realized.