|
|
|
|
|
by shock-value
1211 days ago
|
|
The advancement with these LLMs lies in the fact that they can effectively learn to recognize patterns within “large-ish” input text sequences and probabilistically generate a likely next word given those patterns. It’s a genuine advancement. However it is still just pattern matching. And describing anything it’s doing as “behavior” is a real stretch given that it is a feed-forward network that does not incorporate any notion of agency, memory, or deliberation into its processing. |
|
The difference is staggering.
It comes about because of the insane level of computational iterations (that are not required for normal statistical completion) mapping vast numbers of terabytes of data into a set of parameters constrained to work together in a way (layers of alternating linear combinations followed by non-linear compressions) that requires functional relationships to be learned in order to compress the information enough to work.
It is a profound difference both in methodology and results.