by moritzwarhier
764 days ago
I think with LLMs in general, the algorithms are very refined and require lots of research, despite being "simple" in terms of entropy, or an imagined Kolmogorov complexity for defining algorithms. So "simple" is a fuzzy term here, but yes, the entropic complexity is in the data, not the algorithms. This is related to the so-called "Bitter Lesson".

Edit: the sister comment pointed out what I failed to express: RLHF and training are also algorithms, and their applications and implementations are probably much more complex than the code that evaluates a given prompt. So basically, "models" (trained NNs) are also an example of the equivalence of code and data. Fixed data used by code (the trained model) is code in itself, even when it is not directly written by humans or in a human-readable language.

Edit edit: don't forget to count the imported maths code :) but I assume this is not relevant to the overall "it's just matrix multiplications" argument.
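
To make the "fixed data is code" point concrete, here is a toy sketch. It is my own illustration, not how any real LLM is implemented, and the names `forward`, `XOR_WEIGHTS` and `AND_WEIGHTS` are made up for the example. The evaluation code is a few lines and never changes; which function it computes is decided entirely by the weights passed in.

```python
def step(x):
    """Threshold activation: 1 if the input is positive, else 0."""
    return 1 if x > 0 else 0


def forward(weights, inputs):
    """Generic 'inference code': weighted sums plus a threshold, layer by layer.

    `weights` is a list of layers; each layer is a list of rows of the
    form [w_1, ..., w_n, bias]. This layout is an assumption of the sketch.
    """
    activations = list(inputs)
    for layer in weights:
        activations = [
            step(sum(w * a for w, a in zip(row[:-1], activations)) + row[-1])
            for row in layer
        ]
    return activations[0]


# Hand-picked "trained models" (hypothetical values, not learned here).
XOR_WEIGHTS = [
    [[1, 1, -0.5], [-1, -1, 1.5]],  # hidden layer: OR unit, NAND unit
    [[1, 1, -1.5]],                 # output layer: AND of the two hidden units
]
AND_WEIGHTS = [
    [[1, 1, -1.5]],                 # single unit computing AND
]

if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, forward(XOR_WEIGHTS, (a, b)), forward(AND_WEIGHTS, (a, b)))
```

Same `forward`, different data, different "program". In this toy the weights were written by hand; in a real model they come out of training, which is where the complexity lives.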