by moyix
1596 days ago
Invent is probably the wrong word since it implies agency, sure. Maybe "discover" or "luck upon", since whatever it's doing was formed by updating a pile of floating point weights with gradient descent?
I think it certainly makes sense to ask what the higher level "algorithm" at work here is, though. Electrons flow through wires and transistors in (say) an adder [1]; looking at the wires and transistors you won't see an algorithm for addition, but there is certainly one present, codified in the arrangement of those wires and transistors.
But maybe we can reverse engineer whatever the LM is doing by a combination of probing it with experiments like these and (maybe) inspecting the learned weights. The Curve Circuits paper did this for reverse engineering a curve detector learned by a convolutional neural network: https://distill.pub/2020/circuits/curve-circuits/
I also don't mean to imply that it's a good algorithm, or one that generalizes to arbitrary numbers, etc. Maybe it's just (effectively) a lookup table and some special cases!
[1] Please don't yell at me for this metaphor, I bailed out of physics after scraping out a B- in E&M ;)
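To make the adder metaphor a bit more concrete, here is a minimal sketch of a ripple-carry adder written only in terms of gate operations. The names `full_adder` and `ripple_carry_add` and the 8-bit default width are just illustrative assumptions; the point is that no individual gate "knows" addition, yet the wiring as a whole implements it.

```python
def full_adder(a, b, carry_in):
    """One-bit full adder using only XOR, AND and OR gates."""
    partial = a ^ b                      # XOR gate
    total = partial ^ carry_in           # XOR gate
    carry_out = (a & b) | (partial & carry_in)  # two AND gates, one OR gate
    return total, carry_out


def ripple_carry_add(x, y, width=8):
    """Chain `width` full adders; returns (sum mod 2**width, final carry)."""
    carry = 0
    result = 0
    for i in range(width):
        # Feed bit i of each input, plus the carry from the previous stage.
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result, carry
```

The "algorithm" only shows up in how the stages are chained through the carry, which is roughly the kind of structure one would hope to recover from a learned model's weights.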
|
I'm fine with "invent" actually, despite the implication of agency (I'm used to the terminology "predicate invention" [1]; although maybe I should actually re-examine the motivation behind it).
I'm more interested in the representation issue. I had a look at the quoted article on CNNs earlier. I think there is a very fine line between claiming that a CNN's weights represent an algorithm and claiming that its weights can be _interpreted_ as an algorithm. I feel that the article leans too heavily on the interpretation side and doesn't make enough of an effort to show that the CNN's weights really represent an algorithm, rather than the network simply having activations in successive layers, which come with a natural ordering.
In any case, I would like to understand how a language model can represent an algorithm.
_____________
[1] https://link.springer.com/referenceworkentry/10.1007/978-0-3...