by xcv123
783 days ago
I'll refer you back to this 1990 paper by Geoffrey Hinton. Up to you if you want to investigate this further. If you can prove this is wrong then you should publish your result.

https://www.cs.toronto.edu/~hinton/absps/AIJmapping.pdf

"This 1990 paper demonstrated how neural networks could learn to represent and reason about part-whole hierarchical relationships, using family trees as the example domain. By training on examples of family relations like parent-child and grandparent-grandchild, the neural network was able to capture the underlying logical patterns and reason about new family tree instances not seen during training. This seminal work highlighted that neural networks can go beyond just memorizing training examples, and instead learn abstract representations that enable reasoning and generalization."
First, your summary is nowhere to be found in the paper, so I assume it is your own. You say "the neural network was able to capture the underlying logical patterns and reason about new family tree instances not seen during training."

This paper does not include training details. It defers them to another paper, cited as [10]. From the details given here, the network is trained on 100 of the 104 total relations. However, there are only 12 distinct relations: father, mother, husband, wife, son, daughter, uncle, aunt, brother, sister, nephew, niece. That means each relation is seen roughly 8 times (100 / 12 ≈ 8.3).

Now, your claim is "underlying logical patterns and reason about new family tree instances not seen during training", but that's a gross misrepresentation of what is happening here. First, the network is given multiple instances of the same tree with different labels. Second, the inputs appear to be the 24 people involved, so you cannot possibly extend this to new tree topologies. Finally, this to me is the money quote of the paper:
> Does it make use of the isomorphism between the two family trees to allow it to encode them more efficiently and to generalize relationships in one family tree by analogy to relationships in the other? If it does all these things, it seems reasonable to say that it is doing inference rather than mere association.
Now, we have to be careful here, because "inference" might be construed as reasoning. Obviously, the model is performing some type of statistical inference: a model has been posited (a 3-layer neural network) and it is being trained (presumably, since there are no training details) to minimize classification error through something like KL divergence, which is equivalent to maximum likelihood estimation (MLE). So it is indeed statistical inference. This model is so simple that you could manually work out the inference by doing a page full of multiplications. I brought this up before, so I'll ask you to specifically address this point. No one claims linear models perform reasoning. Why are you proposing that this 3-layer network (read: 3 matrix multiplies) is doing reasoning?
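To make the "page full of multiplications" point concrete, here is a minimal sketch of what such a forward pass looks like. Everything in it is an assumption for illustration only: the hidden width (`HIDDEN = 6`), the sigmoid nonlinearity, the one-hot input encoding, and the random untrained weights are hypothetical stand-ins, not the architecture or training setup from the paper. Only the 24 people and 12 relations come from the discussion above.

```python
import math
import random

random.seed(0)

# 24 people and 12 relations come from the discussion; HIDDEN is an
# arbitrary illustrative width, not a value taken from the paper.
N_PEOPLE, N_RELATIONS, HIDDEN = 24, 12, 6


def random_matrix(rows, cols):
    # Random, untrained weights: hypothetical stand-ins for learned ones.
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


W1 = random_matrix(N_PEOPLE + N_RELATIONS, HIDDEN)
W2 = random_matrix(HIDDEN, HIDDEN)
W3 = random_matrix(HIDDEN, N_PEOPLE)


def matvec(x, W):
    # Row vector times matrix, written out as plain multiply-and-add.
    return [sum(x[i] * W[i][j] for i in range(len(x))) for j in range(len(W[0]))]


def sigmoid_all(v):
    # Elementwise logistic function (assumed nonlinearity).
    return [1.0 / (1.0 + math.exp(-z)) for z in v]


def forward(person, relation):
    # One-hot encode (person, relation) into a single input vector.
    x = [0.0] * (N_PEOPLE + N_RELATIONS)
    x[person] = 1.0
    x[N_PEOPLE + relation] = 1.0
    h1 = sigmoid_all(matvec(x, W1))     # matrix multiply 1
    h2 = sigmoid_all(matvec(h1, W2))    # matrix multiply 2
    return sigmoid_all(matvec(h2, W3))  # matrix multiply 3: one score per person


scores = forward(person=0, relation=3)

# Scalar multiplications performed by one forward pass at these sizes.
multiplications = (
    (N_PEOPLE + N_RELATIONS) * HIDDEN + HIDDEN * HIDDEN + HIDDEN * N_PEOPLE
)
```

With these assumed sizes, one forward pass is 396 scalar multiplications, and the output is one score per person, so the network can only ever answer in terms of the 24 people it was built around.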