by hervature 784 days ago
You've cited this paper multiple times in this thread. I'll go through the exercise of telling you why I do not think this paper shows anything. Hopefully, you will then address one of the many points I've given as counterpoints.

First, your summary of the paper is nowhere to be found in the paper, so I assume it is your own. You say "the neural network was able to capture the underlying logical patterns and reason about new family tree instances not seen during training." This paper does not include training details; it delegates them to another paper, [10]. From the details in this paper, the network is trained on 100 of the 104 total relations. However, there are only 12 distinct relations: father, mother, husband, wife, son, daughter, uncle, aunt, brother, sister, nephew, niece. That means each relation is seen ~8 times (100 / 12 ≈ 8.3). Now, your claim is "underlying logical patterns and reason about new family tree instances not seen during training", but that's a gross misrepresentation of what is happening here. First, it's given multiple instances of the same tree with different labels. Second, the inputs appear to be the 24 people involved, so you cannot possibly extend this to new tree topologies. Finally, this to me is the money quote of the paper:

> Does it make use of the isomorphism between the two family trees to allow it to encode them more efficiently and to generalize relationships in one family tree by analogy to relationships in the other? If it does all these things, it seems reasonable to say that it is doing inference rather than mere association.

Now, we have to be careful here because inference might be construed as reasoning. Obviously, the model is performing some type of statistical inference: a model has been posited (a 3-layer neural network) and its output is being trained (presumably, since there are no training details) to minimize classification error through something like KL divergence, which is equivalent to MLE, so it is indeed statistical inference. This model is so simple that you could work out the inference by hand with a page full of multiplications. I brought this up before, so I'll ask you to specifically address this point. No one claims linear models perform reasoning. Why are you proposing that this 3 layer (read, 3 matrix multiplies) is doing reasoning?
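
For anyone who wants to check the "KL is equivalent to MLE" remark, here is a minimal sketch. It assumes a softmax output and a one-hot target, which is my assumption and not something taken from the paper; the logits are made-up numbers. With a one-hot target the entropy term of the KL divergence is zero, so KL reduces to the negative log-likelihood of the correct class.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    # KL(p || q), using the convention that terms with p_i = 0 contribute 0.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def negative_log_likelihood(q, label):
    # Negative log-probability the model assigns to the correct class.
    return -math.log(q[label])

logits = [2.0, -1.0, 0.5]  # hypothetical network outputs for 3 classes
label = 0                  # hypothetical correct class
q = softmax(logits)
p = [1.0 if i == label else 0.0 for i in range(len(logits))]  # one-hot target

kl = kl_divergence(p, q)
nll = negative_log_likelihood(q, label)
print(kl, nll)  # the two values agree up to floating-point rounding
```

Minimizing one therefore minimizes the other, which is all the equivalence claim needs.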

1 comment

Yes, it's not my summary. I originally learned about the family tree example from a lecture by Geoffrey Hinton. I found some lecture slides, linked here, which reference the example, but I can't find the original lecture right now.

https://www.cs.toronto.edu/~hinton/coursera/lecture4/lec4.pd...

> No one claims linear models perform reasoning. Why are you proposing that this 3 layer (read, 3 matrix multiplies) is doing reasoning?

A 3 layer neural network is a non-linear function. It is not a linear model. There are activation functions between the layers which make it non-linear.
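
To make the distinction concrete, here is a small sketch. The weights are made up, and ReLU is my stand-in for whatever activation the paper's network actually uses. Three matrix multiplies with nothing in between collapse into a single matrix, so that stack is a linear model. With an activation between the layers, additivity f(x + y) = f(x) + f(y) no longer holds.

```python
def matmul(a, b):
    # Plain-Python matrix product of two lists of rows.
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matvec(w, x):
    # Matrix-vector product.
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def relu(v):
    return [max(0.0, t) for t in v]

# Hypothetical weights, chosen only to keep the arithmetic easy to follow.
W1 = [[1.0, -1.0], [-1.0, 1.0]]
W2 = [[1.0, 1.0], [1.0, -1.0]]
W3 = [[1.0, 1.0]]

def linear_stack(x):
    # Three matrix multiplies, no activation in between.
    return matvec(W3, matvec(W2, matvec(W1, x)))

def relu_stack(x):
    # Same weights, with a ReLU after the first and second layers.
    return matvec(W3, relu(matvec(W2, relu(matvec(W1, x)))))

# Without activations the whole stack equals one matrix: W3 @ W2 @ W1.
W_collapsed = matmul(W3, matmul(W2, W1))

x, y = [1.0, 0.0], [0.0, 1.0]
x_plus_y = [a + b for a, b in zip(x, y)]

print(linear_stack(x), matvec(W_collapsed, x))                    # same value
print(linear_stack(x_plus_y), linear_stack(x), linear_stack(y))   # additive
print(relu_stack(x_plus_y), relu_stack(x), relu_stack(y))         # not additive
```

This only shows that the network is not a linear function of its input; whether that amounts to reasoning is the separate question being argued above.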