by smodad 951 days ago
Interesting paper. I'm not highly confident in the following idea, and I suspect a lot of people would disagree with me on it. The authors show pretty convincingly that transformers don't generalize well. I agree, but I would add that humans don't generalize well either, so I don't think showing that transformers generalize poorly precludes them from ultimately being "intelligent" (in some way). In fact, the inability to generalize to new situations is a core problem in human learning.

For example, think of a college freshman who knows algebra and is starting calculus. Even though they have all the fundamental "mental tools" at their disposal to figure out how to prove that an infinite series converges (or not), I would guess that very few students can actually connect the concepts in the right way to see that. They have to be given examples and shown how the knowledge they gained in algebra applies to infinite series. Is their inability to generalize well evidence that they aren't intelligent?

Perhaps what we're really saying, and the real problem for us, is that these systems aren't intelligent enough to be useful to us. By the same token, we might say that a very smart student, like a von Neumann, would likely be able to figure out how to handle an infinite series without having seen one before. I think it's reasonable to say "we want transformers to generalize better than the average human."

To that end, I think we'll have to imitate the way a very smart human solves new problems. For example, students who are capable of handling an infinite series without having seen the concept before might take a creative approach, trying different things to see if they can discover the pattern. So my hunch is that we will need to provide transformers with bolted-on subroutines like "generate several hypotheses and test them to see which pattern fits the data best" before we can expect them to generalize well.
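
To make the "generate several hypotheses and test them" idea concrete, here is a toy sketch in Python. Everything in it is my own assumption for illustration: the candidate rules, the squared-error score, and the names `HYPOTHESES`, `squared_error`, and `best_hypothesis` are all hypothetical, and this is not how any real transformer system works. It only shows the shape of the loop: propose a few patterns, score each against the observed data, keep the best one.

```python
from typing import Callable, Dict, List

# Hypothetical candidate rules (assumed for illustration only).
# Each maps a 0-based index n to the value the rule predicts at that position.
HYPOTHESES: Dict[str, Callable[[int], float]] = {
    "linear: 2n + 1": lambda n: 2 * n + 1,
    "square: n^2": lambda n: n ** 2,
    "geometric: 2^n": lambda n: 2 ** n,
    "reciprocal: 1/(n+1)": lambda n: 1 / (n + 1),
}


def squared_error(rule: Callable[[int], float], observed: List[float]) -> float:
    """Sum of squared differences between a rule's predictions and the data."""
    return sum((rule(n) - y) ** 2 for n, y in enumerate(observed))


def best_hypothesis(observed: List[float]) -> str:
    """Return the name of the candidate rule with the lowest squared error."""
    return min(
        HYPOTHESES,
        key=lambda name: squared_error(HYPOTHESES[name], observed),
    )


if __name__ == "__main__":
    observed = [1, 2, 4, 8, 16]
    name = best_hypothesis(observed)
    # Use the winning rule to extrapolate one step past the observed data.
    prediction = HYPOTHESES[name](len(observed))
    print(f"best fit: {name}, next value: {prediction}")
```

The hard part, of course, is the step this sketch skips: a fixed list of four rules is not "generating" hypotheses in any interesting sense. The point is only that the test-and-select part is cheap once candidates exist.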

TL;DR: Transformers don't generalize well, but I don't think humans generalize well either, so I don't think that fact precludes them from being intelligent in the limit. I think we'll have to imitate how humans generalize by giving transformers additional "mental tools" before they can generalize like very intelligent humans.