Hacker News
by TimPC 915 days ago
I think it's less about scalability and more about identifying which specific choices are promising. A lot of Schmidhuber's work sketches broad ideas in grand strokes and suggests hundreds of potential neural networks without evaluating which choices in that massive space are good. He then claims credit when other people identify the one or two of those hundred models that are actually promising (often with minor variations, so it is not immediately obvious whether they fit the broad definition).

But if other people identify and further develop one or two of his models, he should still be cited. If they did not use his work at all and came up with one or two models independently, that's a different situation. It's also a bit of an honor-code thing, since it's hard to prove whether somebody has read a paper or not. Then there's a more stringent standard, where one cites as part of surveying preexisting work, which can result in a vast list.
Let's take an example: his "unnormalised linear Transformer," a neural network with "self-attention" published in 1992 under another name. It wasn't just an idea, it was implemented and tested in experiments. However, few people cared, because the computational hardware was so slow, and such networks were not yet practical. Decades later, the basic concepts were "reinvented" and renamed, and today they are really useful on much faster computers. Unquestionably, however, their origins must be cited by those who are applying them.
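For anyone unfamiliar with why the two are described as the same thing, here is a minimal sketch in plain Python. It assumes the usual reading of the claim: a "fast weight" matrix updated by additive outer products of value and key vectors, then applied to a query, gives the same outputs as causal attention with the softmax removed. The function names and toy numbers are mine, for illustration only, and are not taken from the 1992 paper.

```python
def fast_weight_outputs(keys, values, queries):
    """Additive outer-product fast weights: W += v k^T, then y = W q."""
    d_k, d_v = len(keys[0]), len(values[0])
    W = [[0.0] * d_k for _ in range(d_v)]  # fast weight matrix, starts at zero
    outputs = []
    for k, v, q in zip(keys, values, queries):
        # Write step: add the outer product of value and key to W.
        for i in range(d_v):
            for j in range(d_k):
                W[i][j] += v[i] * k[j]
        # Read step: apply the current fast weights to the query.
        outputs.append([sum(W[i][j] * q[j] for j in range(d_k)) for i in range(d_v)])
    return outputs


def linear_attention_outputs(keys, values, queries):
    """Causal attention without softmax: y_t = sum over i <= t of (k_i . q_t) v_i."""
    d_v = len(values[0])
    outputs = []
    for t, q in enumerate(queries):
        y = [0.0] * d_v
        for k, v in zip(keys[: t + 1], values[: t + 1]):
            score = sum(kj * qj for kj, qj in zip(k, q))  # unnormalised score
            for i in range(d_v):
                y[i] += score * v[i]
        outputs.append(y)
    return outputs


# Toy data (hypothetical): three time steps, 2-d keys/queries, 1-d values.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[2.0], [3.0], [5.0]]
queries = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]

print(fast_weight_outputs(keys, values, queries))
print(linear_attention_outputs(keys, values, queries))
```

Both calls print the same list, because W applied to q is just the sum of (k_i . q) v_i over the steps written so far. This only illustrates the algebraic identity; it says nothing about who deserves credit for what.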

Why are some people here even debating the generally recognised rules of scientific publishing mentioned in the paper:

> The deontology of science requires: If one "reinvents" something that was already known, and only becomes aware of it later, one must at least clarify it later, and correctly give credit in all follow-up papers and presentations.