by felippee 2891 days ago
Yes, it is certainly not fair that the network they spend a page explaining, and probably weeks training and researching, can be hardwired in 30 lines of Python. This is very unfair. But this is the reality, and that is what the post states.

Also, the idea of adding coordinates as a feature has been used in the past without anyone giving it much thought.
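For readers unfamiliar with the trick, here is a minimal sketch of what "coordinates as a feature" can look like, assuming NumPy and an (H, W, C) image layout. The function name `add_coord_channels` and the scaling to [-1, 1] are illustrative assumptions, not taken from the paper or the post.

```python
import numpy as np

def add_coord_channels(image):
    """Append row and column coordinate channels to an (H, W, C) array."""
    h, w, _ = image.shape
    # Scaling coordinates to [-1, 1] is an assumption made for this sketch.
    rows = np.linspace(-1.0, 1.0, h)
    cols = np.linspace(-1.0, 1.0, w)
    # indexing="ij" gives grids of shape (H, W): row index first, column second.
    row_grid, col_grid = np.meshgrid(rows, cols, indexing="ij")
    return np.concatenate(
        [image, row_grid[..., None], col_grid[..., None]], axis=-1
    )

image = np.zeros((4, 6, 3))
augmented = add_coord_channels(image)
print(augmented.shape)  # (4, 6, 5)
```

Any layer that sees the augmented array can then read a pixel's position directly from the two extra channels.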

Toy examples are great, as long as they are not trivial. Some guy, presumably smart, once said that "things should be as simple as possible, but not simpler". The toy example they play with is just too simple.

2 comments

I highly doubt they spent weeks training on the toy example. More like five minutes, probably. Again, that the weights learned (quickly) for the toy example can also be set by hand is not surprising and is evidence of a good toy example. The paper’s main result is not the toy example, but the real experiments (for which I doubt you could hand-code the network weights).
The interesting part is that this trivial toy problem is hard to learn for a standard CNN.

They probably engineered the toy problem to be that simple, looking for the simplest problem that still displays the phenomenon.

This may indeed be interesting, but that is not what this paper focuses on.
From the abstract:

"For any problem involving pixels or spatial representations, common intuition holds that convolutional neural networks may be appropriate. In this paper we show a striking counterexample to this intuition via the seemingly trivial coordinate transform problem, which simply requires learning a mapping between coordinates in (x,y) Cartesian space and one-hot pixel space. Although convolutional networks would seem appropriate for this task, we show that they fail spectacularly. We demonstrate and carefully analyze the failure first on a toy problem, at which point a simple fix becomes obvious."
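To make the task in the abstract concrete, here is a rough sketch of the target mapping itself, written by hand rather than learned. It assumes NumPy; the helper names `coords_to_onehot` and `onehot_to_coords` and the default canvas size are hypothetical choices for illustration, not the paper's code.

```python
import numpy as np

def coords_to_onehot(x, y, size=64):
    """Map integer (x, y) coordinates to a one-hot (size, size) canvas."""
    canvas = np.zeros((size, size), dtype=np.float32)
    canvas[y, x] = 1.0  # row index is y, column index is x
    return canvas

def onehot_to_coords(canvas):
    """Inverse mapping: recover (x, y) from a one-hot canvas."""
    y, x = np.unravel_index(np.argmax(canvas), canvas.shape)
    return int(x), int(y)

canvas = coords_to_onehot(3, 5, size=8)
print(onehot_to_coords(canvas))  # (3, 5)
```

The point debated in this thread is that a mapping this short to write down by hand is, according to the paper, one that a standard convolutional network struggles to learn.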

https://arxiv.org/abs/1807.03247