by mlucy 2685 days ago
I hadn't read it before! That's a fascinating result, actually. They emphasize interpretability in the paper, but I find it more interesting that you can do so well with only local information.

My first thought is that it makes sense that averaging together a bunch of local predictions would work well on the ImageNet task, since the different classes tend to have obviously different local textures, and class-relevant information makes up a large part of the image. I would be very curious to see if the technique is as competitive for other tasks.
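To make the "averaging local predictions" idea concrete, here's a rough sketch. It is not the paper's code: it assumes some local model has already produced a score per class for each image patch, and the names `patch_logits`, `average_local_predictions`, and `classify` are made up for illustration. The image-level prediction is just the mean over patches.

```python
import numpy as np

def average_local_predictions(patch_logits):
    """Average per-patch class scores, shape (n_patches, n_classes), into one vector."""
    patch_logits = np.asarray(patch_logits, dtype=float)
    return patch_logits.mean(axis=0)

def classify(patch_logits):
    """Pick the class with the highest averaged score."""
    return int(np.argmax(average_local_predictions(patch_logits)))

# Toy example: 4 patches, 3 classes. The numbers are invented, not real model output.
toy_logits = np.array([
    [2.0, 0.1, -1.0],
    [1.5, 0.3, -0.5],
    [-0.2, 0.4, 0.0],
    [1.0, -0.3, 0.2],
])
image_logits = average_local_predictions(toy_logits)
predicted_class = classify(toy_logits)
```

Note that no patch ever sees the others here; the only "global" step is the mean, which is why it would struggle when the class-relevant evidence is small or depends on spatial arrangement.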

1 comment

Yeah, it seems like a useful debugging trick: replace some part of the architecture with a simple linear sum and see if it does just about as well.