Hacker News
by p1esk 2993 days ago
I also work in the AI/ML field (deep learning), and I usually don't care whether a paper has corresponding code. I read papers to find good ideas. If I find one, I can implement it myself. I rarely need more than a couple of days to test an idea (e.g. Hinton's capsules model took 4-5 hours to implement). The benefits of having your own implementation should be obvious.

If something important is missing or does not make sense, I usually just email the first author. Usually they respond within a couple of days, and unlike looking at code, I can also get an explanation of why they did it that way.

In fact, I don't even usually care that much about stated results (such as improvements in state of the art).

The things that matter are: deep insight into a problem, a new angle on something, the discovery of a new phenomenon, a high-quality explanation, practical tricks to save resources, and a comprehensive review of prior/related work. That's why I read papers.

3 comments

this is the right way to go about things if you have certain goals, for sure.

sometimes you need to replicate exactly the same training method, on exactly the same data — for instance if you want to use it as a baseline on a known dataset. then it becomes really important to have the code, because while an adequate replication might be easy, it takes a lot of trial and error to get perfectly the same model.

Sure, but if the code for some result is not available, I feel free to report whatever result I got implementing their method. I’m also perfectly fine with using the phrase “couldn’t reproduce” in my papers.
You seem to be exceptionally well funded, and/or have few deadline constraints. Your strategy will only work until you get spammed with "good ideas".
> You seem to be exceptionally well funded, and/or have few deadline constraints

I wish! :)

> you get spammed with "good ideas"

Again, I wish!

In the subfield I'm focused on at the moment (efficient mapping of NN algorithms to specialized hardware, low precision computation, model compression) I don't see good ideas very often (fewer than one good paper a week). Previously I worked on music generation - also didn't really feel spammed with good ideas.

I don't mean this to be adversarial, but what exactly is it you do that would not be sped up by checking someone else's results directly before fiddling around and then trying out your own implementation?
But that's my point: their results are not that important to me.

As an example, recently I saw a paper on NN weight quantization, which had a very interesting idea, but the results were not impressive. I don't remember if they had any code published or not, but it didn't matter - I wanted to see what kind of results I'd get if I implemented it. Turned out it works really well, much better than what they reported in the paper.

Here is an idea: inverse dropout.

How would you implement that?

Link to paper?
What's your preferred software for implementing these? A framework like Chainer, or pure NumPy/MATLAB?
TensorFlow or PyTorch. Plain NumPy for quick prototyping/testing. Sometimes I have to write/modify CUDA kernels.