by phreeza 1820 days ago
I get your frustrations with this state of affairs, but for the reasons I mentioned above, I don't think providing the model and code is a panacea here. Maybe the last few years have also set an unrealistic expectation for the pace of progress. In my (former) field of theoretical neuroscience, if a paper was not reproducible, this knowledge kind of slowly diffused through the community, mostly through informal conversations with people who tried to reproduce or extend a given approach. But this takes several years, not the kind of timescale that modern ML research operates on.

Fwiw I think actual knowledge is there in the ML literature, but it's not in these benchmark-chasing, highly tuned papers. It's more high-level stuff, like basic architecture building blocks: GANs and Transformers, for example. They undeniably work, and the knowledge needed to implement them can probably be conveyed in a few pages at most. No need for an implementation to be provided by the author, really.


I have no particular expertise here, but I wonder if you've learned to accept a mostly-broken process? We have the Internet, so why settle for slow diffusion over years instead of rapid communication?

Why should graduate students have to spend years trying to reproduce stuff that turns out to be no good? Nobody should have to put up with getting their time wasted like that.

I think it is a social problem, not a technical one. A healthy research field should have some level of cooperation between participants. If you go ahead and publish a "this does not reproduce" paper, you can easily ruin someone's career, so in most cases you don't. I know this is not the platonic ideal of science, but it is the reality, especially in smaller research communities. I agree this is not ideal, but I'm not sure I would call the process broken.

This concern over ruining someone’s career itself seems like a symptom of a broken process? Making it safe to openly discuss failures is important.

In at least some big companies in the private sector we have “blameless postmortems” where we describe what went wrong in an operational failure without blaming the participating employees.

Sure, having blameless postmortems would be amazing, but I think the informal process I described is probably the closest you will get to it. The reason is that any given subfield can't make this decision in isolation, because the people to whom it matters (funding agencies, faculty search committees, journal editors to a degree) are not part of the field, and when they see such a 'blameless postmortem' they will think 'whoa, this person really messed up, we'd better stay away'.

Maybe I am wrong though, and a better culture is possible, much as the shift to preprints has happened in a lot of fields and was probably unthinkable beforehand. So good on you for taking an idealistic stance, I am probably just being grumpy. That being said, whatever culture changes may be beneficial, I stand by my original point that simply dumping code and model alongside the paper is not unambiguously good and may even obscure problems.

To be honest I don't think the StyleGAN papers are benchmark chasing. If you read StyleGAN[0], StyleGAN2[1], StyleGAN2-ADA[2], and this paper, there is a clear story. They call out the mistakes in the previous papers and resolve them. The papers themselves even admit where they fall short. But it is research. Problems don't get solved all at once. If you pay attention to these 4 papers it is very clear Karras has a very well defined research focus and direction. He's showing his progress over time, sharing it with the community, and learning from the community as well. This is how research should happen.

[0] https://arxiv.org/abs/1812.04948

[1] https://arxiv.org/abs/1912.04958

[2] https://arxiv.org/abs/2006.06676

Good point. I wanted to make a more general point about the state of the ML literature; I am not super familiar with this particular sequence of papers.