Hacker News
by bjourne 2993 days ago
I've had a paper peer reviewed. It was ultimately rejected, but I can't help but suspect that by making all my code publicly available, I hurt my chances of publication. The reviewers' comments were about my coding style, my choice of build tool (I didn't use make, but something else which is just as easy to use), the choice of C vs C++...

It's like best practices for computer security -- always strive to minimize the attack surface. :) Without source code there is much less stuff to criticize!

11 comments

> The reviewers' comments were about my coding style, my choice of build tool (I didn't use make, but something else which is just as easy to use), the choice of C vs C++...

I don't know precisely what field you submitted in, but this is maliciously bad reviewing practice. You should have submitted a rebuttal and written to the editor calling the relevance of such "reviews" into question.

Yeah, just reading about that made my blood boil. Especially if it was a biology paper, I don't know what I would have done...

> It's like best practices for computer security -- always strive to minimize the attack surface.

I suspect that's also why some papers are unnecessarily verbose and describe simple things in as complicated a way as possible. You can't criticize something that can't be understood.

It's unfair that your comment is downvoted because it's spot on.

Hiding code, obfuscating language, and fudging data are all symptoms of the same problem: being interested in getting a paper on the CV instead of doing research.

There are many circumstances that can put even a good scientist in a situation where he/she has to do this but that's not a good argument for not sharing the code.

Then why submit it for peer review at all?

Because we need peer reviewed papers on our CVs!

I also detest simple things made complex, though. In my experience (which has covered electronics, epidemiology and geography) reviewers tend to pick up on obtuse issues in text but miss glaring errors in the math. It's sad, and you can see why someone less than scrupulous would exploit that tendency by overcomplicating things. That said, I think plenty of authors are honest but just not very clear thinkers!

In machine learning / computer vision people often release their code after the paper is already accepted.

Time before the submission deadline is usually used to do more experiments and write text, not to polish the code. And after the deadline there is no hurry. What people (who want to share code) consider important is to release it by shortly before the actual conference (but this doesn't transfer to journal-based fields).

The paper should be accepted in rough form, but publication should be held up pending approval of the data and code.

Or a paper should be published in a probationary form, and not certified (by the journal) until an independent lab replicates the result. A paper that isn't making adequate progress toward replication should be retracted by the publishing journal.

That's terribly unfortunate.

I'm entirely open to being shown otherwise, but working in that field seems more akin to working in physics than in software engineering (those are engineering problems, after all, not computer science, and sometimes they're only opinions). Being critiqued for that in an ML/AI paper would be like critiquing a physics paper over the author's coding style. It is misdirected, IMO.

They could be squashing some legitimately good work by being too heavy handed around coding style and build process.

This is why artifact reviews should be separated from publication review. Artifact reviews are notoriously horrible, with inexperienced reviewers simply bikeshedding and nitpicking. At best, artifact reviews should simply be checking for reproducibility (i.e. can they get the code to run at all).

I'm very surprised to hear this: I've submitted several artefacts; co-run an AEC for a (small-medium) conference; and spoken to a lot of people about it. I've heard virtually nothing negative until your post. Indeed, the artefact reviews I've received have nearly all been thorough and considered (one review was slightly nitpicky, but that's one review out of 10-12). For paper reviews, on the other hand, I'm very happy if half of the reviews are thorough and considered. Bear in mind that most of the artefact reviewers also implicitly review the paper, and you get some idea of how good a job they do.

My main bugbear with the whole thing is the incorrect spelling of "artefact". And when that's my main bugbear... well, things aren't too bad!

I recently reviewed a computational materials science paper and was quite impressed by the fact that they included some data and a Jupyter notebook. Long term, the ecosystem will be an issue, but in the short term, it's invaluable. It does make it easier to check for obvious errors. I think funding agencies should provide more incentives to encourage this.

I'm really sorry to hear about your experience.

I don't work primarily in computer science, but rather in math/physics, and both as a reviewer and as an author, I have only seen a positive impact from sharing code. When I review, if code is made available, it is easy for me to see the details of a model or a calculation, which I really appreciate. When I am writing a paper and developing a model, knowing that I will make my code available ensures that I write things in a clear, transferable, and understandable way (which ultimately ends up being quite beneficial to me).

Then do we even need peer review? In my experience it is always superficial: people just feel that they have to say something, so they say something about writing style or similar trivia.

The way it should work is you put your stuff, with code and all data, on GitHub. People interested in the field or working for journals read it and rate it. Journals collect links to paper repositories that are highly rated by scientists who themselves have many highly rated papers in the field, and call that publication.

I've been peer reviewed once (and am waiting for the second) and it was very in depth, giving me a couple of pointers to improve my paper. Field was mathematics, though.

Sure, it depends on the field, the journal, and the reviewer, but with a GitHub-like interface and public reviews it will only get better.

The problem is that if the review is public, the article is also public, and some (most) publishers are not OK with that. At least not yet; hopefully it will get better.

I don't know what field you work in, but this would be very atypical in mine.

While it's true that minimizing the attack surface is something that can work in papers, in my field reviewers typically don't look at the code. Many of my papers include code or links to it, and I haven't ever had a comment about it in reviews.

Aaah, this brought back memories from the 7 years I was in academia. The whole publish-or-perish culture and the peer review process are completely broken. Academia is completely broken; I would hate having to go back now that I have been 7 years in industry and earning 6 figures.

Nah, they probably used your code to scoop you.