Hacker News
by schneems 892 days ago
I had an assignment in the OMSCS course where we had to turn the results of a project into a paper and a presentation. It was eye-opening as to why so many CS papers are difficult to decipher.

I’m used to writing on the web, where the scroll is unlimited and everything is hyperlinkable and potentially interactive. Journal papers are limited by length, and so was our assignment. I had to cut virtually all of the helpful explanation needed to reproduce my results, which was deeply frustrating. We were implementing an algorithm based on another paper, and it was hard because key details were omitted or assumptions were not stated. After that exercise, I have to think some of that was intentional, to get the paper down to size.

I find most people aren’t good at technical communication and teaching others without a LOT of practice. Even then it requires feedback and iteration to make sure the ideas are communicated well. Forcing people to be more succinct and omit details makes the final product worse to consume. I don’t know how common such limitations are these days, but I do know that the average paper is still out of reach of the average programmer (where it would likely have the most benefit).


> Journal papers are limited by length and so was our assignment

I have always thought this was a bit silly and that it creates really weird effects that also decrease readability. An interesting point is that reviewers are not required to read the appendix of works. So everything is required to be in the front matter. This is a bit silly when we do things like research graphics or do generative works and such. You want to include images and samples but then your space is eaten up. What if you want to discuss analysis on those images and explore some? You could easily do this on a blog but you're forced to throw this into the appendix. But then a reviewer can ask a question that's explained there and your work can still get rejected because it isn't in the front matter. Another weird incentive is that people end up padding works to fit page limits. This is because if you turn in a shorter paper reviewers will frequently reject your work the same way your boss might not think you're working if they don't see you at your desk.

We live in the 21st century and we still publish like it's the 15th. Computers gave us the ability to embed images, which is why there are so many more graphs and charts now, and it's not like more pages cost more. So just remove the page limit. Some papers should be only a few pages and there's nothing wrong with that. Some papers should be far larger and there's nothing wrong with that. It's just weird to keep these limits, considering they were likely created under other constraints, but momentum continued and we back-justify the continued decisions (there is something to be said about readability, but that can just be a reason to reject).

Side note: CS groups typically publish in conferences

Page limits force you to focus. As a researcher, you are often expected to communicate your ideas in 1 page, 3 pages, 10 pages, or 30 pages, for various purposes. If a journal asks for a 10-page paper, you write a 10-page paper. If a conference asks for a 1-page abstract, you write a 1-page abstract. Most people reading a paper are not interested in going through all the details, and those details should usually not be in the main paper.

It's also easier to find reviewers for short papers than for long ones.

Some of the issues you mention are specific to CS conferences. Because there is only time for 1-2 rounds of reviews, the reviews focus more on accepting/rejecting the paper and less on clearing any misunderstandings before judging it. Conferences are also more likely to have one-size-fits-all page limits, while journals often have several categories of papers with different expectations of length.

> Page limits force you to focus.

This can be solved in better ways, namely by reviewers. I'm okay with a soft requirement; a standardized limit is what I'm saying is problematic. Some papers are noisy because they should be 3 pages but are 10. Some papers are noisy because they are 10 pages and should be 30. There is no universal rule, and that's what I'm getting at.

> It's also easier to find reviewers for short papers than for long ones.

That's a separate problem that needs to be addressed, but is not easy.

> Some of the issues you mention are specific to CS conferences.

Yes, but the author here is CS and we are on a CS focused website. But in general what I said isn't specific to conferences. If conferences are the problem then let's abandon them in favor of good science instead of keeping them around (or turn them into being meetup focused). Certainly the lack of back and forth between authors and reviewers is not a meaningful review process (most author rebuttals are limited to one page and often reviewers are not aligned in critiques). Are we all on the same team (better science) or strictly competing against one another?

If the paper does not fit reasonably within the page limit, you should submit it to another venue. If you can't write a meaningful 10-page version of a 30-page paper, you probably can't give a meaningful 25-minute talk on it either. You should submit it directly to a journal that accepts long papers.

Some conferences also have special tracks for short papers, and some journals publish "letters" instead of or in addition to full-length papers.

> If you can't write a meaningful 10-page version of a 30-page paper, you probably can't give a meaningful 25-minute talk on it either.

I can't really tell what's going on here anymore, but I don't think we're having a conversation. What you're describing isn't in good faith. You're letting "meaningful" do the heavy lifting. Yes, of course everyone can distill a paper, but not every paper can be distilled and then accepted for publication. Frankly, that is because reviewers act exactly like this: they place weird, arbitrary bars on what it means to be good work, forgetting that all works are incomplete, which encourages embellishing and lying and sets continually new absurd bars.

Stop doing gymnastics to protect a system, or just respond to my actual critiques. There's no perfect system, so you can even say my critiques are valid yet not enough of a concern to abandon or modify our current system. It's not an all-or-nothing situation here. But I don't need to be lectured on something as silly as "if you can't do it in 10 pages, you aren't doing it right." My claim was that there isn't a one-size-fits-all standard. I stand by that. You can respond to what I wrote, but there's not a good "teaching moment" here.

If you just want to tell me how I'm wrong without listening to my actual concern then don't comment. You're creating noise and just an angrier internet. If you think I have failed to consider something and that thing is important, do lay it out. But communicate what that actually is rather than just saying "dumb." Give a real critique. The same goes for when you review. Don't be reviewer 2. Reviewer 2 just holds back science.

My point was that if a paper needs 30 pages, don't submit it to a conference. That makes as little sense as submitting an algorithms paper to a zoology journal. Conferences are centered around talks, and you can't present a long paper adequately in a short talk.

Journals can be more flexible than conferences. They don't need page limits, because they don't have the physical constraints imposed by conference dates and the number of parallel tracks. But journals also have audiences, and audience expectations are more important than your paper. You should take those expectations into account when choosing the journal. Don't send an algorithms paper to a zoology journal, and don't send a long paper to a journal that focuses on short papers.

Write the paper first, and then choose a journal or a conference that publishes papers like that. Just as there is no single page limit that fits all papers, there is no single venue that publishes papers of all lengths.

I think the desirability of page limits is very subject-specific. Some people will just waffle if you don't give them a page limit. Other times it means there's not room for the technical details.

But the reviewers can reject if it isn't enough or reject if it is too much. What I'm arguing is that the alignment mechanism already exists. The page limit is over-constraining.

Distill.pub was one effort to modernize publishing in CS. Chris Olah wrote some thoughts [1] about why he didn’t feel it was tenable. Seems like the primary challenge was the additional effort and skill involved in crafting rich-content/interactive material.

[1] https://distill.pub/2021/distill-hiatus/

Honestly, I don't get why we don't just submit to OpenReview and call it a day. Paper is visible and distributed. There are comment sections where peer review can not just happen, but happen in the open (added bonus!). You can iterate and even see the difference between submissions. What is the conference/journal providing that isn't covered here? A stamp of approval? From a well known noisy system that creates other disincentives?

I'm not sure that open review would solve many of the system's problems. For example, it would not touch issues like reproducibility or data and code availability.

Then you will need moderation (or do you imagine that things will be civilized between people on the internet?) and will need to manage various possibilities of bullying, targeting, etc. Of course these things can happen now, but the difference would be between a potentially fully automated and simple system and something very clunky (be friends with an editor, convince them to reveal who the reviewers are, manage to recognize another of their papers, etc.).

> For example, it would not touch issues like reproducibility or data and code availability.

These are different issues, and they are certainly important. But I do think this would help in some way. OpenReview does allow you to post comments many months later. Effectively, think of it as a GitHub issues page. It certainly could be organized better, but it is better than what exists now. OR also has links for code and community implementations (as does arXiv now). Here's an example that has all these things [0]. Granted, data is missing, but I don't see why that can't also be integrated, though it would also require pushing cultural norms.

> Then you will need moderation

I think OR has this somewhat solved, as does arXiv. The accounts are not anonymous and are tied to your ORCID record. arXiv requires you to have a verifier who already has an arXiv account. Yes, this can be abused, but it is also an easier moderation problem than, say, Reddit or even HN. I think if you're posting bullying comments under a named profile, then it is good that that is visible so others can see. Mind you, bullying does already exist, but it is just behind closed doors. It is worse now because only the Area Chair can take action, and often they are overworked and works do get dismissed (which results in A LOT of wasted time and money) because of this bullying. The larger the field, the more noise too and the more this happens. It is just far less common to see people bullying in public than behind closed doors.

I must stress, though, that there is no perfect system here. There is no system that can make the amount of bullying 0. So we have to be careful in our critiques, because there will always be valid critiques that are in fact of concern (like this one) but are fundamentally unsolvable. The question then becomes whether we improve upon the existing frameworks and whether whatever costs are incurred are worth the added benefits. So I just want to make sure that this idea isn't killed because of an impossible bar, despite the critique being valid.

Edit: I'd actually add that this system encourages reproduction. Because if we still measure on citations and number of publications this means that reproduction works can still count towards those metrics and thus someone's career advancements. The whole conference/journal system currently discourages such effort in favor of the absurdly nebulous novelty concept (which also makes papers noisy). My proposal would also allow for the publication of failures, which is also an important thing for academics.

[0] https://openreview.net/forum?id=Hkxzx0NtDB

Promotion and filtering I guess? What does a record label provide when you can just upload music to Spotify?

> What does a record label provide when you can just upload music to Spotify?

I believe this is an illustrative example in support of my proposition, not against. Many artists are in fact turning away from record labels in favor of self publishing. Similarly for books.

But I will say that I still think there's value, so I'll expand on my ideas about conferences. I think they should exist, but be focused on meet and greets. So instead of being an indicator of the validity of work, have them invite authors to speak about their works. Allow others to sign up for poster sessions. How to do that appropriately does need to be worked out, but there's nothing wrong with it simply being at the recommendation of the organizing members. Yes, there will still be preferential bias, but I do mean "still" because we do have preferential biases towards certain institutions and labs. This would just make it a bit more explicit that they are not the arbiters of quality but are just treated as a "reward."

Importantly, I think this opens the door to different kinds of research that are not incentivized by our systems, the most important being reproduction.