by dandelion_lover 2119 days ago
As a theoretical physicist doing computer simulations, I am trying to publish all my code whenever possible. However, all my coauthors are against that. They say things like "Someone will take this code and use it without citing us", "Someone will break the code, obtain wrong results and blame us", "Someone will demand support and we do not have time for that", "No one is giving away their tools which make their competitive advantage". This is of course all nonsense, but my arguments are ignored.

If you want to help me (and others who agree with me), please sign this petition: https://publiccode.eu. It demands that all publicly funded code must be public.

P.S. Yes, my 10-year-old code is working.

5 comments

> "Someone will demand support and we do not have time for that"

Well ... that part isn't nonsense, though I agree it shouldn't be a dealbreaker. And it means we should work towards making such support demands minimal or non-existent via easy containerization.

I note with frustration that even the Docker people, whose entire job is containerization, can get this part wrong. I remember when we containerized our startup's app c. 2015, to the point that you should be able to run it locally just by installing docker and running `docker-compose up`, and it still stopped working within a few weeks (which we found when onboarding new employees), which required a knowledgeable person to debug and re-write.

(They changed the spec for docker-compose so that the new version you'd get when downloading Docker would interpret the yaml to mean something else.)

As a theoretical physicist, your results should be reproducible based on the content of your papers, where you should detail/state the methods you use. I would make the argument that releasing code in your position has the potential to be scientifically damaging; if another researcher interested in reproducing your results reads your code, then it is possible their reproduction will not be independent. However, they will likely still publish it as such.

> "No one is giving away their tools which make their competitive advantage"

This hits close to home. Back in college, I developed software, for a lab, for a project-based class. I put the code up on GitHub under the GPL license (some code I used was licensed under GPL as well), and when the people from the lab found out, they lost their minds. A while later, they submitted a paper and the journal ended up demanding the code they used for analysis. Their solution? They copied and pasted pieces of my project they used for that paper and submitted it as their own work. Of course, they also completely ignored the license.

I’m curious, are dedicated software assurance teams a thing in your research area? Or is quality left up to the primary researchers?

> Or is quality left up to the primary researchers?

Individual researchers, and in many disciplines (like physics), there is almost no emphasis on quality.

I left academia a decade ago, but at the time all except one of my colleagues protested when version control was suggested to them. Some of them have codebases of 30-40K lines.

I formerly worked in research, left and am now back in a quasi-research organization.

It’s a bit disconcerting seeing how much quality is brushed aside, particularly in software. Researchers seem to intuitively grasp that they need quality hardware to do their job, yet software rarely gets the same consideration. I’ve never been able to get many to come around to the idea that software should be treated the same as any other engineered product that enables their research.

> protested when version control was suggested

Academics are strange like this. The root reason is fear: fear that you're complicating their process, that you're going to interrupt their productivity or flow state, that you're introducing complication that has no benefit. They then build up a massive case in their minds for why they shouldn't do this; good luck fighting it.

Doubly so if you're IT staff and don't have a PhD. There's a fundamental lack of respect on the part of (a vocal minority of) academics toward bit plumbers, until of course they need us to do something laughably basic. It's the seeds of elitism; in reality we should be able to work together, each of us understanding our particular domain and working to help the other.

> The root reason is fear: fear that you're complicating their process, that you're going to interrupt their productivity or flow state, that you're introducing complication that has no benefit.

Yes, but how does it compare to all the complicated processes that exist in academic institutions currently? Almost all of which originated from academics themselves, mind you.

It's not that complicated. No one individual process is that bad. The problem is that there are so many that you need to steep in it for ages to pick everything up.

This means it makes most sense to pick up processes that are portable and have longevity. Learning Git is a pretty solid example.

I think this is why industry does better science than academia, at least in any area where there are applications. Generally, they get paid for being right, not just for being published, so they put respect and money into people that help get correct results.

I think this is a much wider problem than just in academia/research. Really any area where software isn't the primary product tends to have fairly lax software standards. I work in the embedded firmware field and best practices are often looked at with skepticism and even derision by the electrical engineers who are often the ones doing the programming^[1].

I think software development as a field is incredibly vast and diverse. Programming is an amazing tool, but it's a tool that requires a lot of knowledge in a lot of different areas.

^[1] This isn't universally true of course, I'm not trying to be insulting here.

"Quality" is a subjective word. Let's be clear what this means:

Individual researchers, and in many disciplines (like physics), there is almost no emphasis on correct results, merely on believable results.

There are a few standardized definitions. The most succinct being “quality is the adherence to requirements”.

As an example, if your science has the requirement of being replicable (as it should), there are a host of best practices that should flow down to the software development requirements. Not implementing those best practices would be indicative of lower quality.
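
To make that concrete, here is a minimal sketch of one such practice: storing the random seed and some environment details next to every result, so a run can be repeated later. The function names and the toy "simulation" are hypothetical, made up purely for illustration, and not anyone's actual research code.

```python
import json
import platform
import random
import sys


def run_simulation(seed: int, n_samples: int = 1000) -> float:
    """Hypothetical stand-in for a simulation: mean of seeded uniform draws."""
    rng = random.Random(seed)  # local generator, so no hidden global state
    return sum(rng.random() for _ in range(n_samples)) / n_samples


def run_with_provenance(seed: int) -> dict:
    """Return the result together with what is needed to rerun it."""
    return {
        "result": run_simulation(seed),
        "seed": seed,
        "python_version": platform.python_version(),
        "platform": sys.platform,
    }


record = run_with_provenance(seed=12345)
print(json.dumps(record, indent=2))
```

Rerunning with the same seed on the same Python version should give the same number, which is about the weakest form of replicability you can ask of a piece of code.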

Most of the codes I am developing alone. No one else looks at them ever. My supervisor also develops the code alone and never shows it to anyone (not even members of the group).

In other cases, a couple of other researchers may have a look at my code or continue its development. I worked with 4+ research teams and only saw one professional programmer in one of them helping the development. Never heard about a "dedicated software assurance team".

To clarify, nobody sees the code because they aren't allowed, or nobody ever asks to see it?

The second case. However, I am hesitant to ask to look at my supervisor's code. How would I explain why I need it (if it's not needed for my research)? It's also unlikely to be user-friendly, so it would take a lot of time to understand anything.

I think you touched on something important. Researchers are most concerned with “getting things working”.

One of my favorite points from the book Clean Code was that professional developers aren’t satisfied with “working code”; they aim to make it maintainable. Which may mean writing it in a way that is more clear and concise than we are used to.

> I’m curious, are dedicated software assurance teams a thing in your research area?

Are these a thing in any research area? I've heard of exactly one case of an academic lab (one that was easily 99th+ percentile in terms of funding) hiring one software engineer not directly involved in leading a research effort, and when I tell other academics about this they're somewhat incredulous. (I admittedly have a bit of trouble believing it myself -- I can't imagine the incentive to work for low academic pay in an environment where you're inevitably going to feel a sense of inferiority to first year PhD students who think they're hot shit because they're doing "research".)

> Are these a thing in any research area?

I can say there are some that have the explicit intent, but it can often fall by the wayside due to cost pressure. For example, government-funded research from large organizations (think DoD or NASA) has these quality requirements, but they can often be hand-waved away or just plain ignored due to cost concerns.

Interestingly each of those arguments also applies to publishing an article describing your work.