Hacker News
by mattkrause 2652 days ago
Along with the personal cost, this instability can't be great for producing good science either. My lab has learned and lost some techniques over and over again, as people churn through.

If it were up to me, I'd convert some MS/PhD slots into staff scientist roles with longer contracts. I think you could probably do this in a way that increases productivity, and makes more people more happy to boot.


One of my friends who is a graduate student complains about the difficulty of collaborating with other scientists who lack strong coding skills. They are so focused on the science that there's no time to learn or practice good code hygiene.

I've often wondered why labs don't hire regular developers to increase research velocity. I have no interest in research, but would happily work in the context of academia doing things like handling merges, ensuring code modularity, maintaining infrastructure, writing unit tests, etc.

This has become a little more common lately, but probably still not as common as it should be.

Part of the problem is funding. You can get 3-4 grad students or ~2 postdocs for the price of a developer. Plus there are lots of existing mechanisms for funding them: training grants, internal and external fellowships, working as a TA, etc. A developer would have to get paid out of research funding, which is already pretty limited. The National Cancer Institute had a program for staff scientists, and the Chan-Zuckerberg Initiative just launched one targeting microscopy, but there aren’t tons of options, especially not for open-ended roles.

There’s been some adverse selection too. I briefly had a programmer, but it rapidly became obvious he was working for an academic salary because no one else in their right mind would pay him more. I ended up rewriting all of that code, and despite this, my boss keeps sending me fresh-faced undergrads “to do the coding.” I guess the idea that you get what you pay for hasn’t sunk in yet. That said, we’ve also had a few that were excellent and were interested in the projects; I think they were both hired as part of some complicated arrangement where their spouses were recruited for more traditional academic roles.

Another part of it is that code quality hasn’t been a huge priority. That is finally starting to change, but most labs have a lot of code that was unceremoniously promoted from “one-off prototype” to “critical infrastructure” without too many changes. (This, incidentally, gives the lie to people’s obsession with YAGNI.)

Finally, if anyone does need a neuro/ML themed developer, I call dibs :-) Seriously though, I completely agree that we should have more specialized roles (dev, technical writer/editor) and I think that whatever place manages to make this work could become a research powerhouse. Some of the bigger institutes (the Broad, Janelia Farm, etc.) do have some jobs like this already.

Some labs do this! I've seen places where the ratio of engineers to grad students and postdocs was about 1:1. The engineers were salaried and could take courses at the university if they chose to (for free, but on their own time).
Tell me more, please! (Especially if they’re hiring)
A common thread was that these were labs that were joint between the university and a UARC or FFRDC. Academic grant funding doesn't usually include engineers, only PI and students, but FFRDC/UARC funding is different. So look at Johns Hopkins/APL, Penn State/ARL, Georgia Tech/GTRI, then (very importantly) look for labs run by a PI with a faculty appointment at the university.

A friend of mine worked at Hopkins for a few years, did interesting stuff, and walked out with a (free) MS afterwards.

Ah, those are a bit of a special case.

I visited APL a while back and loved it. The job didn’t work out (federal stuff has been... turbulent lately), but hopefully it’ll work out one day.

Thanks for the pointers!

I really do think lab composition's tie to grant wording (if the grant says support for three grad students, you get three grad students, not two grad students and an engineer) has a huge influence on how labs are structured, including why grad students end up doing jobs engineers "should" be doing.
The majority of academic labs do not have the resources to offer the high salaries skilled coders demand and can easily get elsewhere.

And academic research often just needs to be “good enough” for that next paper or grant. Investing lots of time and people into code quality isn’t worthwhile unless the whole goal of the lab is to provide software as its output (there are a few, like the Wikipathways group). But for everyone else, the ROI is too low; they're better off working on the next grant or manuscript.

That’s definitely the perception, though I’m not sure how true it actually is. A battle-tested analysis pipeline or experimental control suite is a huge competitive advantage for a lab.

The catch is that it needs to both evolve and stay solid at the same time: it’s hard to predict what you’re going to want in three years, or to find the time to clean up code from the last three years, especially since many of those folks will have moved on.

It is an advantage but not crucial for grants/papers etc.
I agree that it doesn’t matter a whit for any single grant or paper, but one paper (or grant) rarely makes a career.

My claim is that people systematically underestimate the value of good code to a research program. Good infrastructure lets the lab focus on the scientific questions, rather than the logistics of moving and processing the data, which in turn allows them to publish more, better, and faster. This is true for a lot of things: some labs have fantastically good imaging pipelines, or have worked out how to rapidly train animals for certain behaviors, or can reliably do an assay that often fails in others’ hands, and derive a huge benefit from it. Some are so good that I wouldn’t even consider competing with them in their niche. My argument is that good code can also have returns like that.

As a personal example, my first paper at McGill took about three years to finish. The next took about a year and a half (and just came out). We’re on track to submit at least one (and maybe as many as three) papers this year. Some of this is due to practice, but a lot of it is due to the fact that we built reusable components instead of “the script that gives the numbers.”