by elsbree 4606 days ago
Agreed, the current state of software tools for biology is sad: the tools are written by scientists, for scientists, and tend to have messy source code and documentation that is incomplete or difficult to read.

I don't mean to insult the people who currently work on the tools; they're great! But we need more software people writing tools for the industry.

Fortunately, people are starting to do just that. TeselaGen and Genome Compiler are both good examples. (Disclaimer: I'm a TeselaGen engineer.)

2 comments

Genome Compiler is not a good example. The fundamental premise is incorrect. Biology should not be done as a drag-and-drop exercise... Saving a few minutes of your time is not worth having blinders that increase the likelihood of huge errors in your design. Having an intimate and comprehensive knowledge of your sequence is critical, and having only a casual knowledge can lead to disaster. This is not for just anyone, either: you also have to have instant, library recall of as much as possible.

To give an example, I once witnessed an algorithmic redesign effort completely miss two extragenic components in an essential gene that would likely not have been missed by an attentive human (or better yet, two or three attentive humans). Luckily, the cells evolved their way around it. The researchers tracked down the problem and how the bugs solved it, and the situation is interesting enough to possibly result in a publication.

It's not necessarily a people problem so much as an incentives problem - the incentives around biology are, at present, entirely at odds with writing good software.

Clean, well-documented source code won't get you grants. It won't yield citations. It won't get you tenure. Beyond making sure you can run the same code again, and that it still works if the postdoc who wrote it leaves, everything else is under the "For the good of humanity" incentive structure. And with grant paylines in the middling single digits, it's really hard not to triage good code in favor of making sure the lights stay on.

That's interesting. I think if programmers in the biology realm open sourced all their code, that might be an incentive in itself to write good code. Once multiple people start maintaining a project, there's an inherent incentive to have nice code. In addition, there are certain bragging rights in putting an awesome project on your CV and getting future jobs because of that codebase.

But it took years for your (now typical) OS, server, and Internet open source projects to reach maturity and figure out how they can be monetized.

People in the sciences should start blogging more. People like me find all of these subjects very interesting but very foreign. And I think many of us have grown a bit bored with where most programming efforts are directed (backoffice, ecommerce, and social apps).

Three thoughts on this:

1. Keep in mind that for most projects and papers, not very many people are ever going to use the source code. For most projects, there's almost no chance that you're going to get a lively, multi-contributor project going. Odds are it's just going to be on your shoulders.

2. If you're going to stay in academia, there are no bragging rights attached to an awesome project, and it won't particularly help your job prospects - indeed, from an opportunity cost perspective, most of the time it will hurt them. Once the code is good enough for a paper to be written, the incentive to do more work on the code vanishes.

3. Science blogging is actually a pretty active field. But the software aspects don't get talked about as much, because the code is just a tool. There are some blogs on software for science drifting around out there, though.