| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by chrisamiller 4889 days ago

Some thoughts on this article:

- This guy clearly has a limited understanding of the field. This quote is laughable: "There are only two computationally difficult problems in bioinformatics, sequence alignment and phylogenetic tree construction."

- As a bioinformatician, I feel sorry for this guy. Just like any other field, there are shitty places to work. If I was stuck in a lab where a demanding PI with no computer skills kept throwing the results of poorly designed experiments at me and asking for miracles, I'd be a little bitter too.

- Just like any other field, there are also lots of places that are great places to work and are churning out some pretty goddamn amazing code and science. I'm working in cancer genomics, and we've already done work where the results of our bioinformatic analyses have saved people's lives. Here's one high-profile example that got a lot of good press. (http://www.nytimes.com/2012/07/08/health/in-gene-sequencing-...)

- I'm in the field of bioinformatics to improve human health and understand deep biological questions. I care about reproducibility and accuracy in my code, but 90% of the time, I could give a rat's ass about performance. I'm trying to find the answer to a question, and if I can get that answer in a reasonable amount of time, then the code is good enough. This is especially true when you consider that 3/4 of the things I do are one-off analyses with code that will never be used again. (largely because 3/4 of experiments fail - science is messy and hard like that). If given a choice between dicking around for two weeks to make my code perfect, or cranking out something that works in 2 hours, I'll pretty much always choose the latter. ("Premature optimization is the root of all evil (or at least most of it) in programming." --Donald Knuth)

- That said, when we do come up with some useful and widely applicable code, we do our best to optimize it, put it into pipelines with robust testing, and open-source it, so that the community can use it. If his lab never did that, they're rapidly falling behind the rest of the field.

- As for his assertion that bad code and obscure file formats are job security through obscurity, I'm going to call bullshit. For many years, the field lacked people with real CS training, so you got a lot of biologists reading a perl book in their spare time and hacking together some ugly, but functional solutions. Sure, in some ways that was less than optimal, but hell, it got us the human genome. The field is beginning to mature, and you're starting to see better code and standard formats as more computationally-savvy people move in. No one will argue that things couldn't be improved, but attributing it to unethical behavior or malice is just ridiculous.

tl;dr: Bitter guy with some kind of bone to pick doesn't really understand or accurately depict the state of the field.

4 comments

moondowner 4888 days ago

" I could give a rat's ass about performance. I'm trying to find the answer to a question, and if I can get that answer in a reasonable amount of time, then the code is good enough"

This is the only bad point that a lot of people are aligned with.

The more time a program needs to finish, the more time you will need to run it again with some other dataset, and in turn - more time to find the right answer.

I really feel that people with scientific and mathematics background should learn proper programming (not take a course in some language - but have actual experience). Design patterns, data structures, best practices, memory consumption, are all things that should be known before a person starts submitting code for this kind of projects.