by epistasis
4893 days ago
Pretty much anything that gets used by many people ends up getting polished. (The exception is the RNA-seq field: it's still pretty rough out there, but the research is still taking quite a while.) And if you're writing software, your tool isn't going to get used until it's somewhat polished, or unless it's so unique and essential in its purpose that people have to use it. In terms of next-generation sequence analysis, Heng Li's BWA mapper and Samtools libraries are fairly good. His coding style is a bit terse for my tastes, but it keeps out people who don't know what they're doing, it's very clear code for semi-complicated algorithms, and BWA is some of the most reliable software I use every day. On the infrastructure side, Galaxy [https://main.g2.bx.psu.edu] is getting fairly good. The BioConductor repository of R packages is extremely mixed. I don't like some of their architectural choices, but it's ended up working out OK. I still use Michael Eisen's Cluster from a decade ago, along with Java TreeView.
"Look at the disgusting state of the samtools code base. Many more cycles are being used because people write garbage. For a tool that is intimately tied to research, the absence of associated code commentary and meaningful commit messages is very poor. The code itself is not well self-documenting either."
commit log:
http://samtools.svn.sourceforge.net/viewvc/samtools/trunk/sa...
code:
http://samtools.svn.sourceforge.net/viewvc/samtools/trunk/sa...