by dannykwells
2593 days ago
The hardest part of genomics for me has honestly been figuring out which poorly maintained open source tool I should use for a particular problem, which options should be run, and how the data need to be preprocessed beforehand. I mean, has anyone ever actually read the GATK documentation? It is famously dreadful, and that's professionally maintained. Honestly, a nice addition here would be a "so you want to" section with snippets of raw FASTQ or VCF data and working code for various operations, maybe with an accompanying Docker container.
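To make the "so you want to" idea concrete, here is a rough sketch of what one entry might look like: "so you want to compute the mean base quality per read in a FASTQ file". The two reads are made up for illustration, and the code assumes the simple four-lines-per-record layout with Phred+33 quality encoding; real files can differ (wrapped sequences, older encodings), so treat it as a starting point rather than a reference parser. The names `read_fastq` and `mean_phred` are hypothetical helpers, not from any library.

```python
import io

# Made-up two-record FASTQ snippet (assumes Phred+33 quality encoding).
FASTQ = """@read1
GATTACA
+
IIIIIII
@read2
ACGT
+
!!II
"""


def read_fastq(handle):
    """Yield (name, sequence, quality) tuples.

    Assumes exactly four lines per record (no wrapped sequences).
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break  # end of input
        seq = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header[1:], seq, qual  # drop the leading '@'


def mean_phred(qual, offset=33):
    """Mean Phred score of a quality string under the given ASCII offset."""
    return sum(ord(c) - offset for c in qual) / len(qual)


records = list(read_fastq(io.StringIO(FASTQ)))
for name, seq, qual in records:
    print(name, len(seq), mean_phred(qual))
```

Swapping `io.StringIO(FASTQ)` for `open("reads.fastq")` (a hypothetical path) would run the same thing on a file on disk.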
In my experience, translating domain data into Spark yields a 100x improvement in data analysis.