by pickdenis 2592 days ago
On the practical side, if you're working with NGS data at a low level, htslib[1] may be worth looking into. It is a C library for reading, writing, and manipulating the file formats commonly used in NGS (BAM, VCF, etc.). I have used it and can attest to its quality. However, as is common with genomics software, its only documentation is its header files and example programs. The comments in the header files are usually good enough, and here is the example I used to get started[2].

The reason I'm recommending it is the quality of its interfaces. It can read or write any of the common alignment formats (SAM, BAM, CRAM) through the same API, without you having to care which one you were handed. I can't say the same for a lot of other software I have run into in this space.

[1]: https://github.com/samtools/htslib

[2]: https://gist.github.com/PoisonAlien/350677acc03b2fbf98aa