Hacker News new | ask | show | jobs
by etaoins 1622 days ago
Genomes are sequenced when they're both accessible and are of scientific/economic interest. This results in an overrepresentation of:

- Humans and their pathogens

- Economically important organisms (e.g. crops and farm animals) and their pathogens

- Model organisms

For example, compare the number of Genebank sequences for HIV-1 [1] versus Feline immunodeficiency virus [2]. HIV-1 is additionally problematic because it has an extremely high mutation rate [3], making it more likely for a random sequence to match some sequenced HIV-1 genome.

[1]: https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mod...

[2]: https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mod...

[3]: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4574155/